Some Notes on Readability
Check out Zettlr's readability implementation on our gorgeous demonstration page where we also explain the algorithms for those of you who are interested!
Writing is a complex task. Apart from an initial idea, you need to constantly proofread what you have written, maintain a central thread, and make sure your message gets through to your readers. Today's digital tools offer a rich environment that can help you write well. Nevertheless, there is still a lot to do. While some editors such as Hemingway go some way towards flagging complex words or grammatically difficult phrases, these are mere first steps towards solutions that help you write great texts. Hemingway, for instance, only works with English, so complex German words are not highlighted. (And believe me, there are a lot: just take "unumgänglich" or "nichtsdestotrotz".) Nevertheless, the approach of offering tools to measure your writing is promising. Focusing on the language itself is the next logical step in the development of writing tools. While still in its early stages, it is a development that will become much more important in the years to come.
What is Readability?
Readability describes how easily a text can be understood. If a text is easy to understand, its readability is good; if it is not, its readability is poor. In keeping with Zettlr's aim of becoming the best writing tool out there, it offers a readability mode as of version 1.4. You can switch it on and off with the push of a button, and the mode will highlight all your sentences, grading their readability according to one of four different algorithms (each with its own strengths and weaknesses). The algorithms are taken directly from linguists who have developed metrics to assess the quality of a text. Nevertheless, we have adapted the algorithms a little to suit the different context in which they are deployed.
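To make the idea of grading sentence by sentence more concrete, here is a minimal sketch in Python. It is not Zettlr's code. Since the four algorithms are not named here, it uses the Automated Readability Index as a stand-in, and the colour cut-offs in `BANDS` (including the "yellow" band) are purely illustrative assumptions.

```python
import re

# Hypothetical colour bands: the cut-off values are illustrative assumptions,
# not the thresholds Zettlr uses.
BANDS = [(8.0, "green"), (14.0, "yellow")]


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def ari_score(sentence: str) -> float:
    """Automated Readability Index, applied to a single sentence."""
    words = re.findall(r"\w+", sentence)
    if not words:
        return 0.0
    chars = sum(len(w) for w in words)
    # ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    # With exactly one sentence, words / sentences is just the word count.
    return 4.71 * chars / len(words) + 0.5 * len(words) - 21.43


def grade(sentence: str) -> str:
    """Map a sentence's score to a colour band; higher scores are harder."""
    score = ari_score(sentence)
    for limit, colour in BANDS:
        if score < limit:
            return colour
    return "red"


def grade_text(text: str) -> list[tuple[str, str]]:
    """Grade every sentence of a text individually."""
    return [(s, grade(s)) for s in split_sentences(text)]
```

Calling `grade_text` on a paragraph returns one `(sentence, colour)` pair per sentence, which is roughly the information a highlighting mode would need. A real editor would need a more robust sentence splitter than this one, which trips over abbreviations such as "e.g.".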
Readability and Zettlr
First, we have raised the score thresholds. Most scores produced by the algorithms refer to years of formal education in the United States school system, so relaxing the algorithms a little yields scores that are more reasonable when writing for a target audience with a college or university education. Second, many algorithms depend on dictionaries of complex words, that is, words that researchers have judged difficult to read. Such words naturally increase the score a given text receives. But here we return to the fact that the algorithms were, without exception, developed with English in mind and are therefore strongly tilted towards the English language.
Zettlr tries to be as language-agnostic as possible, and we therefore had to adapt the algorithms further. We cannot rely on a dictionary of complex words because we cannot anticipate the language the user writes in. After some thought, we came up with a solution, hidden within a 1975 paper by Coleman and Liau: "There is no need to estimate syllables since word length in letters is a better predictor of readability than word length in syllables." According to this finding, the number of characters in a word is a better measure of its difficulty or complexity than the number of syllables. This enables us to identify complex words approximately by counting their characters.
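For readers who want to see what a purely letter-based score looks like, the following Python sketch computes the Coleman-Liau index as it is usually published: 0.0588 L - 0.296 S - 15.8, where L is the number of letters per 100 words and S the number of sentences per 100 words. This is the textbook formula, not Zettlr's adapted variant, and the tokenisation rules (what counts as a word or a sentence) are simplifying assumptions.

```python
import re


def coleman_liau_index(text: str) -> float:
    """Coleman-Liau index: needs only letter, word and sentence counts."""
    # Runs of Unicode letters, so "unumgänglich" counts as one word.
    # Assumption: apostrophes split words ("it's" becomes "it" and "s").
    words = re.findall(r"[^\W\d_]+", text)
    # Assumption: a sentence ends at one or more of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    letters = sum(len(w) for w in words)
    letters_per_100_words = letters / len(words) * 100
    sentences_per_100_words = len(sentences) / len(words) * 100
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8
```

Because no syllable counting and no word list is involved, the same function runs unchanged on German or French text. Whether the resulting grade levels mean the same thing across languages is a separate question.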
Because languages clearly differ in their typical word lengths, we then determined the complex words of any language by comparing word lengths. After some testing, we found that the algorithms, prepared this way, produced great results in many languages. One problem remains, though: these algorithms still won't work for languages in which characters represent syllables or whole words, such as Japanese or Chinese. There, the algorithms come to a grinding halt, which clearly shows that, although we have detached the algorithms from the English language, we still have a bias towards languages that work a certain way: by composing words and meaning by chaining generic letters together.
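One way to picture "comparing word lengths" without a dictionary is to call a word complex when it is unusually long relative to the other words in the same text. The Python sketch below does exactly that. The specific rule (mean length plus one standard deviation) and the `spread` parameter are assumptions chosen for illustration, not the rule Zettlr applies.

```python
import re
from statistics import mean, pstdev


def complex_words(text: str, spread: float = 1.0) -> list[str]:
    """Return words that are unusually long for this particular text.

    Hypothetical rule: a word is complex if its length exceeds the mean
    word length by more than `spread` standard deviations.
    """
    words = re.findall(r"[^\W\d_]+", text)  # runs of Unicode letters
    if not words:
        return []
    lengths = [len(w) for w in words]
    threshold = mean(lengths) + spread * pstdev(lengths)
    return [w for w in words if len(w) > threshold]
```

Because the threshold is derived from the text itself, a language with longer words on average simply gets a higher cut-off. The sketch also shows the limitation with Chinese or Japanese: without spaces between words, the regular expression returns whole runs of characters as single "words", so the length comparison becomes meaningless.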
Final Remarks
One should note, though, that this readability approach is only an indicator of text quality. It cannot serve as a definitive guide to writing the perfect text. Although the algorithms provide good results, your aim should not be to write so that every sentence scores in the green area. You will certainly need one or two "red" sentences every other paragraph, because the readability algorithms do not account for whole paragraphs, and texts also need some difficult-to-read words to be "crunchy." If a text only uses simple language, readers will easily feel patronised, and academic audiences in particular will not take kindly to marketing-style texts. Therefore, it is advisable to write your text in the standard mode and only switch to the readability mode to double-check whether some paragraphs are too red.
If you would like to see how Zettlr's readability mode works, please have a look at our new readability demonstration page, where you can see how the four algorithms grade "Bartleby, the Scrivener" by Herman Melville. Of course, you can also remove the text and check other texts instead.