UI Design Newsletter – July, 2002

In This Issue

Readability Formulas

Bob Bailey, Ph.D., Chief Scientist for HFI, discusses the value of readability formulas in writing for the Web.

The Pragmatic Ergonomist

Eric Schaffer, Ph.D., CUA, CPE, Founder and CEO of HFI, offers practical advice.

The Bollywood Technique

Dr. Schaffer presents an innovative approach to usability testing with subjects who don't like to criticize.

Readability Formulas


There is a considerable amount of information published on the Web that is intended to be read by someone. There is evidence that much of the information may be too hard to read and understand for typical readers.

Baker, Wilson and Kars (1997) reported that the readability scores of most articles in the 'Health Reference Center' ranged from 10th to 14th grade levels. Another study (Graber, Roller and Kaeble, 1999) included text-based information from commercial, academic and government sites. They found that the reading material averaged the 10th grade level. In a more recent study, a group of researchers (D'Alessandro et al., 2001) conducted readability analyses of pediatric patient education materials on the Web, and concluded that the information was not written at an appropriate reading level for typical users.

Readability Formulas

Readability formulas have been developed to assist writers in preparing information. These formulas provide a means for estimating the difficulty a reader may have reading and understanding a paragraph, section or entire document on the Web.

The first readability formula was developed over 80 years ago, and a number of formulas have been developed since that time. These formulas originally were designed to help classroom teachers choose textbooks for their students. Currently available, computer-based readability formulas include:

  • Automated Readability Index
  • Dale-Chall
  • Flesch Reading Ease (included with Microsoft Word)
  • Flesch-Kincaid Grade Level (included with Microsoft Word)
  • FOG
  • SMOG (Simple Measure of Gobbledygook)

Readability results will vary depending on which formula is used. For example, the Flesch-Kincaid tool often returns a score two to three grades lower than other formulas. Osborne (2000) proposed that grade level equivalent scores tend to be accurate only to within plus or minus 1.5 grade levels.

Numerous factors affect how easy, or how hard, a given document is to read and understand, including sentence length, word choice, layout and formatting, overall organization of the content, and use of illustrations. However, most readability formulas consider only two factors:
(a) the number of syllables (or letters) in a word, and
(b) the number of words in a sentence.
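
As a rough illustration of how these two factors are combined, the sketch below computes the two Flesch measures listed above from their published formulas. The function names are chosen here for illustration, and the syllable counter is a simple vowel-group heuristic, which is an assumption of this sketch. Commercial tools such as Microsoft Word count syllables and sentences differently, so their scores will not match exactly.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of consecutive vowels (a heuristic, not dictionary-accurate)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_stats(text):
    """Return (sentence count, word count, syllable count) for a block of text."""
    # Treat any run of ., ! or ? as the end of a sentence.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # A word is a run of letters, optionally followed by an apostrophe part (e.g. "don't").
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences, words, syllables = text_stats(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid Grade Level: an approximate U.S. school grade."""
    sentences, words, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "Good writing is important. Use short words and short sentences."
    print(round(flesch_reading_ease(sample), 1))
    print(round(flesch_kincaid_grade(sample), 1))
```

Both functions use only average sentence length and average syllables per word, so longer sentences or longer words always push the grade level up, whatever the text actually says.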

Because most readability formulas consider only these two factors, these formulas do not reveal why written material is difficult to read and comprehend. Most of the important attributes of writing that contribute to reading difficulty have not yet been quantified. Fortunately, many of the difficult-to-measure attributes are highly correlated with the two factors that can be easily measured.

Readability formulas are most useful as predictors of reading difficulty. Klare (1975), in a review of readability formulas, concluded that "as long as predictions are all that is needed, the evidence that simple word and sentence counts can provide satisfactory predictions for most purposes is now quite conclusive."

A document classified as highly readable solely on the basis of a readability formula could be a disorganized disaster—or contain no content at all. The following paragraph has a calculated readability of the 12th grade:

Qwerty uiopas dfg hjkl zxcvb nmqw ertyuio pas dfghj klzxcvb nmq werty ui opas dfgh jklzxc vbnm. Qwertyuiop as dfgh jklz xcvbn mqwe rtyui opas dfghjk lzx cv bn m. Qw ertyu iopas dfghj klzxcvb nmqwert yuiopasdf ghjk lzxcv b nmqw ert yuiop asdf gh jk lzxcvbn m. Qwerty uiop asdfg hjklz xcvbn mqwe rtyuiop asdfgh jklzxcv bnmq wert yui opa sdfgh jklzxc vbnm qwerty uiopas dfghj klzx cvbnm.

Obviously, readability scores depend on the writing style rather than the content of written material. These stylistic features are under the control of the writer.

Reading Skill of the Intended Audience

As a general rule, it is better to write a document at a readability level that is below the reading skill level of the intended audience. Ideally, the reading skill level of intended readers would be based on the results of a standardized reading test (e.g., the Nelson-Denny Reading Test). This is usually reported as a grade level, for example, "95% of the users in the target audience read at an 8th grade level or higher."

Usually it is not practical to administer a reading test to all potential users. An estimate of the reading grade level of the intended audience can be obtained by considering the users' education level. An average eighth grader is assumed to read at an 8th grade reading level, and a twelfth grader at a 12th grade level. People who have completed college are assumed to read at the 16th grade level.

In general, people with more education have better reading skills than people with less education. However, the actual reading ability of a person does not always match his or her educational level. Coke and Koether (1979) collected reading scores for over 200 company employees. The group averaged a 12th grade education, and 95% had reading test scores above the 10th grade reading level. Hilts and Krilyk (1991) reported that adults read at least one or two grade levels below their last school grade completed.

A summary of several studies done in the United States and Canada estimated the average reading skill level to be around the 8th to 9th grade (University of Utah Health Sciences Center). However, the same summary found that about one in five adults had a reading skill level at the 5th grade level or below.


By comparing the calculated readability of a document to the reading skill level of typical users, a writer can estimate whether a document has a good chance of being read and understood. The readability formula can be used as a predictor of difficulty, but should not be used as a diagnostic tool. Readability formulas do not provide information about how to make instructions more comprehensible. For example, a document that scores at a high grade level might be made easier to read by changing its format rather than its writing style.
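
The comparison can be sketched in a few lines. The function names and default values below are assumptions made for illustration. The 1.5-grade shortfall is simply the midpoint of the "one or two grade levels" reported by Hilts and Krilyk, and the 1.5-grade margin borrows Osborne's estimate of formula accuracy. The result is a rough prediction only, not a diagnosis of what makes a document hard to read.

```python
def estimated_reading_grade(last_grade_completed, shortfall=1.5):
    """Estimate a reading grade from schooling.

    Assumes readers fall `shortfall` grades below the last grade they completed
    (an illustrative midpoint of one to two grades).
    """
    return last_grade_completed - shortfall


def likely_readable(document_grade, audience_reading_grade, margin=1.5):
    """Return True when the document's grade, plus an error margin, does not exceed the audience's level."""
    return document_grade + margin <= audience_reading_grade


if __name__ == "__main__":
    # A 12th grade document checked against readers who completed 12th grade.
    audience = estimated_reading_grade(12)
    print(audience, likely_readable(12, audience))
    # The same audience with a document written at the 6th grade level.
    print(likely_readable(6, audience))
```

Adding the margin to the document's score keeps the check conservative, in line with the advice to write below the audience's reading skill level.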

To make written texts truly readable, Website designers should apply all the principles of clear and simple writing. Even though using short words and short sentences will result in lower readability scores, this does not guarantee that a document will be easier to read.

Finally, there are times when document readability issues are not as important as other issues. Klare (1975) found that in circumstances where time is not crucial and readers are highly motivated, the readability of a document was of less importance. Coke (1976) provided evidence that readability was not as important when readers were looking for specific information as it was when users had to remember that information.

Incidentally, the Flesch-Kincaid readability level for this article is 12th grade.


Re: The Bollywood Technique – Fascinating. I wonder if this technique could be used successfully in other areas, both outside the Asia theater and in other environments.

In the late 1970s I was involved in an examination of maritime safety, specifically ship collisions, rammings and groundings. Preliminary results were obtained by reviewing accident reports and investigations. Later we observed and "dialogued" with Masters and Pilots. While we used "non-attribution" to great effect, and follow-on studies used simulations similar to flight simulators, this technique might also have produced excellent results at a lower overall cost. Something to ponder.


Daniel Jones
US Army

The Pragmatic Ergonomist, Dr. Eric Schaffer

Good writing is important. Use short words and short sentences. This will give a low reading grade level (RGL), which is good. It is also important to use common words. Get a word frequency dictionary. This lists how often each word is used from a large sample of materials (e.g., in High School English texts). So "Amend" may be shorter than "Change". But use "Change" because it is more common. It is handy to know how to calculate RGL. I've done that in many meetings and it is a great way to flag poor writing. Microsoft Word will give an RGL. It is in the "spelling and grammar" tool. I would NOT increase the reading grade level to match your user population. A given population may allow a higher reading grade level. It also may allow use of certain jargon. But even people with lots of Doctorates like simple writing.

This was written at 5th grade level. Is it OK?

The Bollywood Technique

Apala Chavan is the managing director of our office in Mumbai, India. She presented her fascinating new testing method at the CHI convention this year. She called it "The Bollywood Technique" and I'd like to share it because I think we can all benefit.

What is the main challenge when you are usability testing in Asia?

In Asia it is impolite to tell someone they have a bad design. It is embarrassing to admit that you cannot find something. So it is very hard to get feedback.

Apala tested a site that offered railroad tickets for sale. She used the conventional simulation method and got little feedback. She could see that users were not succeeding. But they would not willingly discuss the problems.

Apala then tried the Bollywood method. Now Bollywood is the Hollywood of India. They make more movies than Hollywood. They are famous for movies that have long and emotionally involved plots. The movies have great pathos and excitement. In the Bollywood method Apala described a dire fantasy situation. The participant’s beautiful, young, and innocent niece is about to be married. But suddenly he gets news that the prospective groom is a member of the underground. He is a hit man! His whole life story is a sham, AND HE IS ALREADY MARRIED! The participant has the evidence and must book an airline ticket for himself and the groom's current wife to Bangalore. Time is of the essence!!!

The participants willingly entered this fantasy and with great excitement began the ticket booking process. Even minor difficulties they encountered resulted in immediate and incisive commentary. The participants complained about the button naming and placement. They pointed out the number of extra steps in booking. The fantasy situation gave them license to communicate in a way that they never would under normal evaluation methods.

I think this is a great method for the Asian markets. But I also expect we might be able to generalize it to special situations in North America and other places where participants may be hesitant to communicate freely.

