Usability Test Reporting: "It Ain't Over 'Til It's Over"

Our commercial culture has amazing twists. At Amazon.com, for a mere one cent you can purchase a book by Yogi Berra, the baseball catcher who made good with "Yogiisms" like "It ain't over 'til it's over." This popular philosophical quote also happens to be the title of Berra's autobiographical 1989 book covering his tectonic ups and downs in major league baseball.
So what's the amazing twist on a one cent book? Well, until I wrote this article, I believed the twist was the $3.99 shipping charge, which lets the vendor make a few cents whenever actual shipping costs less. However, looking at the Used Book listing for Yogi's title, I just saw that, effectively, I can purchase his book for "negative $.96"! In this case, the Amazon "Prime" price is only $3.03, which is 96 cents less than the $3.99 shipping allowance alone. The vendor gave up part of that allowance to subsidize my purchase to the tune of 96 cents! Now that's an even more amazing twist. So, "it ain't over 'til it's over" when selling books.
What about your usability test recommendations? You might think that once you write up the problems, make your design recommendations and turn them in, you've done your job. Right? Wrong. You just forgot: "It ain't over 'til it's over."

The truth about truth

I'm sure you've had occasion to tell colleagues and managers that the source of many usability problems is "ego-centric design." HFI publicizes this truth with a famous button passed out at courses: "Know Thy Users, For They Are Not You."

But we might ask the same question of ourselves: have we, world-wise and experienced designers, avoided "ego-centric re-design" when recommending solutions to usability problems? We report the "truth" about an application's usability faults. But how well do we frame our re-design recommendations so they are truly usable for our readers? How truthfully can others interpret our recommendations?

This is the question asked by a trio of usability authors, each well known in their own right: Rolf Molich, Robin Jeffries and Joe Dumas. Their 2007 study is titled "Making Usability Recommendations Useful and Usable." They examined how well 17 teams of usability professionals wrote up usability evaluation reports of a hotel reservation system in use by hundreds of hotels. Teams had 1 to 5 members (1.6 on average) and 5 to 40 years of combined usability experience per team. Not a bad collection of skills.

However, our trio of researchers found that 17% of the redesign recommendations from the teams "were not useful at all." 19% of the redesign recommendations "were not usable at all." And only 17% of the redesign recommendations "were both useful and usable." The authors report: "Quality problems include recommendations that are vague or not actionable, and ones that may not improve the overall usability of the application."

What went wrong? How many of us thought that the hard work was the evaluation, not the write-up? Is writing usability evaluations a risky endeavor? What do you think now? Let's see why "it ain't over 'til it's over."

Evaluating usability recommendations

We'll jump to the crux of the study: How did the authors evaluate the usability recommendations written by those 17 teams? (We'll get to other details later.)

1. The authors found 81 usability problems that were identified by at least 10 out of the 17 evaluation teams. This "consensus" among teams freed the authors from having to defend the definition of "usability problem."
2. The authors developed a 5-point scale for usefulness and usability. They independently rated a "training set" of usability recommendations and then compared their ratings. They ended their training period after reaching agreement on 89% of their trial evaluations.
3. Their evaluation scales follow. Instructions to the participants stated that recommendations should be short. I give you the essence of the authors' comments.

The gold standard for useful and usable recommendations (give it a "5"!)

Would you agree on these definitions of "useful" and "usable"? It's not unlike your Literature teacher giving you a double grade: an F for Thoughtfulness (like "usefulness") and an A for Composition (like "usability"). This means your teacher hated the ideas, but loved your spelling, punctuation, and sentence construction!

The other ratings

I think we get the idea: "3" is the middle score: half good and half, well, just plain bad. Gee. Guess we'd better get a score higher than 3 or else get another career. Clearly, a 3 is a make-or-break score!

A final contrast of scores

Let's see some examples of tough love from our trio of authors...
Whew. If you can't stand the heat, get out of the kitchen, right?

Be explicit – say what you mean

The authors found that teams took seriously the challenge of offering recommendations. Teams offered suggestions 96% of the time across their 81 problems. However, the authors found that 16% of the time teams failed to use the word "Recommendation" or an equivalent. As a result, their suggested changes failed to capture attention. As the authors comment: "Implicit recommendations often sound like complaints or unprocessed observations of test participant difficulties..."

Recommendation: use the word "Recommendation" when making a recommendation (!) (Explicate explicitly!)

The humbling details

Our authors summarized their findings by defining "high-quality" recommendations as those rated 4.0 and above for both usefulness and usability. What percentage of the 81 recommendations would you guess met that modest criterion? Well, only 17% of the 81 recommendations (14 of them) could be called both "Useful" and "Usable" or better.

The authors point out more pain among the teams: only 42% of the 81 recommendations (34) could be called both "Partly Useful" and "Partly Usable" (3.0 or better).

Can we conclude that it's tough to write right? Well, that's the point of this study. Yes, it is tough to write good recommendations.
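
If you'd like to hold your own reports to the same two cutoffs, the counting is simple. Here is a minimal Python sketch, not the authors' actual procedure: the function name and the six sample ratings are hypothetical, made up only for illustration. The last two lines simply check the article's rounded percentages against the counts it reports (14 and 34 out of 81).

```python
def share_meeting_threshold(scores, threshold):
    """Count recommendations whose usefulness AND usability both reach the threshold.

    scores: list of (usefulness, usability) ratings on the 5-point scale.
    Returns (count, percent of all recommendations).
    """
    count = sum(
        1 for useful, usable in scores
        if useful >= threshold and usable >= threshold
    )
    percent = 100 * count / len(scores)
    return count, percent


# Hypothetical (usefulness, usability) ratings, NOT data from the study.
sample_scores = [
    (4.3, 4.0),
    (3.7, 4.5),
    (3.0, 3.3),
    (2.0, 4.7),
    (4.0, 4.0),
    (1.3, 2.0),
]

# "High-quality": 4.0 or better on both scales.
high_quality = share_meeting_threshold(sample_scores, 4.0)
# "Partly Useful" and "Partly Usable": 3.0 or better on both scales.
partly_ok = share_meeting_threshold(sample_scores, 3.0)

# Sanity check on the figures quoted in the article.
reported_high = round(100 * 14 / 81)    # 14 of 81 recommendations
reported_partly = round(100 * 34 / 81)  # 34 of 81 recommendations
```

Notice that a recommendation scoring 2.0 for usefulness and 4.7 for usability fails both cutoffs. A well-written bad idea doesn't count.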

It ain't over 'til it's over: how to write right

Hopefully, you've learned that testing is not the hardest part. It's the quality of your recommendations that tips the scales. Are you motivated now? Here's advice from our three authors.

1. Check your work for vague-uity.
2. Avoid solutions that create other problems elsewhere.
3. Beware of business or technical constraints.
4. Be sure to test sweeping changes.
5. Be specific. Be clear. This makes your writing "usable."

Where Yogi Berra said his thing...

The Yogiism, "It ain't over 'til it's over," was conceived, born, and delivered in July 1973. Yogi Berra's Mets trailed the Chicago Cubs by 9 1/2 games in the National League East. With Berra's inspiration, the Mets rallied to win the division title on the final day of the season.

References

Molich, Rolf; Jeffries, Robin; Dumas, Joseph. 2007. Making Usability Recommendations Useful and Usable. Journal of Usability Studies, 2 (4), 162-179.
Yogi Berra. Downloaded 24 Aug, 2010 from Wikipedia.

Message from the CEO, Dr. Eric Schaffer

John is so right. It's NOT enough to have the correct technical understanding of a problem. We must communicate. If we are speaking to executives, we need to leave out the technical details and make the business case. If we are talking to developers, we need SIMPLE explanations (NOT "chromostereopsis"; say "Red text looks fuzzy"). And we need solid, actionable design recommendations. It is pretty easy to run a test and find issues. It's a bit harder to run a systematic review and spot issues. But it's a LOT harder to do the communication and design work to make the needed improvements happen.