
Introduction

How many of the usability problems identified in a heuristic evaluation are true usability problems?

Several years ago, I published an article suggesting that many of the "problems" identified by heuristic evaluators were not problems at all (Bailey, Allan and Raiello, 1992). Even so, many of us have continued to spend time and money fixing usability "problems" that were not problems. Recently, three research papers were published that provide some insight into the validity of heuristic evaluations (Catani and Biers, 1998; Rooden et al., 1999; Stanton and Stevenage, 1998). The papers report usability testing in three very different domains, with very similar results.

In all three of these papers it was possible to (a) determine which usability problems the heuristic evaluators predicted, and (b) compare those predictions with the problems that users actually had.

Pooling the evaluators' predictions with the problems users actually experienced, the results showed that 36% (about one-third) of all problem instances were true usability problems that the evaluators correctly identified (hits). Some of these problems were serious and some were trivial. Another 21% were problems that users had but the heuristic evaluators missed. Finally, about 43% were "problems" identified by the heuristic evaluators that did not turn out to be problems at all (usually referred to as "false alarms").

In other words, of the problems the evaluators identified, 46% were hits and 54% were false alarms (and about a fifth of users' actual problems were missed altogether). If we round off the numbers, we could conclude that when a heuristic evaluation is conducted, about half of the problems identified will be true problems and about half will not. More specifically, for every true usability problem identified, there will be a little over one false alarm (1.2) and about one-half of a missed problem (0.6). If this analysis is correct, heuristic evaluators tend to produce more false alarms and missed problems combined than true hits.
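The arithmetic behind these figures is easy to check. The sketch below (in Python, purely for illustration; the 36/21/43 split comes from the discussion above, and the derived rates follow from it) shows how the 46%/54% figures and the per-hit ratios are obtained:

```python
# Proportions of all problem instances, as reported above (they sum to 100).
hits = 36          # true problems the evaluators identified
misses = 21        # true problems the evaluators failed to identify
false_alarms = 43  # identified "problems" that were not problems

# The evaluators' identified set consists of hits plus false alarms.
identified = hits + false_alarms

hit_rate = hits / identified                  # share of identified problems that were real
false_alarm_rate = false_alarms / identified  # share that were false alarms
false_alarms_per_hit = false_alarms / hits    # false alarms per true hit
misses_per_hit = misses / hits                # missed problems per true hit

print(f"Of identified problems: {hit_rate:.0%} hits, {false_alarm_rate:.0%} false alarms")
print(f"Per true hit: {false_alarms_per_hit:.1f} false alarms, {misses_per_hit:.1f} misses")
```

Running this reproduces the rounded figures in the text: roughly 46% hits and 54% false alarms among identified problems, with about 1.2 false alarms and 0.6 misses for every true hit.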

I believe that the best way to initially find true usability problems in a website is to use research-based heuristics. This obviously requires that the evaluators understand the usability research. (For a more complete discussion of this topic, see my May 1999 newsletter.) All heuristic evaluations should be followed up with well-designed and professionally executed performance tests.


References

Usability testing vs. heuristic evaluation: A head-to-head comparison, Bailey, R.W., Allan, R.W. and Raiello, P., Proceedings of the Human Factors Society 36th Annual Meeting (1992)

Usability evaluation and prototype fidelity: Users and usability professionals, Catani, M.B. and Biers, D.W., Proceedings of the Human Factors and Ergonomics Society 42nd Annual Meeting, 1331-1335 (1998)

Difficulties in usage of a coffeemaker predicted on the basis of design models, Rooden, M.J., Green, W.S. and Kanis, H., Proceedings of the Human Factors and Ergonomics Society, 476-480 (1999)

Learning to predict human error: Issues of acceptability, reliability and validity, Stanton, N.A. and Stevenage, S.V., Ergonomics, 41(11), 1737-1747 (1998)
