Usability testing vs. heuristic evaluation: A head-to-head
comparison, Bailey, R.W., Allan, R.W. and Raiello, P., Proceedings
of the Human Factors Society 36th Annual Meeting (1992)

Usability evaluation and prototype fidelity: Users
and usability professionals, Catani, M.B. and Biers, D.W., Proceedings
of the Human Factors and Ergonomics Society 42nd Annual Meeting, 1331-1335
(1998)

Difficulties in usage of a coffeemaker predicted
on the basis of design models, Rooden, M.J., Green, W.S. and Kanis,
H., Proceedings of the Human Factors and Ergonomics Society, 476-480 (1999)

Learning to predict human error: Issues of acceptability,
reliability and validity, Stanton, N.A. and Stevenage, S.V., Ergonomics,
41(11), 1737-1747 (1998)

How many of the usability problems identified in a heuristic evaluation
are true usability problems?
Several years ago, I published an article suggesting that many of the
"problems" identified by heuristic evaluators were not problems
at all (Bailey, Allan and Raiello, 1992). Even so, many of us have continued
to waste time and money fixing "usability problems" that were not
problems. Recently, three research papers were published
that provide some insight into the validity of heuristic evaluations
(Catani and Biers, 1998; Rooden et al., 1999; Stanton and Stevenage,
1998). The articles discussed usability testing in three totally different
domains with very similar results.
In all three of these papers it was possible to (a) determine what heuristic
evaluators thought the usability problems would be, and (b) compare their
predictions with the problems that users actually had.
Taking together every problem that was either predicted by the evaluators
or experienced by users, the results showed that 36% (about 1/3) were
predicted problems that users actually had (hits). Some of these problems
were serious and some were trivial. About 21% were problems that users
had but that the heuristic evaluators missed. Finally, about 43% were "problems"
that were identified by the heuristic evaluators but did not turn out to be
problems at all (these are usually referred to as "false alarms").
In other words, of those identified as usability problems, 46% were hits
and 54% were false alarms (and about 20% of all problems were missed
altogether). If we
round off the numbers, we could conclude that when a heuristic evaluation
is conducted, about half of the problems identified will be true problems
and about half will not be problems. More specifically, for every true
usability problem identified, there will be a little over one false alarm
(1.2), and a little over one-half of a problem will be missed (0.6). If this
analysis is true, heuristic evaluators produce more false alarms and misses
combined than they have true hits.
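The arithmetic behind these ratios can be checked with a short sketch. The three percentages are the ones reported above; the variable and function names are hypothetical, chosen only for illustration, and the code assumes the three categories are shares of one combined pool that sums to 100%.

```python
# Shares of the combined problem pool, in percent, as reported in the text.
# Assumption: hits + misses + false alarms make up the whole pool (100%).
HITS = 36          # predicted by evaluators and experienced by users
MISSES = 21        # experienced by users but not predicted
FALSE_ALARMS = 43  # predicted by evaluators but not experienced by users


def share_of_identified(hits, false_alarms):
    """Return (hit %, false-alarm %) among the problems evaluators flagged."""
    identified = hits + false_alarms
    return 100 * hits / identified, 100 * false_alarms / identified


def per_true_hit(hits, misses, false_alarms):
    """Return (false alarms per hit, misses per hit)."""
    return false_alarms / hits, misses / hits


hit_pct, false_alarm_pct = share_of_identified(HITS, FALSE_ALARMS)
fa_per_hit, miss_per_hit = per_true_hit(HITS, MISSES, FALSE_ALARMS)

print(f"Of identified problems: {hit_pct:.0f}% hits, {false_alarm_pct:.0f}% false alarms")
print(f"Per true hit: {fa_per_hit:.1f} false alarms, {miss_per_hit:.1f} misses")
```

Rounded, this gives the 46% and 54% split among identified problems, and the 1.2 false alarms and 0.6 misses per true hit quoted above.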
I believe that the best way to initially find true usability problems
in a website is to use research-based heuristics. This obviously requires
that the evaluators understand the usability research. (For a more complete
discussion of this topic, see my May 1999
newsletter.) All heuristic evaluations should be followed up with
well-designed and professionally executed performance tests.