Every day in offices around the world, usability professionals ask and are asked this question: How many users do we need for our usability test? It's an important question. We want to find most of the problems, and especially the most severe ones, so we need to test enough people. But usability testing is expensive, and the cost increases with each participant, so we don't want to test too many, either.
On the one hand, synthesizing the received theoretical wisdom suggests that there is an answer to this question. And the answer is "5" (Virzi, 1992; Nielsen and Landauer, 1993). That is, based on a probabilistic formula, you will need to test 5 users to find about 85% of the problems that will trip up 1/3 or more of your users. The number 5 is very concrete. Practitioners like it. 5 is easy to remember.
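The probabilistic formula behind the number 5 is usually written as 1 - (1 - p)^n, where p is the chance that any one participant runs into a given problem and n is the number of participants. The sketch below is only an illustration of that arithmetic. It assumes a detection rate of p = 0.31 (close to the "1/3" above), and the function names are made up for this example rather than taken from the cited papers.

```python
import math


def proportion_found(p, n):
    """Expected share of problems hit at least once by n participants.

    Assumes every problem has the same per-participant detection
    rate p and that participants behave independently.
    """
    return 1.0 - (1.0 - p) ** n


def users_needed(p, target):
    """Smallest n whose expected share of problems found reaches target."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    # Assumed detection rate, roughly the "1/3" in the text.
    p = 0.31
    for n in range(1, 9):
        print(n, round(proportion_found(p, n), 3))
```

With p = 0.31, five participants give an expected share of roughly 0.84, which is where "about 85%" comes from. A smaller p pushes the required number of participants up quickly.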
On the other hand, this question gets debated every year at the CHI conference. You can count on it, like death and taxes. The same debate. Given that the UX community (re-)debates this every year, it seems that the wisdom has not been so well received.
That the number 5 has such staying power says something interesting about human memory and the way people reason. The 5-formula can work. But, like tossing a coin, it's probabilistic. If you keep flipping a fair coin over and over, it will come up heads about half the time. But it can also come up tails nine times in a row.
Similarly, if you run enough usability tests with 5 users, on average you will find most of the errors most of the time. But if you run only one test (or just a few) with 5 users, it's possible that you will uncover fewer errors than the formula projects (Spool and Schroeder, 2001; Faulkner, 2003; or, if you are less ambitious, there is the May 2004 newsletter).
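The gap between the long-run average and any single 5-user test can be seen in a small simulation. This is a rough sketch under stated assumptions, not a model of any real study: it assumes a hypothetical product with 20 problems, each hit by any one participant with the same probability of 0.31, and participants who are independent of one another. All names and numbers are illustrative.

```python
import random


def simulate_tests(n_users=5, n_problems=20, p=0.31, n_tests=2000, seed=0):
    """Return the share of problems found in each simulated test.

    Assumes every problem has the same detection rate p and that
    participants are independent; both are simplifications.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    shares = []
    for _ in range(n_tests):
        found = 0
        for _ in range(n_problems):
            # A problem counts as found if at least one user hits it.
            if any(rng.random() < p for _ in range(n_users)):
                found += 1
        shares.append(found / n_problems)
    return shares


if __name__ == "__main__":
    shares = simulate_tests()
    mean_share = sum(shares) / len(shares)
    print("average share found:", round(mean_share, 3))
    print("worst single test:  ", min(shares))
    print("best single test:   ", max(shares))
```

The average across many simulated tests sits near the formula's projection, while individual tests scatter above and below it. That scatter is the point of the coin-toss comparison.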
There are other challenges with the 5-formula. For instance, to calculate the number of testing participants you need, a priori you need to know how many problems there are to find. If you knew that, likely you wouldn't need to test to find them, eh?
Not surprisingly, the debate churned on in San Jose (CHI 2007). But this year, Lindgaard and Chattratichart (2007) threw down a different gauntlet. The obstacle to solving the problem, they said, is the question. "How many users" is the wrong way to think about it.
In usability testing, we are looking for mismatches between the site/app model and the user's mental model on the key and critical tasks. Framed this way, the criterion that determines how many problems get uncovered is how many tasks participants try, not how many participants there are.
To test their claim, Lindgaard and Chattratichart reanalyzed the usability testing data from CUE-4* (Molich, 2003 – Workshop Reference). Within that project, 9 highly experienced teams used think-aloud techniques to independently test the same site. The teams received identical input from the coordinators (site objectives, problem criteria, testing focus). Each team shaped their own testing plan and protocol, conducted the testing, and aggregated the findings into a pre-determined feedback format.
Lindgaard and Chattratichart looked for similarities and differences across the methods and findings reported by each team. Specifically, they were seeking relationships between test design (e.g., # users, # tasks) and number of problems identified.
Their study reports that there was no reliable correlation between the number of users tested and the number of usability problems uncovered. Testing more users did not ensure that more problems would be discovered. Further, although each of the 9 teams tested 5 users or more, they reported only 7-43% of the known problems, not the 85% predicted by the 5-formula.
In contrast, their analysis showed a significant positive correlation between the number of tasks evaluated and the number of problems uncovered. That is, the more tasks a team included in their testing protocol, the more problems they uncovered.
They conclude that other things being equal (e.g., quality of recruiting), the better predictor of the productivity of usability testing is the number of tasks participants (try to) complete, not the number of participants who try to complete them.
* The CUE Studies (Molich and Dumas, in press; Molich, Kaasgaard and Karyukin, 2004, among others) compare methods and findings of different teams conducting the same usability test. CUE findings show that different usability testing teams evaluating the same interface report different numbers of usability problems, often with very little overlap in the problems identified. There's clearly more to it than number of users.
Faulkner, L. Beyond the five-user assumption: Benefits of increased sample sizes in usability testing. Behavior Research Methods, Instruments & Computers, 35, 3, Psychonomic Society (2003), 379-383.
Lindgaard, G. and Chattratichart, J. Usability Testing: What Have We Overlooked? CHI 2007 Proceedings, ACM Press (2007).
Molich, R. & Dumas, J. S. Comparative Usability Evaluation (CUE-4). Behaviour & Information Technology, Taylor & Francis (in press).
Molich, R. & Jeffries, R. Comparative expert review. In Proceedings CHI 2003, Extended Abstracts, ACM Press (2003), 1060-1061.
Molich, R., Ede, M. R., Kaasgaard, K., & Karyukin, B. Comparative usability evaluation. Behaviour & Information Technology, 23, 1, Taylor & Francis (2004), 65-74.
Nielsen, J., & Landauer, T. K. A mathematical model of the finding of usability problems. In Proceedings of INTERCHI 1993, ACM Press (1993), 206-213.
Spool, J. & Schroeder, W. Testing Websites: Five users is nowhere near enough. In Proceedings CHI 2001, Extended Abstracts, ACM Press (2001), 285-286.
Virzi, R. A. Refining the test phase of usability evaluation: How many subjects is enough? Human Factors, 34, HFES (1992), 457-468.
I have for years read the debates over which number of users is statistically significant, and yes, the minimum of 5 has always been a safe bet. But seriously, I don't see how increasing the number of tasks is more beneficial. I am a designer for applications, for which I generally test a scope of tasks per feature, and there is no need for more tasks when my goal is to test a specific set of features that can be completed with a finite set of tasks. Also, the article references sites; in my experience the types of tests we perform to validate the usability of a site are often different from the types of measures used to test the usability of product applications. I think this distinction needs to be made.
Can you please comment on the selling of this idea to clients: three groups of 6-12 participants? This would be helpful because every different user group you recruit adds to the cost. Is it advisable to separate out common tasks across groups and special tasks per specific group? Somewhere it also connects to the maturity of usability practice within an organization. Your recommendations can help practitioners in companies sell this to their own management or clients. If an end-to-end software solution provider needs the bandwidth to address usability in projects, are there more automated tests or techniques that can be provided?
Human Factors International, Inc.
PO Box 2020
410 W Lowe Ave
Fairfield IA 52556