Since well before 1968, when HAL 9000 (2001: A Space Odyssey) set the public's expectations for Interactive Voice Response systems (IVRs), researchers have been building systems that respond to voice commands. In their simplest forms, these systems are cool. But no matter how cool, technologies only really take hold when they serve a business imperative. Recent hybrid systems that combine grammatical and statistical models of speech recognition now interpret input reliably and accurately. These systems have become good enough that consumers encounter natural language input systems more and more frequently. They are replacing both human agents and the seemingly ubiquitous and annoying "press 8 for..." touch-tone menu systems.
One approach to developing an effective natural language processing system is to "naturally" limit the vocabulary of the customer. For certain environments, such as travel ticketing, the relatively closed set of words or tokens that the system needs to recognize simplifies the problem dramatically. For these domains, the computerized system can often negotiate the entire transaction. However, for more complicated dialogues, such as customer care and billing, the increased choices and substantially wider vocabulary make recognition less robust. Is there still a role for natural language recognition systems in these more complicated interaction environments?
Suhm and colleagues believe so. They compare the efficacy of voice interaction versus touch-tone input. The comparison focuses on a system that uses voice recognition and categorization just to route the call to the right real person (Suhm, Bers, McCarthy, Freeman, Getty, Godfrey and Peterson, 2002). In their experiment, callers who used the baseline touch-tone menu system indicated their initial choices by selecting their desired routing from a list of options. The researchers compared that group with a random subset of callers who were redirected to the speech-enabled IVR. Instead of hearing a list of options to select from, the speech group was instructed to "Please tell [the system] briefly, the reason for [their] call." (This prompt elicits more precise and interpretable responses than the more common "May I help you?" according to unpublished research by Suhm et al.) Based on key words in the caller's response, the system would categorize the caller's need and route the call to either a specific agent or an automated fulfillment system. Suhm and colleagues collected data from 95,904 callers who used the touch-tone IVR and 3,759 callers who experienced the natural language router.
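To make the routing idea concrete, here is a minimal sketch of keyword-based call routing. It is not the system Suhm and colleagues tested: the destination names, the keyword lists, and the `route_call` function are all hypothetical, and a production router would sit behind a speech recognizer and a statistical classifier rather than a simple word match.

```python
import re
from typing import Optional

# Hypothetical keyword table. The study does not publish its categories,
# so these destinations and key words are illustrative assumptions only.
ROUTES = {
    "billing": {"bill", "charge", "payment", "invoice"},
    "repair": {"broken", "outage", "repair", "static"},
    "orders": {"order", "upgrade", "install"},
}


def route_call(utterance: str) -> Optional[str]:
    """Return the destination with the most keyword hits, or None.

    None means no key word was recognized, which is the case where
    the caller would have to be re-prompted.
    """
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    scores = {dest: len(words & keywords) for dest, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)  # ties go to the first destination listed
    return best if scores[best] > 0 else None
```

Called with "I have a question about a charge on my bill", this sketch would pick the billing destination. An utterance with no key words returns `None`, which corresponds to the re-prompt case.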
Overall the accuracy rates for the first decision point were similar: Typically a well-designed touch tone system yields a 70-75% first choice accuracy rate; the speech-based system correctly categorized the call topic 78% of the time.
However, other benefits of the natural language IVR emerged immediately: 88.5% of callers invited to describe their reason for calling responded by doing so. In contrast, only 75.1% of callers to the touch tone system entered an initial selection. The remaining 24.9% immediately pressed "0" to escape the touch tone system.
Because it occasionally failed to recognize any key words in the caller's response, the speech-enabled system re-prompted callers more frequently than the touch-tone system. This re-prompting lengthened those calls in the speech-enabled system, a taboo outcome for call center optimization. However, despite this increase, the overall average routing time for the natural language system was less than half that of the touch-tone IVR (16.5 seconds vs. 35.9 seconds, respectively). Further, callers got to the right place on the first try: the natural language system was able to route callers to a more specific destination with fewer misdirects. This improvement is significant since every avoided misdirection saves the approximately 164 seconds that callers need to repeat their reason for calling to each new agent they are directed to.
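The timing figures can be combined in a short back-of-the-envelope calculation. The routing times and the 164-second misdirection cost come from the study as reported here. The `expected_time` helper and any misdirect rate passed to it are assumptions for illustration, since the article does not report misdirect rates.

```python
# Figures reported in the article (seconds).
TOUCH_TONE_ROUTING_S = 35.9
SPEECH_ROUTING_S = 16.5
MISDIRECT_COST_S = 164.0  # approximate time lost per misdirected call


def expected_time(routing_s: float, misdirect_rate: float) -> float:
    """Average routing time plus the expected cost of misdirection.

    misdirect_rate is a fraction between 0 and 1. It is a hypothetical
    input: the article gives no misdirect rates for either system.
    """
    return routing_s + misdirect_rate * MISDIRECT_COST_S


# Speech routing time as a fraction of touch-tone routing time.
ratio = SPEECH_ROUTING_S / TOUCH_TONE_ROUTING_S

# Average seconds saved per call on routing alone.
saving = TOUCH_TONE_ROUTING_S - SPEECH_ROUTING_S
```

Here `ratio` comes out just under 0.46, consistent with "less than half," and `saving` is 19.4 seconds per call. Because a single misdirection costs roughly ten times the average speech routing time, even a small reduction in the misdirect rate matters more than the extra seconds spent on re-prompts.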
Overall, Suhm and colleagues concluded that the natural language system improved the user experience, routing callers more accurately and more quickly to the right place. Users rated the speech system very positively, clearly preferring it to the touch-tone system in follow-up surveys.
In a similar study, Delude (2002) explored the interaction between aging and mode of input (touch-tone or voice). In her study, 22 university students and 22 seniors performed one task on each of 6 IVR systems (5 touch-tone and one voice activated system). The scenario trials were followed by a usability questionnaire.
All participants in her study completed at least one of the six IVR tasks. Interestingly, the distribution of success differed greatly between younger and older participants: 82% of younger participants completed 5 or 6 of the 6 tasks, compared with only 32% of older participants, and 50% of older participants could complete only 1 or 2 of the 6 tasks. This suggests that while many older individuals can navigate IVRs successfully, the demands of IVR interfaces highlight individual differences associated with cognitive aging.
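With only 22 participants per group, it can help to translate these percentages back into head counts. The sketch below assumes the reported percentages are rounded from whole participants. The counts it produces are inferred from the article's figures, not taken from Delude's paper.

```python
GROUP_SIZE = 22  # participants per age group, as reported


def to_count(percent: float) -> int:
    """Convert a reported percentage to the nearest whole participant."""
    return round(GROUP_SIZE * percent / 100)


younger_5_or_6 = to_count(82)  # younger participants completing 5 or 6 tasks
older_5_or_6 = to_count(32)    # older participants completing 5 or 6 tasks
older_1_or_2 = to_count(50)    # older participants completing only 1 or 2 tasks

# Everyone completed at least one task, so the remaining older
# participants must have completed 3 or 4 tasks.
older_3_or_4 = GROUP_SIZE - older_5_or_6 - older_1_or_2
```

Under that assumption, the figures correspond to 18 of 22 younger participants completing 5 or 6 tasks, against 7 of 22 older participants, with 11 older participants completing only 1 or 2 tasks and 4 falling in between.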
The types of challenges that users faced on the IVRs were similar for both younger and older participants. They included having difficulties with:
Among these, older individuals were most challenged by:
Most challenging for older individuals was that these difficulties tended to compound. Users who could not keep up with the choice alternatives tended to make errors that they could not recover from. In fact, overall, younger and older participants behaved similarly except that older individuals were not typically able to recover from errors.
For this study, the researchers predicted that participants would succeed more frequently on tasks that required fewer choices. This prediction held true for the touch-tone systems. However, the voice-driven system, which required the second-highest number of choices to complete, produced the highest success rate of all the tasks. According to Delude, "This exceptional result suggests that voice-activated IVRs do not follow the same rules as touch-tone IVRs."
Peissner (2002) suggests that the usability of natural language interaction systems will be determined by the interplay between:
So how can usability specialists decide the best approach to improving the user experience? Should they focus on tuning the voice recognition system, or on re-engineering/enhancing the dialogue?
To answer this question, usability specialists will have to develop methods for assessing the impact of both word recognition accuracy and dialogue design effectiveness. This will allow us to allocate our resources in the most effective way to enhance the overall usability of the system.
Delude, L. (2002). Automated telephone answering systems and aging. Behavior and Information Technology, 21(3), 171-184.
Roush, W. (2003). Computers that Speak your Language. Technology Review, June, 23-39.
Peissner, M. (2002). What the relationship between correct recognition rates and usability measures can tell us about the quality of a speech application. Paper presented at Work With Display Units (WWDU).
Suhm, B., Bers, J., McCarthy, D., Freeman, B., Getty, D., Godfrey, K., and Peterson, P. (2002). A Comparative Study of Speech in the Call Center: Natural Language Call Routing vs. Touch-tone Menus. Paper presented at ACM SIGCHI, Minneapolis, Minnesota.
Congratulations on your excellent write-up on an important issue in the design of telephone voice user interfaces in your UI Design Update, July 2003!
You discuss an important tradeoff: a more "directed" dialogue, which steers callers towards saying just a few words, vs. an "open-ended" dialogue, which (seemingly) opens up the caller to say anything they like. The truth is that even with such "open-ended" prompts, what callers really do say is within a quite well bounded subset of general language, and only that fact makes it possible to develop systems that accurately interpret responses to open-ended prompts.
So how can usability specialists decide the best approach to improving the user experience? Should they focus on tuning the voice recognition system, or on re-engineering/enhancing the dialogue?
I'd like to point out that some of the questions raised at the end of the essay have already been studied, some in our own research. Our answer to the questions raised is: both need attention, but the key is to obtain information from end-to-end calls, comprising both the complete user-IVR interaction and key pieces of information from any user-agent dialog that might follow. Refer to [Suhm, Peterson 2002: A Data-Driven Methodology for Evaluating and Optimizing Call Center IVRs, International Journal of Speech Technologies, Vol. 5, #1, pg. 23-37].
In the comparisons, was any attempt made to compare the success/failure rates of different models of telephones? Part of the appeal of voice recognition is that you get to keep holding the receiver in a constant position where you can hear the prompts. The touch-tone "press 4 / enter your PIN / spell your name" options become more difficult with phones whose buttons are integrated into the receiver, and I suspect the difficulty for the elderly (or anyone with impaired hearing) would be even greater. For example, my office phone system has a number you can call to reach an automated system where you are invited to spell the name of the person you are calling, using the phone's keypad. If the last name is not distinctive enough, you are instructed to continue spelling the first name. As soon as you have entered enough letters to make a unique pattern (an unpredictable number of letters), you get another prompt, which is very hard to hear if you are trying to spell a name on the buttons of your cell phone – holding it away from your ear – while riding a noisy subway. Failure to respond correctly may cause you to call the wrong person or to have to start over. On the other hand, voice recognition when using that same cell phone could be a problem if reception is poor.