User Experience for a Better World published in The Journal of Electronic Commerce, Volume 12, Number 2
Below, we display a chart of the performance results for easy comprehension. The chart requires tabulation of results (see analysis, below).
Figure 1
Below, we display subject comments for the two tasks that all participants but one failed. Typically, subjects are not clear about the source of their confusion, so the designer or usability specialist must evaluate each comment for its design significance. Occasionally, a subject offered no comment. We present only two examples here.
13 Failed to "Change PIN" (Task 1 – Each bullet is from a different fail subject.)
13 Failed to "Review Fees" (Task 2 – Each bullet is from a different fail subject.)
Recall that we also collected statements regarding subjects' "overall" impressions. Note in the following that 6 of the 14 subjects gave favorable statements. They represent 43% of the subjects. Clearly, subjective impressions can be misleading even in the face of severe usability problems.
6 Positive Post-test "Overall Impressions" (Each bullet is from a different positive subject.)
8 Negative Post-test "Overall Impressions" (Each bullet is from a different negative subject.)
| Step 4. Collect Satisfaction Data | After the test protocol, each subject filled out a satisfaction questionnaire. (See next page.) Because the subject had just attempted 10 tasks, they could easily reflect on their subjective reactions. The questionnaire represents five categories of satisfaction (discussed below). We altered the original questionnaire (from other sources) to accommodate IVR technology. |
| Step 5. Analyze Performance Data | In this case, our goal is to show data indicating the extent of the IVR problems. The results guide whether to make design changes. The changes themselves presume expert knowledge of IVR design. We represent this phase of the test with the following summary.
Tasks
1. Change PIN
Overall Results
We scored any task as a "fail" if the subject used a different IVR option than the one the designer intended. Often, a subject reached a CSR and assumed that this was a planned event. Although the subject perceived a positive outcome, the IVR had failed. We also logged the menu level at which the subject failed to press the correct phone button. Notably, 40% of all the failures occurred on the first menu item – a prime target for improvement. The body of evidence consists of 140 "test events" (14 subjects, 10 tasks each). In our data analysis we learned that our subjects passed only 27.1% of them: 38 of the 140 events. Since 22 of those 38 passes occurred in only 2 of the test items, we restated the results. The 8 worst items merited a passing rate of only 14.3%: 16 out of 112 test events. Failure at this rate is expensive, because the CSR staff has to work hard to keep up with the callers asking for human help. |
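The pass rates follow directly from the counts reported above. The short Python sketch below reproduces the arithmetic. It assumes that each of the 10 test items contributed the same number of test events (140 / 10 = 14), and the variable and function names are illustrative only.

```python
# Counts taken from the performance summary.
TOTAL_EVENTS = 140        # all "test events"
TOTAL_PASSES = 38         # events scored as "pass"
NUM_ITEMS = 10            # tasks attempted by each subject
BEST_ITEMS = 2            # items where most passes occurred
BEST_ITEM_PASSES = 22     # passes in those 2 items


def pass_rate(passes: int, events: int) -> float:
    """Return the pass rate as a percentage."""
    return 100.0 * passes / events


# Assumption: every item has the same number of events.
events_per_item = TOTAL_EVENTS // NUM_ITEMS

# Restate the results without the 2 best items.
remaining_events = TOTAL_EVENTS - BEST_ITEMS * events_per_item
remaining_passes = TOTAL_PASSES - BEST_ITEM_PASSES

overall_rate = pass_rate(TOTAL_PASSES, TOTAL_EVENTS)
remaining_rate = pass_rate(remaining_passes, remaining_events)

print(f"Overall: {TOTAL_PASSES}/{TOTAL_EVENTS} = {overall_rate:.1f}%")
print(f"8 worst items: {remaining_passes}/{remaining_events} = {remaining_rate:.1f}%")
```

Separating the two best items from the rest keeps a few easy tasks from masking how poorly the remaining menu options perform.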
| Step 6. Analyze Satisfaction Data | We grouped and averaged data from the satisfaction questionnaire as follows. Where a question implied a negative response, the answer was mathematically converted to match the meaning of the group description. If you use the questionnaire, calculate the mean of the following questions for each category.
a. Learnability: 1, 6, 12, 8, 20 (high priority)
Global Metric: .3a + .3b + .1c + .1d + .2e
We established a yardstick of positive merit based on the nature of the scale. Recall that the subjects selected a number from 1 to 7, with 4 representing a neutral point. The next interval above the neutral 4 is 5. Therefore, we set a score of 5 or above as indicating a "positive" rating. (See Figure 2)
Figure 2 |
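The scoring procedure described in Step 6 can be sketched in Python as follows. This is a minimal illustration, not the authors' actual calculation. It assumes that negatively worded answers are converted by reflecting them on the 1 to 7 scale (8 minus the answer), which is a common convention but is not stated in the text. The sample answers, the choice of question 12 as a negatively worded item, and the means for categories b through e are all hypothetical.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 7   # response scale, 4 is neutral
POSITIVE_THRESHOLD = 5        # a score of 5 or above counts as "positive"

# Weights from the Global Metric: .3a + .3b + .1c + .1d + .2e
WEIGHTS = {"a": 0.3, "b": 0.3, "c": 0.1, "d": 0.1, "e": 0.2}


def reverse_score(answer: int) -> int:
    """Reflect an answer on the scale (assumed conversion: 8 - answer)."""
    return SCALE_MIN + SCALE_MAX - answer


def category_mean(answers, questions, negative=frozenset()):
    """Average one subject's answers for a category, converting negative items."""
    return mean(
        reverse_score(answers[q]) if q in negative else answers[q]
        for q in questions
    )


def global_metric(category_means):
    """Weighted sum of the five category means."""
    return sum(WEIGHTS[key] * category_means[key] for key in WEIGHTS)


def is_positive(score: float) -> bool:
    return score >= POSITIVE_THRESHOLD


# Hypothetical answers for the Learnability questions (1, 6, 12, 8, 20).
answers = {1: 6, 6: 5, 12: 2, 8: 4, 20: 6}
learnability = category_mean(answers, [1, 6, 12, 8, 20], negative={12})

# Hypothetical means for the other four categories.
means = {"a": learnability, "b": 4.0, "c": 5.0, "d": 6.0, "e": 3.0}
overall = global_metric(means)

print(f"Learnability: {learnability:.2f}, positive: {is_positive(learnability)}")
print(f"Global metric: {overall:.2f}, positive: {is_positive(overall)}")
```

Because the weights sum to 1, the global metric stays on the same 1 to 7 scale as the individual answers, so the same threshold of 5 applies to it.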