|
Although you remembered to use the "geometric mean" for the time-on-task average, you still failed to give a "margin of error" for your test results. Usability reports often fail to qualify a goal such as saving 20 seconds with a "margin of error".
1. Geometric mean: This first issue involves knowledge of how time-on-task data gets skewed.
Our everyday use of the "mean" (or "average") as the typical value assumes that about as many data values fall below the mean as fall above it. It also assumes that the values below and above sit at about the same distances from the mean, but in opposite directions. Technically, this describes data values that follow a normal "bell-shaped curve".
However, when we use time-on-task data, this assumption often fails. The data gets lopsided, with some people taking much longer to do the task than expected. We fail to get a bell-shaped curve when looking at the distribution of the data values.
This makes the mean (average) sensitive to the extra-slow performance of a few individuals. It affects the "average" much like the sale of a few $10 million homes in your small town raises the average price of all the homes by hundreds of thousands of dollars. Clearly our understanding of "average" needs more clarification in this case.
For example, as a consequence of these problems, newspapers express "average" home prices as a "median" price. Thus, readers easily visualize half of all homes being priced above the median and half below. We get an intuitive feel for the "practical" average when we use the median as the average.
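To make the home-price comparison concrete, here is a minimal sketch in Python. The sale prices are made-up numbers chosen only for illustration; they are not data from any real town.

```python
from statistics import mean, median

# Hypothetical small-town sale prices in dollars (illustrative only).
typical_sales = [200_000] * 98      # 98 ordinary homes
luxury_sales = [10_000_000] * 2     # 2 very expensive homes
all_sales = typical_sales + luxury_sales

mean_price = mean(all_sales)        # pulled upward by the two luxury sales
median_price = median(all_sales)    # the middle sale, unaffected by them

print(f"mean:   ${mean_price:,.0f}")
print(f"median: ${median_price:,.0f}")
```

With these assumed numbers, two luxury sales nearly double the mean while the median stays at the price of an ordinary home.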
However, for time-on-task data statisticians use an even better averaging method called the geometric mean.
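Simply put, the geometric mean averages the logarithms of the task times and then converts the result back to seconds. This gives the few outliers much less pull than they have on the ordinary mean. The details shouldn't worry us here. We'll give you a Web-based calculator to do that math.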
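If you would rather see the arithmetic than trust a calculator, the sketch below shows one way to compute a geometric mean in Python. The task times are hypothetical, and the variable names are ours, not part of any usability tool.

```python
import math
from statistics import geometric_mean, mean, median

# Hypothetical task times in seconds; one participant was very slow.
task_times = [30, 35, 40, 45, 50, 240]

arithmetic = mean(task_times)   # ordinary average, dragged up by the 240
middle = median(task_times)     # the midpoint of the sorted times

# Geometric mean by hand: average the logarithms, then convert back.
log_average = sum(math.log(t) for t in task_times) / len(task_times)
geometric = math.exp(log_average)

# The standard library (Python 3.8+) performs the same calculation.
assert math.isclose(geometric, geometric_mean(task_times))

print(f"arithmetic mean: {arithmetic:.1f} s")
print(f"median:          {middle:.1f} s")
print(f"geometric mean:  {geometric:.1f} s")
```

For this made-up sample the geometric mean lands between the median and the arithmetic mean, so the single slow participant no longer dominates the "average".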
All said, most of your colleagues would never know to ask THIS question about the geometric mean. So it becomes your responsibility alone to lead them correctly.
2. Margin of error: This second issue occurred when both you and your comptroller failed to ask about the "confidence level" in your time-on-task results.
Let's cover familiar ground first, then we'll apply it to our ROI calculations.
Think about how you read the results of voting polls. When 49% of prospective voters claim they will vote for Candidate A and 51% claim they will vote for Candidate B, which candidate is winning?
If someone says 51% wins the contest, you know they are probably wrong.
Newspaper polls typically come with a "margin of error", usually plus or minus (+/-) 3%.
In our example, the candidates are statistically tied because the +/- 3% margin causes the ranges around 49% and 51% to overlap.
49% plus 3% makes 52%, which reaches past the 51% polled for the other candidate. Thus, the poll shows no reliable difference between the two candidates.
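The overlap reasoning above can be sketched in a few lines of Python. The helper names `interval` and `overlaps` are hypothetical, and checking for overlapping ranges is the same rough rule of thumb used in the text, not a formal statistical test.

```python
def interval(share, margin):
    """Return the (low, high) range around a polled percentage."""
    return (share - margin, share + margin)

def overlaps(a, b):
    """True when two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

candidate_a = interval(49, 3)   # 46% to 52%
candidate_b = interval(51, 3)   # 48% to 54%

too_close_to_call = overlaps(candidate_a, candidate_b)
print(candidate_a, candidate_b, too_close_to_call)
```

Because the two ranges share the stretch from 48% to 52%, the poll cannot separate the candidates.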
But these statistics we read in the newspaper typically leave out an important point. Newspapers leave out the "level of confidence" that we get with a given margin of error.
They should state that these plus and minus ranges are calculated to give a 95% confidence level. That is, the interval between the low value and the high value encompasses the true value in about 95 out of 100 similar polls.
Put the other way around, about 5 similar polls out of 100 will produce a range that misses the true value.
As you can imagine, more subjects give you a better feel for the outcome. Fewer subjects give less confidence.
Note that to get a margin of error as small as +/- 3% at the 95% confidence level, you need 1068 subjects. Whew.
And 385 subjects get you a margin of error of +/- 5%, while 97 subjects get you a margin of error of +/- 10%. Thus smaller numbers of subjects make the margin of error larger.
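The subject counts above match the standard sample-size formula for a polled percentage, n = z² × p × (1 − p) / margin². The sketch below assumes a 95% confidence level (z = 1.96), the worst-case split p = 0.5, and a very large population; the article does not state its formula, so treat these as our assumptions. The function name `subjects_needed` is hypothetical.

```python
import math

def subjects_needed(margin, z=1.96, p=0.5):
    """Smallest sample size whose margin of error is at most `margin`.

    margin: desired margin of error as a fraction (0.03 means +/- 3%)
    z:      z-score for the confidence level (1.96 assumed for 95%)
    p:      assumed true proportion (0.5 is the worst case)
    """
    return math.ceil(z**2 * p * (1 - p) / margin**2)

for margin in (0.03, 0.05, 0.10):
    print(f"+/- {margin:.0%}: {subjects_needed(margin)} subjects")
```

Note how halving the margin of error roughly quadruples the number of subjects, because the margin appears squared in the denominator.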

|