Means and Medians: When To Use Which
It is an age-old question in data representation: should I use the mean or the median?
Statistical analyses can be a slippery slope. Many human-centered research professionals need a deep understanding of statistics to analyze the results of their testing. What many people don’t realize, however, is that even simple statistics like the mean or median are frequently misused. Here’s some info to help keep you from being one of those offenders.
Means and medians are measures of central tendency. In the most general terms, the goal is to provide a quick-and-easy representation of your data. It’s a summary of the results, but bunched into one number.
The mean, of course, is better known as the average. It is the most well-known measure of central tendency. Because the mean is the single value that minimizes the total squared distance to the data points, it can represent an entire data set with the least amount of squared error, which helps explain why displaying the mean is so popular. But the major downside of the mean is its vulnerability to outliers and skewed data. Consider the following collection of usability test scores for two in-vehicle infotainment system prototypes. The metric of interest here is Time On Task (TOT), collected using a within-subjects research design.
The means of these scores are 36.45 seconds (A) and 29.75 seconds (B). Are those representative of the systems’ actual performances? Well, for System A, I’d say “sure”; 36 seconds or so looks like a pretty fair summary of TOT performance, given the frequency histogram. System B? Ehh, maybe. It sure looks like general performance in System B was even better than 29.75 seconds. I would expect something as low as 25 seconds even. But no, those two outliers up there at 94 and 98 seconds really change things up. This might be OK, but it totally depends on what the goal is with the data representation. Reporting the mean as 29.75 seconds might not be the best way to do it.
The median is the 50th percentile of the data. Half of the data points fall at or below the median, and half fall at or above it; with an even number of data points, it is the average of the two middle values. It’s right in the middle, and it is far less affected by outliers or skewed data than the mean is.
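To make the definition concrete, here is a minimal Python sketch of finding the median by hand. The function name `median_by_hand` and the sample values are made up for illustration; they are not the TOT scores discussed in this article.

```python
from statistics import median


def median_by_hand(values):
    """Return the middle value of the sorted data, or the average of the
    two middle values when the number of data points is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical task times in seconds (illustrative only).
odd_sample = [31, 22, 40, 27, 35]
even_sample = [31, 22, 40, 27, 35, 24]

print(median_by_hand(odd_sample))   # 31
print(median_by_hand(even_sample))  # 29.0

# The standard library gives the same answers.
print(median(odd_sample), median(even_sample))
```

Notice that only the position of each value in the sorted list matters. If the slowest time of 40 seconds were 400 seconds instead, both medians would stay exactly where they are.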
If we take a look at the same data set from above, we find that the medians are 34.5 seconds for System A and 23.5 seconds for System B. Because the scores for System A roughly followed a normal distribution, the mean of 36.45 and the median of 34.5 are quite similar. While most people would provide the mean for System A, in this situation it would be appropriate to do either.
We have a completely different story for System B. While the mean came in at 29.75 seconds, the median provides a much lower measure of central tendency because the outliers don’t affect it as drastically. As I interpret System B’s results, at least, the median provides a more accurate representation of how well users performed.
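The same effect is easy to reproduce. The sketch below uses a small, made-up set of task times (an assumption for illustration, not the System B scores) and adds two slow outliers to show how far each measure moves.

```python
from statistics import mean, median

# Hypothetical Time On Task values in seconds; NOT the article's data set.
typical = [20, 22, 23, 24, 25, 26, 27, 29]
with_outliers = typical + [94, 98]  # two participants who struggled

mean_typical = mean(typical)
median_typical = median(typical)
mean_outliers = mean(with_outliers)
median_outliers = median(with_outliers)

print(f"without outliers: mean={mean_typical}, median={median_typical}")
# without outliers: mean=24.5, median=24.5
print(f"with outliers:    mean={mean_outliers}, median={median_outliers}")
# with outliers:    mean=38.8, median=25.5
```

In this toy example, two extra data points drag the mean up by more than 14 seconds, while the median shifts by only one second.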
There is another perspective to take here also. I came across an interesting article regarding means and medians; the author, Dr. Nic of Learn and Teach Statistics, holds an unusual perspective: while the mean is more commonly used than the median, Dr. Nic reports being unable to find a situation where the mean is better than the median. The article can be found here.
I think Dr. Nic’s view is somewhat heavy-handed because there are indeed times when the mean is useful. But I see the point. Maybe it’s time the median got some glory; means can actually be a pretty crappy measure of central tendency, and medians aren’t as susceptible to the things that make means crappy.
Which one is better?
I’d argue that the mean can be better when you want to include the outliers. In our usability test example, most of the participants were able to complete the task within about 30-40 seconds. Some, as we have noted, took quite a bit longer and obviously had trouble finding their way through the task. If we were to 1) discard those two outliers, or 2) use the median as a representation of that data, those two participants’ data points would essentially be ignored. Just because only two participants struggled greatly on the task, do we no longer care about the data gathered from their experiences? By casting the outliers’ weight aside, the median simply isn’t capturing the whole picture. In fact, if anything, when it comes to usability, those two people’s data might be more important than the rest of the dataset. Obviously something happened to make those two users’ experience go awry! The goal is to make the experience better for everyone, right? Ignoring their data wouldn’t be fair to the analysis, or ethical for that matter.
So we come back to the essential question: which one should you use? The shortest answer I can provide is also probably the least helpful: “Choose the method that most accurately represents the data.” If outliers distort the correct perception of the data, then either use the median or toss the outliers and use the mean. If the outliers are relevant to the broader story, the mean may be the way to go.
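One way to weigh those options is to compute all three summaries side by side. The sketch below is only an illustration under a few assumptions: the data are hypothetical, the helper `summarize` is a name I made up, and outliers are flagged with the common 1.5 x IQR rule, which is one convention among several and not something this article's analysis used.

```python
from statistics import mean, median, quantiles


def summarize(times):
    """Return the mean, the median, and the mean after setting aside points
    outside the 1.5 x IQR fences (an assumed, conventional outlier rule)."""
    q1, _, q3 = quantiles(times, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [t for t in times if low <= t <= high]
    flagged = [t for t in times if not (low <= t <= high)]
    return {
        "mean_all": mean(times),
        "median_all": median(times),
        "mean_without_flagged": mean(kept),
        "flagged": flagged,
    }


# Hypothetical Time On Task values in seconds; NOT the article's data set.
times = [20, 22, 23, 24, 25, 26, 27, 29, 94, 98]
report = summarize(times)

for name, value in report.items():
    print(f"{name}: {value}")
```

Returning the flagged values, rather than silently dropping them, keeps those participants visible so you can decide whether their struggles are part of the story you need to tell.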
Here are a few quick-hit rules of thumb:
- The mean is better when the data follow classical assumptions (for example, a roughly symmetric, approximately normal distribution)
- The median is better when the data are skewed in one direction
- The median is better when there are outliers
- The mean is a parametric estimate and is most useful when you know the shape of the distribution
- The median is non-parametric, so you don’t need to know the shape of the distribution to use it
- The mean summarizes using all of the data
- The median is determined only by the middle value (or the two middle values when the number of data points is even)