Data Types: Introduction
Back in 1946, a psychologist named Stanley Smith Stevens proposed that there are four distinct data types: ratio, interval, ordinal, and categorical/nominal. These data types classify variables and guide how one should analyze them.
We use this type of data to label, or
Because of these traits, we can only compute two types of analyses for categorical/nominal data: mode and frequency. The
We use this type of data when we are ranking, or ordering, things. For instance, ordinal data corresponds to things like where competitors place in a race. The numerical distance between the data points (e.g., first and second place) doesn’t mean anything. That is, the difference between first and second place on the podium doesn’t tell us anything about how much faster one person was than the other. It only tells us that one came before (or after) the other.
As with nominal/categorical data, there are limits on the types of analyses that can be conducted with ordinal data. However, in addition to calculating the frequency of any given value, ordinal values allow us to compute order-based statistics such as percentiles and the median, which is a measure of central tendency.
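To make this concrete, here is a minimal sketch using Python's standard library. The list of finishing places is hypothetical, invented purely for illustration. It shows the analyses that stay meaningful for ranked data: frequency, mode, and median.

```python
from collections import Counter
import statistics

# Hypothetical finishing places for one runner across seven races (1 = first place).
places = [1, 3, 2, 2, 5, 2, 4]

# Frequency: how often each place occurred (this also works for nominal data).
frequency = Counter(places)

# Mode: the most common place.
mode_place = statistics.mode(places)

# Median: the middle value once the places are sorted.
# median_low always returns an actual data point, which suits ranked data.
median_place = statistics.median_low(places)

print(frequency)
print(mode_place, median_place)
```

Note that the sketch deliberately avoids the mean: averaging places would assume the gap between first and second is the same size as the gap between fourth and fifth, which ordinal data cannot tell us.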
We use interval data when we care about the numerical difference between data points, but don’t have a true zero point to serve as a shared reference. For example, let’s say your two children get sick and you want to compare their temperatures. Cindy has a temperature of 103 degrees Fahrenheit, while Johnny has a temperature of (approximately) 98 degrees Fahrenheit. Interval data allows us to say that Cindy is 5 degrees Fahrenheit hotter than Johnny. However, it doesn’t permit us to say that Cindy is 5.1% hotter than Johnny.
This is because the Fahrenheit scale doesn’t have a true zero point, meaning a point on the scale below which no lower value is possible. Remember, Fahrenheit can go into negative values too. If, however, we were to measure Cindy and Johnny’s temperatures in Kelvin instead, which does have a true zero point, we could compare the two temperatures directly as a ratio (but this would no longer be interval data; it would be ratio data).
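The arithmetic behind this can be sketched in a few lines of Python, using the two temperatures from the example. The helper name `fahrenheit_to_kelvin` is our own, not something from a library. The point is that the difference is meaningful on the Fahrenheit scale, while a ratio only becomes meaningful after converting to a scale with a true zero.

```python
def fahrenheit_to_kelvin(deg_f: float) -> float:
    """Convert degrees Fahrenheit to kelvins (illustrative helper)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

cindy_f, johnny_f = 103.0, 98.0

# Differences are meaningful for interval data: Cindy is 5 degrees F hotter.
difference_f = cindy_f - johnny_f

# This ratio (about 1.051) is NOT meaningful: 0 degrees F is an arbitrary
# point on the scale, not the absence of temperature.
naive_ratio = cindy_f / johnny_f

# Kelvin has a true zero, so this ratio is meaningful (about 1.009).
kelvin_ratio = fahrenheit_to_kelvin(cindy_f) / fahrenheit_to_kelvin(johnny_f)

print(difference_f, round(naive_ratio, 3), round(kelvin_ratio, 3))
```

Measured against a true zero, Cindy is less than 1% hotter than Johnny, which is quite different from the figure the Fahrenheit numbers would suggest.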
The big bonus associated with interval data is that we can calculate all the primary measures of central tendency, which is crucial in usability research. Specifically, interval data allows us to calculate average scores, which is particularly relevant in within-subject and between-subject designs.
We use ratio data when there is a true zero point. Things like time, or number of errors on a task, are examples of ratio data. Since this type of data starts from a true zero point, we can compute ratios based on it. For example, we can say Participant 1 made 1.5 times as many errors as (or 50% more errors than) Participant 2.
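As a quick worked example, the following Python sketch computes such a ratio. The error counts are hypothetical, chosen so that the result matches the 1.5 figure in the text.

```python
# Hypothetical error counts for two participants on the same task.
errors_p1 = 6
errors_p2 = 4

# Because zero errors really means "no errors", the ratio is meaningful.
ratio = errors_p1 / errors_p2          # 1.5 times as many errors
percent_more = (ratio - 1) * 100       # 50% more errors

print(f"Participant 1 made {ratio} times as many errors "
      f"({percent_more:.0f}% more) than Participant 2.")
```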
What does all of this mean? It means that we have to treat these data types differently during analysis. Here is a handy table to remind you what you can do to analyze the different data types: