I was listening to the radio the other day when a story came on about the difference in opinion among age groups. The nice, authoritative-sounding voice on the radio stated that Group A had a positive opinion about the topic in question, with 60% responding 'yes'. Group B was presented as having a negative opinion on the topic, with 42% of Group B responding 'no'. The way it was presented, verbally and with emphatic language stressing the clear difference between the groups, was very convincing. However, think about the numbers: one group at 60% yes, another group at 42% no. This being a yes or no question, the responses from these two groups are almost identical: 60% yes vs. 58% yes. I was fascinated at how easily the results could be misrepresented, without altering the data, to support a position.
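If you want to check a claim like this yourself, the trick is to put both groups on the same footing before comparing them. Here is a minimal sketch in Python, assuming a strict yes/no question with no "undecided" option; the helper name `share_yes` is my own invention for illustration.

```python
def share_yes(percent, answer):
    """Convert a reported share of 'yes' or 'no' answers into the share of 'yes'.

    Assumes every respondent answered either yes or no (no undecideds).
    """
    if answer == "yes":
        return percent
    if answer == "no":
        return 100 - percent
    raise ValueError("answer must be 'yes' or 'no'")


# Figures from the radio story: 60% of Group A said yes, 42% of Group B said no.
group_a = share_yes(60, "yes")
group_b = share_yes(42, "no")
gap = abs(group_a - group_b)

print(f"Group A: {group_a}% yes, Group B: {group_b}% yes, gap: {gap} points")
# prints "Group A: 60% yes, Group B: 58% yes, gap: 2 points"
```

Once both numbers are expressed as "percent yes", the dramatic difference shrinks to two percentage points.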
Ever heard the phrase "the data doesn't lie"? Don't believe it. Forget what you learned growing up about data being objective. It isn't. Like most information, it can be spun and manipulated to present different angles on a story.
The technique presented above isn't entirely uncommon. Here are a few ways data gets misrepresented, on purpose or not:
1. Inverting data to emphasize a point. Often used with yes/no or positive/negative data sets, this means presenting whichever side of the number supports a position. For example, let's say a poll shows that 55% of respondents were in favor of a referendum. Someone who wanted to make a case against that referendum could instead state that 45% of respondents were opposed, painting it in a negative light.
2. Playing with scale. When presenting data graphically, the choice of axis scale can have a big impact on how the data is perceived. Take a look at the two charts below. The chart on the left shows a sizable spike in the middle whereas the chart on the right is relatively flat. However, both charts were generated using the exact same data. The only difference is the range on the y-axis.
3. Misuse of pie charts. This is one of my favorites. In all fairness, I usually see this done less out of an attempt to mislead and more out of a lack of understanding of how the chart works and the nature of the data. As an example, let's say I conduct a survey and ask people to choose what type of salty snack they like: potato chips, pretzels, Doritos, or tortilla chips. They can choose as many as they like. The results are that 55% of respondents like potato chips, 20% like pretzels, 50% like Doritos, and 40% like tortilla chips. (Totally made-up data, by the way.) You may encounter people using a pie chart to represent the results, like the chart below. The problem here is that the responses are not mutually exclusive, so the percentages do not add up to 100%; here they add up to 165%. A pie chart only makes sense when the slices are parts of a single whole.
4. Accumulating non-cumulative data. Not all data can be added up. Say I want to measure how many people play basketball in my community each month this winter. In December I count 220 people, in January 230, and in February 180. Awesome. Now if I am asked how many people played basketball in total this winter, the answer is most definitely not 630. Many of the same people probably played in more than one month, and adding the monthly counts would count them more than once. The answer is ... you don't know.
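The last two items lend themselves to quick sanity checks before you chart or total anything. The sketch below is only an illustration: the snack percentages are the made-up ones from item 3, and the player rosters for item 4 are hypothetical names I invented, since a real count of 220 people tells you nothing about who they were.

```python
# Item 3: shares from a "choose as many as you like" question (made-up figures from above).
snack_shares = {"potato chips": 55, "pretzels": 20, "Doritos": 50, "tortilla chips": 40}
total_share = sum(snack_shares.values())
pie_chart_ok = total_share == 100  # pie slices must be parts of one whole

# Item 4: hypothetical monthly rosters, invented purely for illustration.
december = {"Ana", "Ben", "Cam"}
january = {"Ben", "Cam", "Dee"}
february = {"Cam", "Dee"}

# Adding monthly head counts counts repeat players once per month they showed up.
naive_total = len(december) + len(january) + len(february)

# A set union counts each person once, but it needs to know who played, not just how many.
distinct_players = len(december | january | february)

print(f"Shares sum to {total_share}%, pie chart appropriate: {pie_chart_ok}")
print(f"Sum of monthly counts: {naive_total}, distinct players: {distinct_players}")
```

With these toy rosters the monthly counts sum to 8 while only 4 distinct people played. With head counts alone, as in the basketball example, the distinct total cannot be recovered, which is exactly why the honest answer is "you don't know."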