Preventing Data Deception

We are continuing our theme of data transparency and effectiveness with ‘Data Deception,’ which I describe as using large numbers without context to deceive someone into believing you are more successful than you are.

I wanted to write about this after listening to an episode of The Freakonomics Podcast, in which they discussed data fabrication. Data fabrication is a more nuanced and less inflammatory term for fraud in academic research. It means faking data or “cooking the books.” The second part of the podcast explored why people engage in this behavior in academia. There were several reasons, including pressure to complete the research, outcome expectations, and, most interestingly, the potential upside of media exposure.

One example of fabricated data was a study claiming that people were less likely to lie if they signed a form at the top, before filling it in, rather than at the bottom, after. The concept could be applied to areas prone to fraud, such as tax returns, insurance claims, or home-improvement claims made when selling a house. A company tested the theory but found no difference in the results. They ran the test multiple times, assuming they had done something wrong, because the outcome contradicted the much-hyped research.

Eventually, a group of researchers conducted similar research, using test cases such as the distance people drive, since insurance rates are often based on mileage. The sample set consisted mainly of seniors, whose driving habits differ from those of younger people, and that discrepancy led many to question the validity of the data. Ultimately, the researchers did not admit to fabricating the data, but they acknowledged that it may have been flawed.

The podcast emphasized that there are many incentives to manipulate data and compromise its integrity. As digital marketers with specific KPIs to improve performance and demonstrate value, are we always as transparent and diligent with the numbers? As a judge for industry awards, I often encounter entries that avoid disclosing exact numbers, relying instead on percentages, heavy rounding, or adjusted timelines.

My wife and I often joke about the suspiciously tidy percentages we see, such as 100%, 50%, or 80%. We once attended a meeting in Japan where an agency presented impressive numbers for every ad it had produced. The client loved it and readily accepted the data, even though it seemed too good to be true. High-performing results are a win for everyone: the client gets to tell the boss the project hit a home run, and the agency gets another project. The problem for us was that we had seen the actual data. While the ads had impressive performance numbers, sales for those items through that channel didn’t improve. Why wasn’t anyone asking about that?

The data was presented against the stated goal of views and clicks rather than the implied goal of increasing sales. In a later meeting, a senior manager who cared primarily about sales asked the real question: how much revenue did this actually drive? No one had the answer.

In another instance, an agency team was showing 500% growth and declared, “We should celebrate this massive achievement with champagne.” As the client’s strategist, I had to ask, “What were the before and after numbers?” The person presenting had no idea what underlying data supported the growth claim; the stat was being used to motivate the client and create the illusion of high performance. Since no one had the data, the client suggested sending a note to their team to get the answer.

Near the end of the meeting, the senior manager asked for an update, and the senior account person offered to follow up afterward; he needed time to figure out how to spin it. Under pressure, he admitted that before it was one click, and now it was five clicks. So yes, clicks had grown fivefold, the claimed 500%, but it was still only five clicks. After reviewing the data more closely, we found that the $100,000 ad spend resulted in just over a thousand additional visitors, with no conversions to sales.
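To make the arithmetic concrete, here is a minimal sketch in Python using the figures from this anecdote (one click before, five after, a $100,000 spend, roughly a thousand extra visitors, zero conversions). The function and variable names are mine, purely for illustration; the point is that the headline percentage means nothing without the baseline and the outcome metric.

```python
def percent_increase(before: float, after: float) -> float:
    """Strict percentage increase: (after - before) / before * 100."""
    return (after - before) / before * 100

# Figures as reported in the anecdote above (illustrative only).
clicks_before, clicks_after = 1, 5
ad_spend = 100_000      # dollars spent on the campaign
extra_visitors = 1_000  # incremental site visits attributed to the ads
conversions = 0         # sales attributed to the ads

# 1 -> 5 clicks is strictly a 400% increase (five times the original);
# it gets loosely reported as "500% growth" because 5 is 500% of 1.
print(f"Click growth: {percent_increase(clicks_before, clicks_after):.0f}%")

# The numbers that actually matter to the business:
print(f"Cost per incremental visitor: ${ad_spend / extra_visitors:,.0f}")
print(f"Conversions: {conversions}, revenue recovered: $0")
```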

In this situation, everyone was incentivized to present and accept the data at face value. Analyzing and presenting data can be challenging: most people get bored easily, and everyone is eager to hear good news. Dressing the information up in unicorns and rainbows, with percentages and data taken out of context, gives everyone a warm, fuzzy feeling, but it can also come back to haunt you. The client hears fantastic news; the problem is that, at some point, that news will have to be reconciled with actual sales.

In these examples, there was not necessarily data fraud but rather data deception, where the most flattering interpretation of the data was presented. The challenge for everyone involved is to be mature about what we are measuring and how much detail and rigor we provide. I have discussed perceived economic value many times; in the scenario above, it was an increase in sales. When success is presented in terms of a contributory metric such as traffic, you are not demonstrating the outcome that actually matters. At some point, someone will say, “We spent this much money and got nothing back; we didn’t even recover what we spent.” By setting up crisp metrics and more focused goals, you can minimize the illusion of success created by percentages and other vanity metrics.