Lying with Data

Story & Bias

I love books about math, particularly those by Michael Lewis: Liar’s Poker, Moneyball, The Big Short, etc. The central friction of much of Lewis’ work is the disparity between story and economic valuation. In “Moneyball,” traditional baseball built teams based on player attributes like how attractive a player’s girlfriend is, rather than on statistics. In “The Big Short,” big banks had created a poor financial instrument in subprime mortgage bonds, which fed the housing market collapse and the 2008 financial crisis.

This central tension resonates with many analysts. Instead of resting on statistical analysis or economics, valuation in any given field (marketing, player evaluation, financial analysis, etc.) becomes irrationally biased by the effects of story. In corporate valuation, for example, major corporations weave narratives about themselves to shape how investors “feel” about the business, creating bias and increasing the value of shares.

Inevitably, an “invisible hand” corrects any given market. But no one can accurately predict how long a market will take to correct itself. In baseball, for example, it took over 20 years for teams to apply the sabermetric analysis developed in the works of Bill James. That delay causes incredible frustration for many analysts, because markets act irrationally in the meantime. Creative solutions can sometimes be derived to profit from that irrationality, but that’s not always the case.

Analysis & Bias

And yet all effective analysis is storytelling as well. Stories are what help us understand concepts. They take the abstract and codify the wisdom. Look at any religious institution, for example. Analysts must tell stories with the data - if they can’t, they lose out to stories told with subjective opinion. I want to lie a little with data.

Why? To illustrate the power of story.

Python Code

Don’t look at the numbers, just look at the diagrams.

[Figure: Company 1]
[Figure: Company 2]

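Which company is having a better time? It’s the same company. The data is identical; the only thing that changes between the two charts is the range of the y-axis.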

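# First script: default y-axis. matplotlib scales the axis to the data,
# so the rise from 3 to 3.8 fills the plot.
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 3.1, 3.2, 3.1, 3.4, 3.5, 3.6, 3.7, 3.6, 3.8])

plt.xlabel("Year")
plt.ylabel("Profit (in millions)")

plt.plot(ypoints)
plt.show()

# Second script: same data, but the y-axis is pinned to 0-5,
# so the same rise looks much flatter.
import matplotlib.pyplot as plt
import numpy as np

points = np.array([3, 3.1, 3.2, 3.1, 3.4, 3.5, 3.6, 3.7, 3.6, 3.8])

plt.xlabel("Year")
plt.ylabel("Profit (in millions)")
plt.ylim(0, 5)
plt.plot(points)
plt.show()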
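
To put a rough number on why the two charts feel so different, here is a minimal sketch that measures how much of the chart height the data occupies under each choice of y-axis. The helper name `visual_span` is made up for this illustration, and the "tight" case only approximates matplotlib's default scaling (it ignores the small margins matplotlib adds around the data).

```python
def visual_span(values, y_min, y_max):
    """Fraction of the y-axis height covered by the data's min-to-max range."""
    if y_max <= y_min:
        raise ValueError("y_max must be greater than y_min")
    return (max(values) - min(values)) / (y_max - y_min)


# Same profit series used in the two plotting scripts
profits = [3, 3.1, 3.2, 3.1, 3.4, 3.5, 3.6, 3.7, 3.6, 3.8]

# Tight axis: limits hug the data (an approximation of the default scaling)
tight = visual_span(profits, min(profits), max(profits))

# Fixed axis: the limits set by plt.ylim(0, 5)
zero_based = visual_span(profits, 0, 5)

# What the underlying numbers say: change from first to last value
growth = (profits[-1] - profits[0]) / profits[0]

print(f"tight axis: data spans {tight:.0%} of the chart height")
print(f"0-5 axis:   data spans {zero_based:.0%} of the chart height")
print(f"growth from first to last year: {growth:.0%}")
```

Neither framing changes the numbers. The growth figure is the same in both cases, and only the share of the picture it occupies moves, which is the part a reader who is "just looking at the diagrams" responds to.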

Every time we explain results or take an opportunity to value something, it’s all storytelling. Much like an analyst has to “fit” data to a machine learning model, the resulting data must “fit” into a story. An analyst has a responsibility to a company to represent the results of the data well enough to guide decision-making. They need to be storytellers.

Yet we get frustrated that other stories sometimes dominate a field. If it were as simple as presenting raw, tabular data, managers could come to their own conclusions. However:

  • Company management is typically not analytical, except perhaps around financial data (MBAs) and the field of expertise they come from.
  • Analysts are deep in the data. Daily. They are better placed to put the economics at play across various fields into context.
  • They are also more likely to detect patterns over time. When your life is looking at data, and you are analytically minded, you are more likely to “see” clearly than someone who maybe opens a weekly report in their mailbox.

And yet math-oriented people, the ones who are most often in the data, decry story. It’s an interesting paradox.