## What do these numbers really mean?

“Cold hard facts” like 32 degrees, 26 touchdowns, and 8 billion dollars trip us up. But what can be so confusing about simple numbers?

Numbers anchor our thinking. The run-up of mortgage interest rates drew the headlines rather than the typical monthly payment. Humans are relative thinkers and initial numbers frame our thinking.

There are also contextual clues to each number we see. Forty degrees can be cold or warm depending on the humidity, sunlight, wind, and precipitation, as well as our exertion. Ideal marathon conditions are for the runners, not the spectators.

Lastly, numbers represent distributions of outcomes. We’ve seen this with Aaron Rodgers’ touchdown tails, and again in two January 2023 news stories.

The University of Georgia football team was favored to win the national championship game by fourteen points. They won by fifty-eight.

Rather than a large error, we can think of the fourteen-point betting line as a fulcrum. That was the point that balanced bets between the two most common forecasts: a close TCU victory or a Georgia blowout.

Another is the estimate that ChatGPT is worth twenty-nine billion dollars. It’s not, said Ben Thompson. There’s a one-sixth chance it’s worth two hundred billion.

🔢

Numbers carry more meaning than we typically assign. Life’s numbers are presented by accountants – and we need to think like auditors.

Other posts in the numeracy series include handshake puzzles and birthday bets, the problem with hurricane categories, and white water whitewash.

There are many good books about these ideas like Tim Harford’s Data Detective and the new Covid by the Numbers by David Spiegelhalter who wants us as auditors to ask, why am I seeing this number?

## Handshake puzzles and Birthday bets

Warning: use these tools wisely. They can keep nieces and nephews busy for hours.

What is the sum of all the numbers from 10 to 1? That’s difficult.

One repeated theme – because it works! – is reframing. The same information but a new presentation changes our understanding.

What is the sum of these numbers: 10, 9, 1, 8, 2, 7, 3, 6, 4, 5? That’s easier. Reframe again, this time visually.

Each “what is the sum” question can be framed as a triangle. But reframe again, first to staircases, then as pairs.

Doesn’t this feel like magic? The universe presents this little nugget (What is the sum of all the numbers from 10 to 1?) and rather than slog through we skip over. Reframing numbers into shapes changes our tool from addition to multiplication. Magical. We went from brute force to clever pairing to formulaic: (n*(n+1))/2.
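Here is a rough sketch of that jump from brute force to formula. The function names are my own, made up for illustration.

```python
def sum_brute_force(n):
    """Add 1 + 2 + ... + n one number at a time."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_by_pairing(n):
    """Pair 1 with n, 2 with n-1, and so on: (n * (n + 1)) / 2."""
    return n * (n + 1) // 2


print(sum_brute_force(10))  # 55
print(sum_by_pairing(10))   # 55
```

Same answer, but the second version is one multiplication no matter how big n gets.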

Another question: In a group, how many handshakes must occur for everyone to shake everyone else’s hand?

Two people have one handshake. Five have ten connections.

Like triangle numbers, it’s almost the same math! Rather than counting the dots, we want the connections. Rather than (n*(n+1))/2, the formula is (n*(n-1))/2.

It’s not only handshakes to count but games in a round robin, cables to connect computers, and shared birthdays.
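A small sketch, with illustrative function names of my own, that counts the connections two ways: listing every pair, and using (n*(n-1))/2.

```python
from itertools import combinations


def handshakes_by_listing(n):
    """Count every distinct pair of people in a group of n."""
    return sum(1 for _ in combinations(range(n), 2))


def handshakes_by_formula(n):
    """The shortcut: (n * (n - 1)) / 2."""
    return n * (n - 1) // 2


for group_size in (2, 5, 10, 31):
    print(group_size, handshakes_by_listing(group_size), handshakes_by_formula(group_size))
```

The same function answers the round robin and cable questions, since each game or cable is just one pair.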

In a group of 31 people, what are the chances any two people share a birthday? Thirty-one is a good number because it frames our thinking. “That’s like one month, so one-in-twelve”.

Ha!

The chances are closer to three-in-four thanks to our connections.

There’s a 99.7% (364/365) chance two people have different birthdays: (.997)^1, one connection. The chance of five people all having different birthdays, (.997)^10, is 97%. Even the chance ten people all have different birthdays, (.997)^45, is 87%. (This treats each connection as independent, which is a close approximation.)

But keep going. Thirty-one people have 465 connections and only a 25% chance that every birthday differs.
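A minimal sketch of that arithmetic. The first function is the connections shortcut with the rounded .997; the second is the standard person-by-person calculation, added here only as a cross-check and assuming 365 equally likely birthdays. Both function names are my own.

```python
def all_different_by_connections(n, pair_differs=0.997):
    """Shortcut: raise the one-pair chance to the number of connections."""
    connections = n * (n - 1) // 2
    return pair_differs ** connections


def all_different_exact(n, days=365):
    """Cross-check: each new person must avoid every birthday already taken."""
    chance = 1.0
    for already_taken in range(n):
        chance *= (days - already_taken) / days
    return chance


for people in (2, 5, 10, 31):
    shortcut = all_different_by_connections(people)
    exact = all_different_exact(people)
    print(people, round(shortcut, 2), round(exact, 2))
```

The two methods land within a few points of each other for a group of 31, which is close enough for a bet with the nieces and nephews.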

Every day on Twitter, the joke goes, someone is the main character – and you don’t want it to be you. Something is always happening because of this birthday/games/handshake structure. It’s easier not to get wrapped up in “this headline” knowing there will always be headlines.

## It’s not the fall…

It’s not the fall that gets you, it’s the impact at the end.

The best metrics describe a state of the world. Hotness has three audiences. “Fahrenheit is basically asking humans how hot it feels. Celsius is basically asking water how hot it feels. Kelvin is basically asking atoms how hot it feels.” (Reddit)

Another is the contrast between American and Canadian avalanches. In the states, a medium avalanche is “relative to the path”. In Canada a medium avalanche “could bury a car, destroy a small building, or break a small tree.” The southern system expects the audience to be familiar with the area.

A third is calorie counts. Sure, bananas have calories but they can also be zero-point foods. Counting calories isn’t the point. Weight loss is the point, so what’s the best way to communicate information that leads to those actions?

About hurricanes: “The Saffir-Simpson Hurricane Wind Scale we use to rate hurricanes is based on only wind and doesn’t take into account the chief hazards which can be storm surge and flooding rains. It’s wholly inadequate. We need to go away from rating one through five based on winds. Hurricane Harvey stalled for days over Texas and caused a hundred billion dollar plus disaster (Harvey made landfall as a category four storm, but most of the rainfall and damages was as a tropical storm).” – Dr. Jeff Masters

Hurricanes need to be rated more like Canadian avalanches. If measuring wind speed rather than rainfall wasn’t enough, there’s another problem: the hundred-year storm.

Everywhere I’ve lived has a hundred-year storm. In Ohio (Southeast and Northwest) it was floods. In Florida it is hurricanes. Imagine a category five hurricane that hits Beach City once every hundred years. What chance is there for a storm of that level in the next thirty years?

Masters walks through this math. A one-percent chance each year is a ninety-nine percent non-chance. Multiply a ninety-nine percent non-chance thirty times and the hundred-year storm has a 26% chance of occurring in a thirty-year window.
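The same walk-through as a short sketch. It assumes each year is independent with the same one-percent chance, and the function name is my own.

```python
def chance_at_least_once(annual_chance, years):
    """One minus the chance that the storm misses every single year."""
    chance_of_missing_every_year = (1 - annual_chance) ** years
    return 1 - chance_of_missing_every_year


print(round(chance_at_least_once(0.01, 30), 2))   # 0.26
print(round(chance_at_least_once(0.01, 100), 2))  # 0.63
```

Even over a full hundred years the “hundred-year storm” is not a sure thing, and over a thirty-year mortgage it is far from negligible.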

It’s not the hurricane winds that get you, it’s the flooding afterward. Yet we measure the winds.

## Thinking paths and more

An athlete shoots 70%. If they shoot twice, what are the chances they make at least one? 🤔

Before answering, consider thinking. Daniel Kahneman has an entire book about it, Thinking, Fast and Slow. Fast thought is immediate. Slow is deliberate. Often the ‘thinking fast’ takeaway from Thinking, Fast and Slow is that slow is better.

That’s not the case. Lots of fast thought works well.

One problem with Kahneman’s book – which he admits, Kahneman is a scientist and when the evidence changes his understanding does too – is the social science replication crisis. Some studies don’t repeat. Or repeat quirkily. For example: Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Which is more probable? (1) Linda is a bank teller. (2) Linda is a bank teller and is active in the feminist movement.

There’s a lot in there. But our fast reaction goes something like: If this information is here it must be important. Answer number two. That’s how we think.

But take the same Linda is 31 years old… prompt and ask this question: There are 100 persons who fit Linda’s description. How many of them are: (1) Bank tellers? __ of 100 (2) Bank tellers and active in the feminist movement? __ of 100.

Phrased that way the conjunction fallacy goes away.

Thoughts are path dependent. Reframing changes the path.

An athlete misses 30% of the time. If they shoot twice, what are the chances they miss both? 🤔 Well they miss thirty percent of the time. To miss both it would be 30%*30%, which equals 9%. So to the original question, this athlete will make at least one more than ninety percent of the time.
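One way to see both paths at once is to list every outcome of the two shots. This is only a sketch: it assumes the shots are independent, and the function name is my own.

```python
from itertools import product


def outcome_table(make_rate, shots=2):
    """Map every make/miss sequence to its probability."""
    table = {}
    for sequence in product(("make", "miss"), repeat=shots):
        chance = 1.0
        for result in sequence:
            chance *= make_rate if result == "make" else 1 - make_rate
        table[sequence] = chance
    return table


table = outcome_table(0.70)
for sequence, chance in table.items():
    print(sequence, round(chance, 2))

miss_both = table[("miss", "miss")]
at_least_one = sum(c for seq, c in table.items() if "make" in seq)
print(round(miss_both, 2), round(at_least_one, 2))  # 0.09 0.91
```

The “make at least one” framing adds up three rows. The “miss both” framing needs one row and a subtraction.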

If riddles are a good proxy, there are two tools: intuition and presentation. Intuition is internal. How many mental models do we have? How numerate are we? What’s our (ongoing) education? Presentation is external. What are the norms? What’s the phrasing? All framing is relative so what is this relative to?

Let’s leave with one more. Historically category five hurricanes hit Beach City once every hundred years. What chance is there for a storm of that level in the next thirty years?

Other examples for our intuition: Birthday Bet, Simpson’s Paradox.

Also, this thinking and these riddles are courtesy of Michael Steiner’s podcast appearances. Sign up for Listen Notes and search him out. I enjoy [The Pathless Path](https://lnns.co/lGC0UYZr47A) & The Derivative.

## White water white wash

We like things we are good at and we are good at things we intentionally practice.

We practice better numeracy through examples like A+ BS HSA rates. The point there was that organizations choose favorable framing in absolute or relative numbers.

Numbers are just characters in a story.

LoTR has Frodo. Stranger Things has Eleven. Batman has Batman. Tim Harford’s advice for better numeracy is to ask who is telling me this story and why are these the characters?

A clever example comes from the August 2022 episode of Acquisitions Anonymous where the hosts discuss a Vancouver white water rafting company. Business pitches use numbers to tell a story about why a business is worth a lot of money. Like a job interview or a date, it’s a polished version.

This particular pitch used a blended SDE (seller’s discretionary earnings) multiple. Rather than value the business on elevated 2021 numbers, the seller’s broker included 2019 & 2018. That is a wolf in sheep’s clothing and co-host Bill D’Alessandro pulled away the mask.

Yeah, Bill begins, blended earnings are often good but the 2021 Covid-19 bump is so large it pushes the weighted average higher than any other year. Pre-Covid-19 the SDE was around three hundred thousand dollars but the weighted average is over five hundred thousand.

Averages, weighted or otherwise, work best with distributions like number of autos owned, Wordle guesses, or years of school.

Averages, weighted or otherwise, work terribly with distributions like financial wealth, number of testicles (an average of one), and movie revenue.

Stories work best with coherent characters. Number stories work best with coherent calculations. We are experienced with stories about people. We stop books, leave theaters, or stream something else if we don’t like the way things fit together.

We’ve so much less experience with numbers.

But now we have a little more. Thanks Bill.

## 1 math trick for better predictions

Warning, this is “I watched one YouTube video” level of expertise. Also, some graphs have truncated y-axes.

Predictions are fun. Will a die roll four or greater? Will it rain tomorrow? Will this company be worth more money tomorrow, next month, next year? An event does or doesn’t happen. We get to predict an outcome.

If an NFL team wins six of their first seven games how many games will they win in total? Well 6/7 is ~86%, and there are seventeen games, therefore they’ll win ~14.5 games. But in 2021 there was a team that won six of their first seven games and one math trick could predict it.

Pierre-Simon Laplace gives us the “rule of succession”. That sounds complicated but it’s simple: For any number of outcomes add one to the observed cases and two to the total cases.

Here are four coin flips: heads, heads, tails, heads. The observed rate for heads is 0.75 (3/4). The ‘Laplace’ rate for heads is 0.67 (4/6). Laplace’s addition shifts predictions away from ‘never’ and ‘always’. This is the secret. ‘Never’ and ‘always’ are rare for sequential events.
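A minimal sketch of the trick, run flip by flip. The function names are my own, and the last line applies the same formula to a six-of-seven start.

```python
def observed_rate(successes, trials):
    """What we have seen so far."""
    return successes / trials


def laplace_rate(successes, trials):
    """Rule of succession: add one to the successes and two to the trials."""
    return (successes + 1) / (trials + 2)


flips = ["heads", "heads", "tails", "heads"]
heads = 0
for flip_number, flip in enumerate(flips, start=1):
    if flip == "heads":
        heads += 1
    print(flip_number,
          round(observed_rate(heads, flip_number), 3),
          round(laplace_rate(heads, flip_number), 3))

print(round(laplace_rate(6, 7), 3))  # six wins in seven games: 0.778
```

After the first flip the observed rate already says ‘always’ (1.0) while the Laplace rate says 0.667.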

Here is what the Laplace rate looks like compared to the observed rate for eighteen coin flips.

Here is what the Laplace rate looks like compared to the observed rate for the “six of the first seven” football team, the 2021 Tampa Bay Buccaneers.

Laplace starts at .500. Tampa wins six of their first seven games (.857) but Laplace only increases to .778. Their final winning percentage was .765.

Then there’s the 2021 Detroit Lions, a team that lost their first eight games.

The Laplace rate doesn’t know anything. It doesn’t know coins are 50/50. It doesn’t know about Tom Brady. It doesn’t know the Lions are bad. It’s just a formula that slowly adjusts to extreme events.

Laplace (1749–1827) didn’t have the NFL, so he made predictions about something else, the sunrise. The observed rate is 1.00. The Laplace rate, after 10,000 observed sunrises, is 0.99990002. So you’re saying there’s a chance?

No. That’s a simple wrinkle. Laplace called the sunrise a special “phenomena” which “nothing at present moment can arrest the course of.”

Coin flips, dice rolls, and drawn playing cards are random and have an expected rate.

Sunrises are special phenomena and Laplace’s rate is less helpful.

Football outcomes are a mix. They’re like the sunrise, in that teams have inherent principles. They’re like coin flips in that predictions are difficult, a sign of randomness.
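To see how the rule treats the streaks above, here is a hedged sketch using exact fractions. The function name is my own, and the inputs are just the counts already mentioned: no data yet, eight straight losses, and 10,000 straight sunrises.

```python
from fractions import Fraction


def laplace_fraction(successes, trials):
    """Rule of succession kept as an exact fraction."""
    return Fraction(successes + 1, trials + 2)


streaks = {
    "no data yet": (0, 0),
    "zero wins in eight games": (0, 8),
    "10,000 sunrises in 10,000 days": (10_000, 10_000),
}

for label, (successes, trials) in streaks.items():
    rate = laplace_fraction(successes, trials)
    print(label, rate, round(float(rate), 8))
```

The formula never reaches exactly zero or exactly one. It has no idea which streaks are coin flips and which are sunrises, so that judgment is left to us.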

Math helps: relative vs absolute saving rates, people live longer the longer they live, what the mean age means, the vaccine friendship paradox, how many ants long is Central Park?, or how many rolls of toilet paper do the residents of Columbus Ohio use in a week?

Math can be simple. Technique (add one to the numerator, add two to the denominator) and a bit of explanation (extreme events are rare without explanatory phenomena) is all we need.

## Simpson’s Paradox

Alice farms carrots and corn. She plants 10 carrots (harvests 90%) and 100 ears of corn (harvests 75%).

Bob farms carrots and corn. He plants 1,000 carrots (harvests 85%) and 100 ears of corn (harvests 70%).

Though Bob harvests a lower percent of his carrots and of his corn than Alice, his overall harvest rate is higher (about 84% of everything planted versus about 76%). This is Simpson’s Paradox.

Wikipedia has examples of Simpson’s Paradox: UC Berkeley gender bias and batting averages, but it’s farming that grows my insight. Picture Alice with her backyard garden. She has a two acre lot. There’s a house, a shed, and maybe a pond. For her 110 seeds of carrots and corn Alice probably has some raised beds. Now picture Bob’s homestead with a thousand carrot seeds.

Another example.

There are two girls in the same high school, Kim and Abby. They take English 101, but with different teachers. Kim does well, earning 88/100 on the homework portion and 80/100 on the exam portion of the class, and Kim’s final grade is 84%.

Abby’s teacher assigns a lot(!) more homework. She earns 860/1000 on the homework portion and 75/100 on the exams, and Abby’s final grade is 85%. Kim did better on each section, but Abby’s final grade was higher.
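The farm and the classroom run on the same arithmetic, so one small sketch covers both. The function name is my own, and each pair is (successes, attempts) taken straight from the numbers above.

```python
def pooled_rate(parts):
    """Overall rate: total successes divided by total attempts."""
    successes = sum(s for s, _ in parts)
    attempts = sum(a for _, a in parts)
    return successes / attempts


alice = [(9, 10), (75, 100)]     # carrots, corn
bob = [(850, 1000), (70, 100)]   # carrots, corn
kim = [(88, 100), (80, 100)]     # homework, exams
abby = [(860, 1000), (75, 100)]  # homework, exams

for name, parts in (("Alice", alice), ("Bob", bob), ("Kim", kim), ("Abby", abby)):
    part_rates = [round(s / a, 2) for s, a in parts]
    print(name, part_rates, round(pooled_rate(parts), 3))
```

Alice and Kim win every part, yet Bob and Abby win overall, because most of their attempts sit in the part where everyone scores well.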

Okay. Put a pin in that paradox and consider this:

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Which is more probable: Linda is a bank teller [or] Linda is a bank teller and active in the feminist movement?

Okay. Last one.

There are 100 persons who fit the description above (that is, Linda’s). How many of them are:

• Bank tellers? [___] of 100
• Bank tellers and active in the feminist movement? [___] of 100.

What we’ve done is reframe problems. We’ve changed the story around the numbers and our understanding.

Simpson’s paradox is easier to understand thinking geometrically in terms of farm space or using familiar examples like school. Linda, via the conjunction fallacy, is easier to understand as counts out of one hundred rather than as probabilities. Rory Sutherland suggests solving life’s problems like sudoku puzzles. Look at it from one direction but if that doesn’t work try another. Use your current experience to find a future answer.

## Enhanced Savings Rates

This is from our HSA. It’s good copywriting. ‘5X’ is easy to understand. ‘You may be missing out’ is great too.

The chart excels as well. It’s easy to understand and those Enhanced Rates do look bigger. They look bigger because of level one numeracy.

We level one think all the time. It’s knee jerk and first blush. We see something and some combination of evolution and experience fit what we see with what we know. Big red ‘Sale’ signs are examples. We first compare the sale price with the previous price rather than the item’s intrinsic value. This makes sense as our first reaction is immediate, requires no additional effort, and is something we are used to doing because it mostly works just fine.

The posts here, about average, focus on this idea too. Average is easy to compute and conveys certainty about an uncertain (often heterogeneous) world. Average is level one numeracy but we can do better.

One way to get past this reactionary thinking is to change the “what we know” part of our lives. Books like The Data Detective (2021), How to Lie with Statistics (2010), Fooled by Randomness (2008), and Factfulness (2018) are wonderful.

A fast fix comes from Sir David Spiegelhalter. Don’t look at relative comparisons, look at absolutes. Rather than the relative rate, look at the dollar difference.

That’s what I did.

If someone saves the \$2,000 in an ‘enhanced’ HSA account they have sixteen more dollars after twenty years. A lot of years for not a lot of money. For accounts of ten-thousand dollars, the difference is almost eight hundred dollars (\$11,543 vs \$10,745). Fine, a Series I Savings Bond accrued that same dollar value in six months.
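A rough sketch of the relative versus absolute views, using only the two ten-thousand-dollar balances above. The annual compounding and the lack of further contributions are my assumptions, not terms from the HSA provider, and the function name is my own.

```python
def implied_annual_rate(start_balance, end_balance, years):
    """Back out the yearly rate that turns the start into the end balance."""
    return (end_balance / start_balance) ** (1 / years) - 1


start = 10_000
enhanced_end = 11_543
standard_end = 10_745
years = 20

enhanced_rate = implied_annual_rate(start, enhanced_end, years)
standard_rate = implied_annual_rate(start, standard_end, years)

print("relative: enhanced rate is", round(enhanced_rate / standard_rate, 1), "times standard")
print("absolute: extra dollars after", years, "years =", enhanced_end - standard_end)
```

A rate that is a multiple of another rate sounds large. The dollar difference spread over twenty years sounds like what it is.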

“Don’t look at relative comparisons, look at absolutes” is a good starting place – but there are further levels.

First is to think about the costs. The enhanced HSA rates are an annuity, likely with some new terms. There’s the switching costs too. That’s a potential headache and unwanted contract in exchange for not much money. We will pass.

Actual health rather than health savings is different.

For people 25-34, their chance of dying from Covid-19 is about the same as pulling the ace of spades from a shuffled deck – twice in a row.

For people 55-64, their chance of dying from Covid-19 is about the same as flipping heads eight times in a row.

For people 75-84, their chance of dying from Covid-19 is about the same as pulling any heart from a random deck of cards – three times in a row.

Those are low absolute risks but seriously consequential.

The world is complicated and messy. Not only that, but it changes too. Numbers are helpful, but we have to ask the right questions to start.

The Covid-19 odds are rough estimates. There are about forty million people in any ten-year age group. The number of deaths in the 25-34 group is 11,451. I divided the deaths by the size of the group to get the percent chance of death. Odds are multiplicative: three heads in a row are 12.5%, 0.5*0.5*0.5. Two aces of spades are one in fifty-two times one in fifty-two, or about 0.04%.
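Here is that back-of-the-envelope math as a sketch. The forty million group size is the same rough estimate as above, the cards are assumed to be drawn from a freshly shuffled full deck each time, and the function name is my own.

```python
def in_a_row(single_chance, times):
    """Chance of the same independent event happening several times running."""
    return single_chance ** times


deaths_25_34 = 11_451
group_size = 40_000_000  # rough estimate for a ten-year age group

covid_25_34 = deaths_25_34 / group_size
two_aces_of_spades = in_a_row(1 / 52, 2)
eight_heads = in_a_row(0.5, 8)
three_hearts = in_a_row(13 / 52, 3)

print(f"25-34 Covid-19 deaths: {covid_25_34:.3%}")
print(f"ace of spades twice:   {two_aces_of_spades:.3%}")
print(f"eight heads in a row:  {eight_heads:.3%}")
print(f"three hearts in a row: {three_hearts:.3%}")
```

The first two lines come out in the same neighborhood, a few hundredths of one percent, which is all the comparison is meant to show.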

## An hour wait at Disney

You are at Disney. You planned ahead. You open Touring Plans (theme park visit optimizing software). Here’s how the creator thinks about what you see.

“Let’s say we were trying to predict what (wait time) Disney is posting for Rock ‘n’ Roller Coaster at Disney’s Hollywood Studios. We have to do two things. We have to predict the number Disney is going to show on the wait time sign in front of the ride, and then we have to predict the actual wait time. We have to predict both because if we showed only our expected wait time and you walk up to the ride and see the posted wait time is sixty minutes and we are predicting five minutes, then you won’t believe our number.” – Len Testa, Causal Inference, October 2021

Like weather reports or repair times, Testa and his team generate more value by being less accurate.

In the fall of 2021 Disney released a similar feature to what Testa created: Genie. It will be an interesting TiVo problem.

## \$1 Toronto real estate

Average is like my reciprocating saw: never as useful as I expect. Part of the reason average sticks around is economics: it’s cheap to produce. Average is a crude tool, like with student loan debt, and often hides the heterogeneity of a situation.

We’re entering an era of precision. One Covid lesson has been the effect size of heterogeneity. At the macro level, the impact of Covid depends on time and place. At the micro level, the impact depends on age, immunity, and social network. Covid was (is?) difficult to judge because there are a lot of factors that need to be fitted together.

If we need precision we should probably think about distributions at least as often as we think about averages. An example is the periodic one dollar real estate listing. Yes, this generates attention, spins up the market mechanism, and might be the marketing magic an owner needs. But it also changes the distribution of offers without changing the average offer.

“When you give people a listing price they ask if it’s worth more or less and by how much, so they anchor at the listing. If you don’t have an anchor people build a valuation from first principles. The average (offer) doesn’t change but the distribution does. For a one dollar listing you get some really high rates and some really low ones. In the listing price you get distributions around what the asking price was. This is a world where the seller doesn’t care about the average, they only care about the top end of the distribution.” – Dilip Soman, The Decision Corner, October 2021

Maybe this is being too hard. Average, like the saw, has its uses. The aim here is to combine numeracy with psychology to get by in the world. That means presenting the ‘best’ wait times or predicting rain more often. Being numerate is understanding that the average life expectancy is 78, but if you make it to 65 you’ll probably live well past eighty.

“The average looks like 10-12 years lost due to Covid – but that’s an average of a distribution with a very odd shape, a highly skewed distribution, some people have lost forty years of life. The peak of the distribution is people who lost less than a year of life.” – David Spiegelhalter, Risky Talk October 2021