Categories
Uncategorized

Average Lies

“Often an average is such an oversimplification that it is worse than useless.” – Darrell Huff, How to Lie with Statistics.

We don’t really think about averages. The average hospital cost for hepatitis A was $16,000 in 2017. The average student loan debt for North Carolina residents is $36,000. The average American says they’ll spend $142 on Valentine’s gifts. Men, on average of course, say they’ll spend more than women.

For some things in life, average is fine. When my daughters were born, the hospital gave us a growth chart for their height and weight. It showed deciles and right in the middle was average. Growth charts are simple. Height. Weight. Plot. On chart meant on track, physically at least.

Now my daughters are twelve and ten and wow, how things have changed. New parents can track their child’s sleep, diet, and movement, bowel or otherwise. And it’s not just parents. Everyone can track their steps taken, hours slept, and Spotify streams.

With technology, counting is easier.

With counts, averaging is easier.

Numbers are tools. Rather than bartering bananas for bread, we have dollars and cents. With numbers, stores count their banana bundles. With numbers, people have balanced budgets.

Numbers are tools. Like other tools, they take practice with feedback to build proficiency. I’m much more careful with the occasional use of power tools than the regular use of a chef’s knife. Numbers are like that. Well practiced and well used, numbers are a unique and powerful tool.

An example of numbers telling another story was the sabermetrics revolution in baseball. Smart teams realized that walks are better than hits, and that walks cost less to buy. Worth more, cost less. It’s like the successful Miller Lite advertising campaign: ‘tastes great, less filling’.

Decades later, sabermetrics happened in basketball with the insight that making one-third of three-point shots was the same as making one-half of two-point shots. Life, like sports, uses numbers more.
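The three-point insight is just expected value: points per attempt equal the make rate times the shot's value. Here is a minimal sketch of that arithmetic; the function name `expected_points` is only illustrative.

```python
from fractions import Fraction


def expected_points(make_rate, shot_value):
    """Average points scored per attempt: make rate times shot value."""
    return make_rate * shot_value


# Exact fractions avoid floating-point rounding in the comparison.
three_pointer = expected_points(Fraction(1, 3), 3)
two_pointer = expected_points(Fraction(1, 2), 2)

print(three_pointer == two_pointer)  # the two shots are worth the same per attempt
```

Any three-point make rate above one-third therefore beats a fifty percent two-point shooter on a per-shot basis.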

Numbers, though hidden in code, will become more prevalent in life and more important. 

Average, as numbers go, is often abused. There are many reasons, but just as technology has reduced the cost of tracking a baby’s bowel movements, average is used because its cost is low. It’s sixth-grade math. And it can hide important nuances.

For example, the average student loan borrower owed $28,000 in 2016. If we dig a bit deeper we find:

  • The median debt was $17,000.
  • The median for two-year degrees was $10,000.
  • The median for a four-year degree was $25,000.
  • One-in-four borrowers owed less than $7,000.
  • Only 7% of borrowers owed more than $100,000.

Those details are often omitted from the story. One poll showed that people viewed median debt of $17,000 as the “least bad figure about student loans”. Life is nuanced but numbers are not. Framing influences the way numbers are understood.

Thanks for reading.

Linda buys a bat and brand

There’s a quarrel in psychology research over Linda the banker. First some background. Most behavioral psychology is about crafting nearly identical situations with nearly identical composites of people who, despite the near identity, act in different ways.

One example is when employees are prompted with savings cues for their 401k. Imagine that alongside the annual corporate messaging about insurance, vacation adjustments, and outlook projections came a form that said “Did you know that your 401k contributions from October through December are eligible for a full employer match?” Employees who get the annual message with lines like that raise their savings rates three percent. Employees who don’t get that message don’t change their rate.

What anyone saves is dependent on their own choices, right? However, with the change of one line, it isn’t.

Okay, now let’s talk about Linda.

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Which is more probable?

  • Linda is a bank teller.
  • Linda is a bank teller and is active in the feminist movement.

When this original research was done, most people chose the second option.

And it’s wrong.

This ‘conjunction fallacy’ goes like this: there’s no way that there can be more bank tellers who are active in the feminist movement than there are bank tellers in total.
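The counting argument can be checked directly. The sketch below uses a made-up population of six people (the data is purely hypothetical) and shows that the group matching both descriptions can never outnumber the group matching just one.

```python
# Hypothetical population: each person is (is_bank_teller, is_feminist).
people = [
    (True, True),
    (True, False),
    (True, False),
    (False, True),
    (False, True),
    (False, False),
]

# Everyone who is a bank teller, feminist or not.
tellers = sum(1 for teller, _ in people if teller)

# Only those who are bank tellers AND active feminists.
feminist_tellers = sum(1 for teller, feminist in people if teller and feminist)

# The conjunction is a subset of the single condition, so it cannot be larger.
print(feminist_tellers <= tellers)
```

No matter how the hypothetical list is changed, the second count is drawn from inside the first, which is the whole point of the conjunction rule.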

This is mathematical logic. But it’s not how people think. When people hear Linda’s story they take the contextual clues that come along with it. If we could peek inside a participant’s mind we might see thoughts like this, ‘If you’re telling me all this stuff about Linda then it must be true that she is both a bank teller and active in the feminist movement.’

Any information that people get, people use, and numbers are a special kind of information.

Numbers carry an authority.

Home values increased.

Home values increased by 8%.

And numbers lead to fast thinking. 

In his best-selling book, Daniel Kahneman framed this idea in terms of thinking fast or thinking slow. For some things in life, Kahneman wrote, we tend to think fast. Brands are fast thinking.

pasted image 0

There’s no interpretation here.

Numbers are like brands. Though an 8% increase in home values is a complex computation of home sales, realtor surveys, incomes, and so on, we see that and think it’s true without really thinking.

Joining Linda in the pantheon of psychology phrasing is the bat and ball problem. It looks like this:

A bat and a ball together cost $1.10. If the bat costs a dollar more than the ball, how much does the bat cost?

Ok, now try it this way.

Bat + Ball = $1.10, the bat costs a dollar more than the ball.

Or, the same idea in a different way.

A Ferrari and a Ford together cost $190,000. The Ferrari costs $100,000 more than the Ford. How much does the Ford cost?

Each step down slows thinking. People see the bat and ball problem the same way they see brands or 8% increases: fast.
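Thinking slowly here means writing out the two facts: the prices sum to a total, and they differ by a known gap. Subtracting the gap from the total leaves twice the cheaper price. A minimal sketch of that algebra follows; `split_cost` is an illustrative name, and amounts are kept as whole cents or dollars so the division is exact.

```python
def split_cost(total, difference):
    """Return (cheaper, pricier) given their sum and their price gap.

    cheaper + pricier = total and pricier - cheaper = difference,
    so cheaper = (total - difference) / 2.
    Assumes whole-number amounts where (total - difference) is even.
    """
    cheaper = (total - difference) // 2
    return cheaper, cheaper + difference


ball, bat = split_cost(110, 100)              # amounts in cents
ford, ferrari = split_cost(190_000, 100_000)  # amounts in dollars

print(ball, bat)      # cents
print(ford, ferrari)  # dollars
```

The fast answers (a ten cent ball, a $90,000 Ford) fail the check that the two prices add back up to the total.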

Most of the numbers we encounter in life are like brands, the bat and ball problem, or Linda the banker: our default is to move quickly past them. But to get all the details we’ll need to slow down.

Baseline data

One of the coronavirus problems, one of any system’s problems, is a lack of good data. When data is precise and simple, it’s just a math problem. When it isn’t, we have to gamble, which is why we have to gamble with coronavirus.

In mid-March I started to feel kinda ill. Did I have it? Everything pointed to yes.

I’d traveled through airports. I felt congested and achy. The news talked more about coronavirus than allergies. Wait. What? The noise of the news made me overlook the color of my car, which was a nicely tinged yellow thanks to an above average pollen count in central Florida. 

My problem was that the ‘fifth vital sign’ had overtaken all the others. Or put differently, the only data I was using was highly subjective. Instead of continuing my confoundedness I started counting. 

IMG_5490.jpeg

Regularly tracking my temperature showed nothing to worry about.

The other potential problem at the time was toilet paper.

Well before we were storming stores and short on sheets, I had stocked up. But watching the paper pandemonium I had no idea how long our stockpiles would last. So, I counted. Our conservative count is two rolls per person per month. Prior to counting, I’d never have known.

Now do emergency funds

Good data is an objective tool to use alongside the subjective. If we kinda feel ill, we can take temperatures. If we see toilet paper rolling out of stores, we can use a rule of thumb. If we’re worried about finances, we can compare spending to savings. Good data is the base rate, our adjustments are the subjective. 
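Once something is counted, the computation is a single division: how much is on hand divided by how much is used per month. The sketch below applies it to both toilet paper and an emergency fund. Only the two-rolls-per-person figure comes from the count above; the household size, roll stockpile, savings, and spending are hypothetical numbers chosen for illustration.

```python
def months_of_supply(stock, use_per_month):
    """How many months a stockpile lasts at a steady monthly rate of use."""
    if use_per_month <= 0:
        raise ValueError("use_per_month must be positive")
    return stock / use_per_month


ROLLS_PER_PERSON_PER_MONTH = 2  # the conservative count from the post

# Hypothetical household, for illustration only.
household_size = 4
rolls_on_hand = 24
toilet_paper_months = months_of_supply(
    rolls_on_hand, ROLLS_PER_PERSON_PER_MONTH * household_size
)

# Hypothetical finances, for illustration only.
savings = 12_000
monthly_spending = 3_000
emergency_fund_months = months_of_supply(savings, monthly_spending)

print(toilet_paper_months, emergency_fund_months)
```

The result is the base rate; any subjective adjustment (guests, a job change) is layered on top of it.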

In any quantitative field three things matter: counts, computations, and communications.

Without accurate counts, we know nothing. 

Without accurate counts and computations, we infer nothing. 

Without accurate counts, computations, and communications, we do nothing. 

Sometimes we jump the gun. We build a model and share it to the world. #dataisbeautiful. Sometimes though we just need to start at the beginning and count. 

Thanks for reading. 

Parlay Maths

A gambling parlay is a bet where two or more things have to happen. A bet that you will have both coffee and eggs for breakfast is less likely to win, and so carries longer odds and a higher payout, than a bet on just one or the other.

And people love betting parlays. The most popular Super Bowl bet is the coin toss, and Americans bet seven billion dollars (legally) on the game. 

And casinos love people betting parlays. According to UNLV, sports books earn five percent on bets, except for parlays. On those bets casinos take 30%.

Why do bettors do so poorly? It’s a little too much psychology and a little too little numeracy. Bettors, said Rufus Peabody, love to bet for things to happen. It’s easier to imagine one outcome than all outcomes. It’s why the ‘no safety’ bet almost always has positive expected value (EV).

Bettors also don’t consider the numbers in the right light. Two independent seventy percent events both occur only about half the time (0.7 × 0.7 = 0.49). Let’s run with that.

According to Smart Air Filters, a t-shirt mask will stop 70% of airborne particles that are smaller than the coronavirus. That’s good. But what if we parlay masks?

If I wear a t-shirt mask and you wear a t-shirt mask, we’ve reduced the viral load more than ten-fold. Thirty percent of thirty percent is 0.09, or nine percent.

The same math that makes parlays good for Vegas and bad for gamblers is what makes masks good for all of us.
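Both calculations are the same multiplication. Here is a minimal sketch, assuming the events are independent (a simplification: real bets can be correlated, and two masks in series may not filter independently). The function name `parlay_probability` is illustrative.

```python
import math


def parlay_probability(probabilities):
    """Chance that every independent event in the list happens."""
    return math.prod(probabilities)


# Two 70% bets both hitting: a little under half the time.
both_bets_hit = parlay_probability([0.7, 0.7])

# Each t-shirt mask lets 30% through, so both must "fail" for a particle to pass.
passes_both_masks = parlay_probability([0.3, 0.3])

# How many times smaller the load is than with no masks at all.
reduction_factor = 1 / passes_both_masks

print(round(both_bets_hit, 2), round(passes_both_masks, 2), round(reduction_factor, 1))
```

For the bettor the shrinking product is the chance of winning; for the mask wearers it is the chance of a particle getting through, which is why the same math cuts in opposite directions.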

I wore mine to the store for the first time. It felt kinda foolish. But then I did the math.

UNLV explains the casino win percentage as “Win percentage, or win as a percentage of drop, AKA hold percentage, the percentage of money wagered that the casino kept.”

Colossal Comprehension

IMG_5475.jpeg

This is the earth.

Part of our quarantine education was to get outside and make some scale drawings of our solar system.

We made our earth one roadway wide, about twenty feet in diameter, then paced off two hundred yards and drew the moon. It was five feet wide. The ISS was seven inches from the earth’s surface.

It’s always challenging to consider the scale of the universe. It’s huge. It’s so huge that Mars was sixty miles away in our little universe.
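The chalk measurements come from one conversion factor: model feet per real kilometer. The sketch below reproduces them, assuming commonly cited round figures that are not in the post itself (Earth diameter 12,742 km, Moon 384,400 km away and 3,474 km wide, ISS about 408 km up). The Mars figure is the loosest assumption, since its distance from Earth swings widely; roughly 225 million km is used here as a typical value.

```python
EARTH_DIAMETER_KM = 12_742   # assumed real-world figure
MODEL_EARTH_FT = 20          # the chalk earth from the post

# Model feet per real kilometer.
FEET_PER_KM = MODEL_EARTH_FT / EARTH_DIAMETER_KM


def to_model_feet(kilometers):
    """Convert a real distance in km to feet in the driveway model."""
    return kilometers * FEET_PER_KM


moon_distance_yd = to_model_feet(384_400) / 3            # 3 feet per yard
moon_diameter_ft = to_model_feet(3_474)
iss_altitude_in = to_model_feet(408) * 12                # 12 inches per foot
mars_distance_mi = to_model_feet(225_000_000) / 5_280    # 5,280 feet per mile

print(f"Moon: {moon_distance_yd:.0f} yd away, {moon_diameter_ft:.1f} ft wide")
print(f"ISS: {iss_altitude_in:.1f} in up")
print(f"Mars: {mars_distance_mi:.0f} mi away")
```

Under these assumptions the results land close to the paced-off values: about two hundred yards to a five-foot moon, and Mars tens of miles down the road.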

Part of the reason Einstein marveled at compound interest is that scale is really hard to understand. Once things scale up or down past the human perspective we just don’t quite get it. This came up on two recent podcasts.

First, Peter Attia spoke with his daughter about the coronavirus. It was an excellent, simple, good-for-kids episode. So how big (or little) is the virus?

“If we were to cut one of your hairs, and you can barely see the edge when it’s cut, how many coronaviruses do you think we could line up on the tip of your hair when it’s cut?” Attia asked.

A thousand viruses. That’s beyond the human scale of understanding.

On the other end of the spectrum, and closer to the solar system situation, was Cade Massey’s longhorn lament.

“One of the things that frustrated me most when I talked with people was them saying ‘Well, you’re not going to get this if you’re young.’ We knew the probabilities are steeply related to age but there’s still a probability for every age group. Throw millions of people at a small probability and you’ve got some sick people. We just aren’t good psychologically with these kinds of probabilities.” – Cade Massey

The percentages for infection, hospitalization, and ventilation are remarkably small.

New York City houses eight million people and the metro area is home to twenty-one million. Projections note that only 0.27% will need beds, and only 0.063% will need ventilators.

Right now my sixth-grade daughter is learning percentages as parts of the whole. She answers questions like: “If sixty percent of a class of twenty-four are boys, how many children are in the class?”

That’s good sixth grade math but it gets hard with large numbers. One-fourth of a percent is really small but eight million is really large. How does someone make sense of that? We probably just need to think slow, not fast.
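Thinking slowly can be as simple as doing the multiplication. The sketch below applies the projected percentages to the eight million city figure; whether the projections were meant for the city or the wider metro area is not stated, so that pairing is an assumption.

```python
def people_affected(population, percent):
    """Number of people represented by a percentage of a population."""
    return round(population * percent / 100)


NYC_POPULATION = 8_000_000

beds_needed = people_affected(NYC_POPULATION, 0.27)
ventilators_needed = people_affected(NYC_POPULATION, 0.063)

print(f"{beds_needed:,} beds, {ventilators_needed:,} ventilators")
```

A quarter of one percent sounds like nothing until it becomes tens of thousands of hospital beds.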

 

Numbing Numbers

On Epidemic, Ronald Klain talked about how long a shutdown may last.

“I’m asked this question when I’m on TV all the time, what’s the date, what’s the date? But this discussion about the date is the wrong discussion, the question is, what are the preconditions that we need to have in place before we can reopen large swaths of economic activity?”

That’s a harder question.

The COVID-19 situation is like a Sudoku board with very few numbers filled in. If that’s a nine this might be a four, which makes that a two. Shit, that can’t work. There are so many interchangeable parts that it’s easier to ask, ‘what’s the date?’

To get away from ‘what’s the date’ questions we can add one more small step, asking why.

‘Why’ gets us to an answer.

For example, why is social distancing six feet? Is this a case like a power law where the bulk of the results come from one source? For example, when researchers looked at what size particles passed through what size fabrics, “0.02 micron Bacteriophage MS2 particles (5 times smaller than the coronavirus)”, a surgical mask stopped 89% of the particles, a vacuum bag 86%, and a cotton blend t-shirt stopped 70%. Not bad.

But when they doubled up, masks improved to stopping 89% and shirts to 71%. Small relative increases.

Is social distancing like that? Is six feet like wearing a mask made from a cotton shirt? Maybe not. Gas cloud research, as opposed to the aerosol or droplet research behind the six-foot guideline (work done in the 1930s), hints that viruses could travel twenty-seven feet in the air.

It’s hard to recommend anything other than ‘when we hear numbers we should ask why’, but there’s so much ambiguity that it’s all we can say with confidence. As for dealing with the here and now, here’s how to gamble with the coronavirus.

Median and Average Meanings

There’s a story about Bill Gates’s wealth and height you’ve probably heard. If not, let’s quickly share it here. If Gates were to walk into a room of you and twenty-five friends, the difference in average height and median height would be small.

However, the difference in average wealth and median wealth would be large. Gates’s wealth raises the mean because it’s relatively ginormous. Even in a room of millionaires, Gates’s presence changes the average from one million to roughly four billion. That’s a lot.

This story is helpful to keep in mind because averages hide nuances.
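The room of millionaires is easy to check. The sketch below assumes a round $100 billion net worth for Gates (my placeholder figure, not one the post gives) and puts him in a room with you and twenty-five friends, each worth exactly one million.

```python
from statistics import mean, median

MILLIONAIRE = 1_000_000
GATES_NET_WORTH = 100_000_000_000  # assumed round figure, not from the post

room = [MILLIONAIRE] * 26                  # you and twenty-five friends
room_with_gates = room + [GATES_NET_WORTH]

mean_before, median_before = mean(room), median(room)
mean_after, median_after = mean(room_with_gates), median(room_with_gates)

print(f"before: mean ${mean_before:,.0f}, median ${median_before:,.0f}")
print(f"after:  mean ${mean_after:,.0f}, median ${median_after:,.0f}")
```

One outlier drags the mean into the billions while the median does not move at all, which is exactly why it pays to ask for both.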

In 2016, the average student loan debt was $37,000. However, the median figure was $17,000 and one-fourth of all borrowers owed less than $7,000. Part of the reason the average is so far from the median is graduate school: one in four graduate-school debt-holders had more than one hundred thousand dollars in debt. My wife attended medical school, and I can attest to the amount.

This median approach might paint a rosier picture.

Retirement accounts show the same point, only in the opposite direction. The average balance is $100K whereas the median is $24K.

“Often an average is such an oversimplification that it is worse than useless.”

How to Lie

Average can become an If/Then word. If the average is presented, then we can attempt to find the median as well. Bill Gates will be our patron saint here. Imagine him, standing with a Diet Coke in hand, reminding you to find the median too.

Another fun part of you and twenty-five friends is “the birthday bet”.