Statistical Thinking

Statistics: Are you Bayesian or Frequentist?

The fastest way to diagnose your statistical alignment

Cassie Kozyrkov
TDS Archive

What if I told you that I can show you the difference between Bayesian and Frequentist statistics with one single coin toss?

Before we go any further, the demonstration works best in video form, so don’t read the summary and spoilers below until you’ve seen it. In case some terms are unfamiliar, I’ve linked to friendly explanations to help you out.

Why these cat pics? On the left, it’s all about perspective. On the right, it’s all about quantities that don’t move around. But mostly, I needed something to shield your eyes from the spoilers below until you’ve seen the video.

Summary

In the video, there’s a moment where I ask you, “What is the probability that the coin in my palm is heads up?” The coin has already landed, I’m looking at it, but you can’t see it yet. The answer you give in that moment is a strong hint about whether you’re inclined towards Bayesian or Frequentist thinking.
Frequentist: “There’s no probability about it. I may not know the answer, but that doesn’t change the fact that if the coin is heads up, the probability is 100%, and if the coin is tails up, the probability is 0%.”
Bayesian: “For me, the probability is 50%! For you, it’s whatever it is for you.”

It is only by insisting that the parameter is not a random variable (Frequentist) that it makes any kind of sense to talk about your method’s ability to deliver the right answer. As soon as you let the parameter be a random variable (Bayesian), there’s no longer any notion of right and wrong. There’s only your personal perspective.

Frequentist: The parameter is not a random variable.
Bayesian: The parameter is a random variable.

One word, huge difference. Let’s take a closer look.

Frequentist versus Bayesian

Which words tell you who you’re dealing with?

What jargon tells you that you’ve stepped into their territory?
Frequentist: confidence interval, p-value, power, significance
Bayesian: credible interval, prior, posterior
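
To make the jargon a little more concrete, here is a minimal Python sketch that computes one interval of each flavor from the same coin-toss data. Everything in it is an assumption for illustration rather than something from the video: the made-up data (14 heads in 20 tosses), the function names, the normal-approximation (Wald) confidence interval, and the flat Beta(1, 1) prior behind the credible interval.

```python
import math
import random


def wald_confidence_interval(heads, n, z=1.96):
    """Approximate 95% confidence interval for a fixed, unknown heads probability."""
    p_hat = heads / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def beta_credible_interval(heads, n, prior_a=1.0, prior_b=1.0,
                           level=0.95, draws=100_000, seed=0):
    """Equal-tailed credible interval from a Beta posterior, estimated by sampling."""
    rng = random.Random(seed)
    # Beta prior + coin-toss data gives a Beta posterior (conjugate update).
    a = prior_a + heads
    b = prior_b + (n - heads)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    low_index = int((1 - level) / 2 * draws)
    high_index = int((1 + level) / 2 * draws) - 1
    return samples[low_index], samples[high_index]


# Hypothetical data: 14 heads in 20 tosses.
heads, n = 14, 20
confidence = wald_confidence_interval(heads, n)
credible = beta_credible_interval(heads, n)

print(f"Frequentist 95% confidence interval: {confidence[0]:.3f} to {confidence[1]:.3f}")
print(f"Bayesian 95% credible interval:      {credible[0]:.3f} to {credible[1]:.3f}")
```

The numbers can look similar, but they answer different questions. The credible interval is a statement about where the parameter probably is given your prior and the data, while the confidence interval is a statement about how the procedure behaves over repeated experiments.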

What are their goals?

What are they using statistics to change their minds about?
Frequentist: actions to take (default action, see this explanation)
Bayesian: opinions to have (prior belief)

What is the main difference?

Frequentist: the parameter is a fixed quantity (no probability about it)
Bayesian: the parameter is a random variable (no right answer)

What’s in it for you?

What do you gain by joining their way of thinking?
Frequentist: it makes sense to talk about your method’s quality and “getting the answer right”
Bayesian: intuitive definitions, e.g. credible intervals are what you wish confidence intervals were (but aren’t!)
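
One way to see what “method quality” means on the Frequentist side is to simulate it. The sketch below fixes a true heads probability, repeats the experiment many times, and counts how often the interval captures that fixed value. The true value of 0.3, the sample size, the function names, and the choice of the Wald interval are all assumptions made for illustration.

```python
import math
import random


def wald_interval(heads, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    p_hat = heads / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def coverage(true_p, n, repeats, seed=0):
    """Fraction of repeated experiments whose interval contains the fixed true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # Simulate n tosses of a coin whose heads probability is fixed at true_p.
        heads = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(heads, n)
        hits += low <= true_p <= high
    return hits / repeats


observed_coverage = coverage(true_p=0.3, n=200, repeats=5_000)
print(f"Intervals that captured the true value: {observed_coverage:.1%}")
```

The “95%” describes the procedure, not any single interval. It only makes sense to count hits and misses because the parameter is held fixed while the data vary.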

What do you give up?

What do you lose if you choose their side?
Frequentist: the core concepts are harder to wrap your head around (e.g. p-values and confidence intervals have counter-intuitive, wordy definitions) and lazy thinkers make a hash out of them frequent-ly.
Bayesian: you lose the ability to talk about any notion of “right answers” and “method quality” — there’s no such thing as statistically significant or rejecting the null. There’s only “more likely” and “less likely” …from your perspective.

If there’s no such thing as a fixed right answer, there’s no such thing as getting it wrong.

So, which one is better?

Wrong question! The right one to choose depends on how you want to approach your decision-making. For example, if you have no default action, go Bayesian. Without a default action, the Frequentist approach is less practical than the Bayesian approach unless you have special philosophical reasons for invoking the concept of TRUTH in your calculations.

(Note: those last three words are important. We’re not talking about the concept of truth in general, but rather about how it’s handled in the math that powers these approaches to statistics. The distinction between the two camps boils down to whether you treat the parameter of interest as a fixed constant or not.)

Okay… so which one is more objective?

Neither! They’re both based on assumptions, so they’re fundamentally subjective. See this article about assumptions.

The key difference is how they assist decision-making once the decision context has been framed.

Wait, what about sample size? Isn’t Bayesian the way to go with small data?

If you’ve been hanging out with the “Frequentist if there’s lots of data, Bayesian if there isn’t” folks, you might be sold on the idea that you should let sample size decide which camp to go with. Alas, the reasoning behind their advice gets wobbly if you poke it.

Yes, it’s true that Frequentists spurn baby datasets. If you’ve got more fingers than examples, they’ll almost surely tell you not to bother. (Learn more here.)

Yes, it’s true that if you take a Bayesian approach, you can proceed with as little as one (!) datapoint. The math checks out. Sure. You can do it.

…But *should* you?

Being allowed to proceed with a pittance of data might be a bug instead of a feature. There are circumstances where you definitely don’t want to be doing that. (Statistics isn’t alchemy. We’re not making gold out of thin air. There’s the same amount of data in one datapoint no matter which school of thought you pledge fealty to.)

The way to “require less data” is to make bigger assumptions (this holds for both philosophies)… so do take a moment to ponder the nutritional content of your conclusions when your main ingredient isn’t data but, essentially, some nonsense you made up. If you take yourself too seriously when working with tiny data, expert Bayesians and Frequentists alike will forget their differences long enough to join in a belly laugh at your expense.
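
To see how much of a one-datapoint conclusion comes from assumptions rather than data, here is a small sketch of a Bayesian update after a single coin toss that lands heads. The three Beta priors and the function name are hypothetical choices for illustration, not something from the video.

```python
from fractions import Fraction


def posterior_mean_heads(prior_heads, prior_tails, heads, tails):
    """Posterior mean of a Beta(prior_heads, prior_tails) prior after observing tosses."""
    a = prior_heads + heads
    b = prior_tails + tails
    return Fraction(a, a + b)


# Three hypothetical priors, each updated with the same single datapoint: one head.
priors = {
    "flat Beta(1, 1)": (1, 1),
    "fair-coin Beta(50, 50)": (50, 50),
    "heads-biased Beta(9, 1)": (9, 1),
}

results = {
    name: posterior_mean_heads(a, b, heads=1, tails=0)
    for name, (a, b) in priors.items()
}

for name, mean in results.items():
    print(f"{name}: posterior mean = {float(mean):.3f}")
```

The data are identical in all three cases, so any difference between the answers comes from the prior. With one datapoint, the assumption is doing most of the work.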

Cassie, you’re killing us here. Are you Bayesian or Frequentist?

Both! I choose based on how I’m framing my decision-making. It depends on whether the situation calls for choosing between actions or forming an evidence-based opinion.

Should I pick a side?

I’d advise against committing to just one camp (unless you’ve spent a few years thinking about the philosophy of statistics and you’re willing to die on this hill).

Honestly, it’s a little silly to declare yourself as one or the other unless you’ve pondered them very deeply. Having had the pleasure of doing my graduate work at Duke University (which is to Bayesian statistics approximately what the Vatican is to Catholicism), I noticed that the loudest loudmouths about the superiority of Bayesian statistics aren’t the professors… they’re the newbie students who are relieved not to have to memorize the definition of the weird Frequentist confidence interval anymore (the Bayesian credible interval is so much easier). The profs understand that “better” depends on what you’re trying to do. They spend a lot of time thinking in the Bayesian way because it fits the kind of decision approaches they’re interested in. So, my advice? Don’t pick a side. See them as two different approaches that fit two different styles of decision-making and reasoning, then leave yourself the option of using whichever one suits the mindset/context you find yourself in.

Learn more

Disclaimer

The purpose of the video demo is to have students confront their feelings about fixed vs random variables and to give them a clue about why newcomers to frequentist statistics have so much trouble with misinterpretations of p-values and confidence intervals. It does not teach how Bayesian stats is carried out in practice (and what the distribution is over), so when I do this in class, I immediately move from this to the concept of the prior. (If you want to nudge me to prioritize writing that up for you, retweets are the quickest way to my heart.)

Thanks for reading!

As always, what you do decides whose voices your community will hear. Please share good and useful writing on social media so it can rise above the rubbish. Inaction is the best way to kill an article. (Oh, and did you know that Medium lets you hit the like button up to 50 times for a standing ovation?)

If you want to say thank you, I appreciate shares and retweets. If you’re keen to read more of my writing, most of the links in this article take you to my other musings.

How about a course?

If you had fun here and you’re looking for an unboring leadership-oriented course designed to delight AI beginners and experts alike, here’s a little something I made for you:

Course link: https://bit.ly/funaicourse

Prefer to hone your decision skills instead of building your AI muscles? You can learn decision intelligence from me via this link to my free course:

Liked the author? Connect with Cassie Kozyrkov

Let’s be friends! You can find me on Twitter, YouTube, Substack, and LinkedIn. Interested in having me speak at your event? Use this form to get in touch.

Tangential humor

After all that, this small theatrical amusement may brighten your day.

Published in TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Written by Cassie Kozyrkov

Chief Decision Scientist, Google. ❤️ Stats, ML/AI, data, puns, art, theatre, decision science. All views are my own. decision.substack.com
