Learning AI Out Loud: EP 2 The Devil Wears ___
Your brain and your chatbot are running the same trick
The Devil Wears ___.
What word popped into your head as you read that phrase? Read it again. By the time your eyes hit the underscore, your brain had already produced an answer, weighed it against the alternatives, and ranked Prada above the rest. If I’d written “The Devil Wears Gucci”, “Hermes”, or “H&M”, your auto-complete mode would have rejected each one before the period.
That auto-complete your brain just pulled off, with almost no effort, is the same trick your AI chatbot runs every time it answers you. It closely resembles a 250-year-old idea called Bayes’ theorem, and once you can see it, three things click. You’ll build an intuition for what’s happening when your chatbot answers you. You’ll understand why it sometimes confidently lies. And you’ll catch your own brain doing the same thing, all day, every day, for free.
From words to tokens to prediction
Last week we cracked the first part of the AI illusion: your sentence becomes a string of numbers called tokens. That’s how the model reads your prompt.
As a quick refresher, if you run “The Devil Wears” through OpenAI’s tokenizer, it splits into four tokens with IDs that look something like [976, 76102, 486, 36108]. (The exact numbers depend on the model, but the principle is what matters: words in, numbers out.)
But reading is only part of the job. Once the model has its tokens, it still has to give you an answer, which it does by predicting the next word, then the next, then the next. Before we get into how models do that prediction, it helps to build an intuition for what prediction even means when there isn’t a single right answer.
“The Devil Wears ___” doesn’t have a true answer. It has a distribution of likely ones, weighted by what the model has seen before. For a model trained on internet text, that weighting leans heavily toward “Prada”, because “Prada” follows that phrase in countless places online. Models build that weighting from their training data. You build yours from movies, songs, headlines, conversations, general life experience. Same idea. Different source.
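To make “a distribution of likely ones” concrete, here is a minimal Python sketch. The counts are invented for illustration, not measured from any real corpus, and `next_word_distribution` is a hypothetical helper name.

```python
# Toy counts of which word followed "The Devil Wears" in an imaginary corpus.
# These numbers are invented for illustration only.
toy_counts = {"Prada": 950, "Gucci": 20, "black": 15, "Hermes": 10, "H&M": 5}


def next_word_distribution(counts):
    """Turn raw counts into probabilities that sum to 1."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


distribution = next_word_distribution(toy_counts)

# Print candidates from most to least likely.
for word, prob in sorted(distribution.items(), key=lambda item: item[1], reverse=True):
    print(f"{word:<8} {prob:.1%}")
```

Every candidate gets a share of the probability. “Prada” simply gets almost all of it.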
To feel what that means, you need to first notice that you do this too.
Run the experiment on yourself
Three fill-in-the-blanks. Don’t overthink them — just notice the word that appears in your head before you’ve decided to think.
1. Just keep ___, just keep ___
Swimming
Running
Going
If the blank felt obvious, ask yourself why. If I told you the words come from Dory in the movie Finding Nemo, does that change what you’d pick? Notice how context and what you already believe begin to matter. A model trained mostly on post-2003 English would most likely pick Swimming. One trained only on 1950s text, decades before the film existed, might well pick Going.
2. In a galaxy far, far ___ (one for a slightly older generation)
Away
Beyond
Gone
The phrasing is so specific that the prediction barely wobbles. This is what your chatbot “feels” when you give it a well-known prompt: near-total confidence that one token (or word) comes next.
3. Kanye West is ___
a visionary
problematic
a genius
who?
This one splits readers. There’s no shared answer. The word you picked says more about you than it says about Kanye — specifically, what you’ve observed about him over the years.
And here’s the punchline of the whole experiment: your prior beliefs about the world shape every prediction you make. A reader who only knows Kanye from his early albums picks differently from a reader who only knows him from recent headlines. Same blank, same evidence, different priors, different prediction.
A chatbot works the same way. One trained mostly on music-criticism sites would pick differently from one trained mostly on tabloid news.
Same engine. Different fuel.
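Here is a rough sketch of “same engine, different fuel” in Python. Both corpora are tiny and invented, and `most_likely_completion` is a hypothetical helper. It just counts what followed the prompt, which is far simpler than what a real chatbot does.

```python
from collections import Counter

# Two tiny invented "training sets". Each line is a sentence the model has seen.
music_site_corpus = [
    "kanye west is a genius",
    "kanye west is a visionary",
    "kanye west is a genius",
]
tabloid_corpus = [
    "kanye west is problematic",
    "kanye west is problematic",
    "kanye west is a genius",
]


def most_likely_completion(corpus, prompt):
    """Count what followed the prompt in the corpus and return the top completion."""
    completions = Counter(
        sentence[len(prompt):].strip()
        for sentence in corpus
        if sentence.startswith(prompt)
    )
    return completions.most_common(1)[0][0]


prompt = "kanye west is"
print(most_likely_completion(music_site_corpus, prompt))  # a genius
print(most_likely_completion(tabloid_corpus, prompt))     # problematic
```

The function never changes. Only the text it was fed does, and that alone flips the answer.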
Which brings us to the natural next question: if priors and evidence are doing all this work, can we actually write down the recipe?
Naming the recipe: Bayes’ theorem
What just happened, in your head, has three moving parts that fire in a specific order, very fast.
First, you had a prior: everything you’d absorbed before reading the prompt. Every text message, every Kanye headline, every Pixar film. Essentially all the pop culture you’ve absorbed through daily experience, accumulated over time.
Then, you received evidence: the prompt itself, the specific words on the screen. Kanye West is ___.
And your brain produced a prediction by combining the two. Not by looking up a stored answer. By weighing the new evidence against the old prior and picking the most likely completion.
Prior plus evidence equals prediction. That’s the engine. It runs in your head, and it runs in ChatGPT.
A small caveat for those deeper in the science: the real architecture behind frontier models is more complex (e.g., transformers, attention, optimisers) and isn’t strictly Bayesian. But what stays consistent is the objective: predict the next word, given everything seen before. We’ll build from this simple mental model toward the real architecture in future episodes.
Let’s make it real. As mentioned above, the model sees your prompt as tokens. Now add the other two ingredients.
The Devil Wears → [976, 76102, 486, 36108] (your prompt, as the model sees it)
Prior = the model’s training data. Every webpage, book, and forum post it has ever read, compressed into a giant weighted map of “what usually follows what.”
Evidence = your prompt, in token form: [976, 76102, 486, 36108].
Prediction = the most likely next token. The model scores every option in its vocabulary, finds that 170900 (Prada) sits far above the rest, and outputs it.
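The “score every option, output the top one” step can be sketched in a few lines of Python. The scores below are invented, and a real model scores its entire vocabulary rather than four words. Turning scores into probabilities with a softmax is the standard approach. Always taking the single top option is a simplification, since real chatbots often add some randomness to the pick.

```python
import math

# Hypothetical raw scores for a few candidate next words.
# A real model scores every token in its vocabulary; these values are invented.
scores = {"Prada": 6.0, "Gucci": 2.5, "Hermes": 2.0, "H&M": 1.0}


def softmax(raw_scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(raw_scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - highest) for word, score in raw_scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


probabilities = softmax(scores)

# The prediction is simply the candidate with the highest probability.
prediction = max(probabilities, key=probabilities.get)
print(prediction)
```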
The formal name for this recipe belongs to a Presbyterian minister called Thomas Bayes, who worked it out in the 1700s but never published it. After Bayes died, his friend Richard Price rescued the manuscript from his desk drawer, and the Royal Society published it in 1763. We now call it Bayes’ theorem. The mechanics don’t matter for our purposes, but the recipe does. Prior, evidence, prediction. Weigh, combine, predict.
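If you do want a peek at the mechanics, here is a small worked sketch using the Dory example from earlier. Every number is made up for illustration, and `bayes_update` is a hypothetical helper name. The rule itself is standard: multiply each prior by how well it fits the evidence, then rescale so everything sums to 1.

```python
# Bayes' theorem: P(word | evidence) = P(evidence | word) * P(word) / P(evidence)
# All numbers below are invented for illustration.

# Prior: how likely each word is to fill "Just keep ___" before any extra context.
prior = {"swimming": 0.5, "going": 0.3, "running": 0.2}

# Likelihood: how likely we'd be told "Dory says this line" if each word were the answer.
likelihood = {"swimming": 0.90, "going": 0.05, "running": 0.05}


def bayes_update(prior, likelihood):
    """Weigh the prior by the likelihood, then normalise so the result sums to 1."""
    unnormalised = {word: prior[word] * likelihood[word] for word in prior}
    evidence_total = sum(unnormalised.values())  # this is P(evidence)
    return {word: value / evidence_total for word, value in unnormalised.items()}


posterior = bayes_update(prior, likelihood)
for word, prob in posterior.items():
    print(f"{word:<9} prior {prior[word]:.0%} -> posterior {prob:.1%}")
```

With these made-up numbers, learning that the speaker is Dory pushes “swimming” from a coin flip to near certainty.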
Why your chatbot confidently lies
Notice that when the model picked Prada, it didn’t tell you how confident it was. It might have scored Prada at 65% likely, with Gucci at 10% and a long tail of other options below. You only ever see the winner. The chatbot’s voice sounds equally certain whether it’s at 99% or 25%.
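A tiny sketch shows why the confidence stays hidden. Both distributions below are invented, and `visible_answer` is a hypothetical helper that mimics showing you only the winner.

```python
# Two invented next-word distributions with the same winner but very different confidence.
confident = {"Prada": 0.99, "Gucci": 0.005, "Hermes": 0.005}
unsure = {"Prada": 0.25, "Gucci": 0.24, "Hermes": 0.23, "Chanel": 0.15, "H&M": 0.13}


def visible_answer(distribution):
    """Return only the top word, which is all the reader of the reply gets to see."""
    return max(distribution, key=distribution.get)


for name, dist in [("confident", confident), ("unsure", unsure)]:
    winner = visible_answer(dist)
    print(f"{name}: shows '{winner}', hidden confidence {dist[winner]:.0%}")
```

Both cases put the same word on your screen. Nothing in the output tells you which situation you are in.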
That hidden confidence is the seed of every confident lie your chatbot tells you. Because the model picks the most likely next token and not the most true one. Likelihood and truth are not the same thing.
Ask an early chatbot “Who won the 1923 Nobel Prize in Physics?” Even if it has only seen the question a handful of times in training, it will produce an answer. Confidently. Beautifully phrased. And sometimes completely wrong. Because it’s predicting which words are most likely to come next given the question. And plausible-sounding wrong often beats correct but rare in the likelihood race.
This is what people call hallucination, though confabulation is the more accurate word: the model isn’t lying, it’s plausibly making things up. That isn’t a malfunction. It’s the design working exactly as intended. The model was built to be plausible, not truthful. Knowing the difference is the difference between trusting your AI and using it.
We’ll come back to how the labs are dealing with this (tools, retrieval, agents) later in the series.
What’s next
Two layers of the AI illusion now cracked. The model breaks your sentence into tokens. Then it predicts the most likely next token using something that looks an awful lot like the Bayesian instinct your brain has been running for free your whole life.
The natural next question — what’s the dumbest possible version of that prediction machine that still works? Next week we build one. It’s called a bigram model, it’s genuinely silly, and it’ll make N-grams, transformers, and GPT feel less like magic and more like a series of clever upgrades to a very simple idea.
That’s the second crack in the AI illusion.
If this was useful, the best thing you can do is forward it to one person who has been quietly anxious about AI. The whole point of this series is to help people stop feeling crazy about a technology they don’t understand. That gets easier when more of us understand it together.
See you next week.

