Two Kinds of Things an AI Gets Wrong

I spent a few weeks measuring where an AI confidently made things up — and it turns out "hallucination" wasn't one thing. It was two.


Every time you ask a modern AI a question, it might be right or it might be making something up. If you've used any of these tools for long enough, you've caught one doing it. A confident answer about a book that doesn't exist. A plausible-sounding source that isn't real. A legal case it invented wholesale — in 2023 two New York lawyers were sanctioned after submitting a court brief containing ChatGPT-generated citations for cases that did not exist.

We have one word for this: "hallucination." And that word makes it sound like one problem with one fix.

It's not — at least not in what I measured. I found two problems, and they look nearly identical from the outside — but they have very different shapes underneath. Telling them apart is the difference between trusting the AI when you can and catching it when you can't. (The formal write-up of the two-bucket split and the confidence-cliff measurement behind it lives in the hallucination taxonomy research post, which also has the scope limits — this split is the pattern I saw in my pipeline, not a universal theory of hallucination.)


The research project

I spent weeks on a research prototype: a voice assistant for video games. You held down a button, asked a question — "where's the next boss?", "what beats this enemy?" — and it answered through your headset. The whole thing fell apart the moment the AI got something wrong about the game you were actually playing, because you'd immediately know. So getting the AI to not make stuff up was the central problem.

The way I tested it: I had the AI generate a structured reference — a little wiki — for 30 different games, running each through the pipeline several times across different prompt variants (about 200 generation-and-grading passes in total). Then I had a second, more careful AI grade each one. Which claims were real? Which ones were invented? How bad were the invented ones? (The run-by-run cohort structure and per-game verdicts are written up in the data appendix.)
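In sketch form, the generate-and-grade loop looked something like this. The function names, the Verdict shape, and the scoring are illustrative stand-ins, not the actual pipeline's API:

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    claim: str
    is_real: bool  # did the grader judge this claim to be true of the game?

def generate_wiki(game: str, prompt_variant: str) -> list[str]:
    """Hypothetical: first model call; returns the claims in the generated mini-wiki."""
    raise NotImplementedError

def grade_claims(game: str, claims: list[str]) -> list[Verdict]:
    """Hypothetical: second, more careful model judges each claim."""
    raise NotImplementedError

def run_pipeline(games: list[str], variants: list[str], runs: int = 2) -> list[dict]:
    results = []
    for game in games:                  # 30 games
        for variant in variants:        # several prompt variants
            for _ in range(runs):       # repeated passes per variant
                claims = generate_wiki(game, variant)
                verdicts = grade_claims(game, claims)
                score = sum(v.is_real for v in verdicts) / len(verdicts) if verdicts else 0.0
                results.append({"game": game, "variant": variant, "score": score})
    return results
```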

After a lot of this, a pattern emerged that I hadn't expected.


Bucket One: The AI knows the game. It's just padding.

For some games, the AI clearly has deep knowledge. It knows Resident Evil Village inside and out — the characters, the setting, the weapons. But when I asked it to produce a list of twelve weapons for the game, it would give me ten correct ones and then invent a couple to hit the count.

A Magnum WCX. An F2 Sniper Rifle. Names that sound exactly right for the game's aesthetic but aren't real items.

This is the first kind of hallucination. The AI has the knowledge; it just has an urge to fill whatever shape it's been handed. If the question says "give me twelve," it gives you twelve, even if it only knows ten.

What's interesting about this bucket: the AI can act on its own uncertainty, if you ask the right way. When I added a single rule to the prompt — "If you aren't certain a name is real, omit it rather than invent" — the behavior changed.
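Concretely, it was one added sentence in the generation prompt. A sketch, with the surrounding instruction paraphrased; only the final sentence is the actual rule:

```python
WEAPONS_PROMPT = (
    "List twelve notable weapons from {game}, one per line. "
    # The single added rule; the instruction above is an illustrative stand-in.
    "If you aren't certain a name is real, omit it rather than invent."
)
```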

I tested this on 17 of the 30 games I'd been sampling — the overlap between the pre- and post-change runs. Four of them flipped from failing the quality check to passing it, with nothing changed except that one sentence. The AI went from padding with plausible fakes to producing shorter, honest, correct lists. Oblivion went from a 0.55 score (failing) to 0.82 (solid pass). Resident Evil Village went from 0.45 to 0.78. Silent Hill 2 Remake cleared the bar the same way, 0.55 to 0.78.

The AI knew the truth. It just needed permission to leave blanks.


Bucket Two: The AI doesn't know the game. It's guessing.

For other games, the same prompt rule did nothing.

Doom: The Dark Ages stayed at a 0.35 score. Black Myth: Wukong stayed stuck. Returnal, Metaphor: ReFantazio, Final Fantasy VII Rebirth — all parked at the same failure level whether I used the honesty rule or not.

When I looked at the mistakes on these games, they weren't the same kind of thing as the Resident Evil Village padding. They were deeper. In Control, the AI invented a boss named "Salvador" — not a real character in that game. It conflated two different characters into one person. In Silent Hill 2 Remake, it placed a famous final fight in the wrong location.

These aren't "I know ten things, I'm stretching to twelve" mistakes. These are "I don't actually know this game — I'm producing plausible-sounding nonsense and I can't tell the difference between what I'm making up and what's real."

And this is the crucial part: no prompt rule fixes this bucket. The AI can't omit what it doesn't know it's faking, because it doesn't feel uncertain — it feels like it's answering correctly. There is no "am I making this up?" signal inside the model for content this far outside its knowledge. It's pattern-matching on what the right answer might look like based on adjacent games it knows, and delivering the pattern as if it were a fact.

The fix for Bucket Two is going and getting real information — from a wiki, from a search engine, from a creator who actually plays the game. You can't prompt-engineer your way out of an AI that's confidently making things up about a topic it doesn't know.


Why this matters for anyone using these tools

Here's the uncomfortable part: from the outside, Bucket One and Bucket Two look identical. A confident AI answer about a book citation is a confident AI answer whether the book exists or not. The AI doesn't tell you which bucket it's in, because it doesn't know.

What you can do is reason about which bucket is more likely given the question (a rough version is sketched in code just after this list):

  • Mainstream, long-established topic. Probably Bucket One territory. The AI knows; if it's getting something wrong, it's usually specific padding details you can spot-check.
  • Recent events, obscure topic, niche community, very specific details. Probably Bucket Two territory. The AI has weak or no real knowledge; its confidence is not evidence of correctness. Verify.
  • Edge of the AI's knowledge cutoff. Bucket Two with extra confidence. The AI knows the topic existed but may be guessing at what happened in the most recent months or years.
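Compressed into a sketch, those checks look like this. The input flags are judgments you make about the question, not signals the model exposes:

```python
def likely_bucket(mainstream: bool, recent_or_niche: bool, near_cutoff: bool) -> str:
    """Reader-side prior for which failure mode to expect; not a model signal."""
    if near_cutoff:
        return "Bucket Two, with extra confidence: verify recent specifics"
    if recent_or_niche:
        return "Bucket Two: confidence is not evidence; verify everything"
    if mainstream:
        return "Bucket One: likely knows it; spot-check padded details"
    return "Unclear: treat as Bucket Two until verified"
```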

Practical upshot: spot-check specifically on the details that would be easy to fabricate — names, dates, quotes, citations. The general shape of the AI's answer tends to be directionally correct even when it's lying about specifics. Don't trust the specifics without verifying them yourself.


What this means for AI products

If you're building with these tools, or evaluating a product that uses them, the two-bucket picture tells you something important: "more data" and "bigger model" don't fix both problems the same way.

Bucket One — the padding bucket — is mostly a prompt problem. It responds to carefully worded instructions. A product that notices its model is padding and says "omit when uncertain" will see real improvements.

Bucket Two — the knows-nothing bucket — is an architecture problem. No prompt fixes it. The reliable solution is going to a source of truth at the moment the question is asked, either by fetching fresh information (web search, APIs) or by having a human expert (or the original creator) feed the AI the right knowledge. (The architectural case for fetching at runtime instead of pre-caching third-party content is laid out in the runtime-vs-harvest design pattern post.)
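In sketch form, that routing decision is small. The llm and fetch_source helpers below are hypothetical stand-ins for a model call and a retrieval call, and the known-games set stands in for offline eval results:

```python
KNOWN_GAMES = {"Resident Evil Village", "Oblivion"}  # games that passed offline eval

def llm(question: str, context: str | None) -> str:
    """Hypothetical model call; answers from context when one is given."""
    raise NotImplementedError

def fetch_source(question: str, game: str) -> str:
    """Hypothetical retrieval: a wiki page, search result, or creator walkthrough."""
    raise NotImplementedError

def answer(question: str, game: str) -> str:
    if game in KNOWN_GAMES:
        # Bucket One territory: let the model answer directly,
        # with the omit-when-uncertain rule in its prompt.
        return llm(question, context=None)
    # Bucket Two territory: no prompt fixes this, so ground the
    # answer in fetched, real information instead.
    return llm(question, context=fetch_source(question, game))
```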

A lot of AI products try to paper over Bucket Two with fluent-sounding answers. Some products actually solve it by routing to a source. Most of the "feels unreliable" experiences I've had with AI chatbots come from Bucket Two situations being handled like Bucket One — confident completions over knowledge the model doesn't have.

If you're evaluating an AI product: ask it a question about something specific and obscure you actually know the answer to. Watch whether it admits uncertainty, routes to a source, or produces an incorrect but confident-sounding answer. The last one is the red flag.


What I did about it

What all of this meant for the research project: stop trying to get the AI to generate answers about games it doesn't know. Let the AI confidently answer about the games it does know. For everything else, route to someone who actually knows — a creator who's made a walkthrough, a wiki curated by players, a search query that returns live information.

And when the AI uses someone else's knowledge to answer, show the source. Every answer that came from a specific creator's video should carry that creator's name, a link back to their channel, and a way to support them. An AI assistant that depends on creators' work to fill its knowledge gaps should route value back to those creators, not siphon value away from them.
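One way to make that non-optional in a product is to bake attribution into the answer's own type, so nothing leaves the pipeline without a source attached. The field names here are illustrative:

```python
from dataclasses import dataclass

@dataclass
class SourcedAnswer:
    text: str            # the spoken answer
    creator_name: str    # whose walkthrough or wiki entry grounded it
    source_url: str      # link back to their video or channel
    support_url: str     # a way to support them directly
```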

That's the shape the research points to. The measurement work I just described is part of why. When you understand that AI knowledge splits into "things the model actually knows" and "things the model is plausibly guessing about," the right architecture becomes obvious: let the model answer confidently about what it knows. Route elsewhere for what it doesn't. And when you route elsewhere, credit the source.

A lot of AI products currently lean the opposite way — they answer confidently across the full range and don't route users back to the sources whose content made the answer possible. I make the architectural case for why that's avoidable, and what the alternative looks like, in AI That Respects Creators; the disputes currently being litigated over training data without attribution (NYT v. OpenAI, Getty v. Stability AI) are one visible symptom of the same shape. The two-bucket data in this post is part of the case for why the alternative is worth the effort: once you see that Bucket Two answers have to come from somewhere, routing to the source stops being a nice-to-have and starts being the honest design.


See also AI That Respects Creators and the time I thought "more data" would fix my AI, or the deeper research notes: hallucination taxonomy, grounding schema alignment, runtime vs. harvest, and the data appendix.

Personal blog. Views and writing here are my own and do not necessarily reflect those of my employer or any organization I'm affiliated with. Side projects, written on personal time.