Hallucination sounds like a malfunction, but it's actually the model doing exactly what it was trained to do — just in a context where that behavior goes wrong.
LLMs generate text by predicting likely next tokens. They have no ground-truth database to check against. When asked about something obscure or at the edge of their training data, they **generate a plausible-sounding continuation** based on patterns, and that plausible continuation is often wrong.
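A minimal sketch of this failure mode: a toy "model" that only knows a probability distribution over continuations and samples from it with no truth check. The context string and the distribution here are invented for illustration, not taken from any real model.

```python
import random

# Hypothetical next-token distribution "learned" from patterns.
# Note there is no flag marking which continuation is actually true.
NEXT_TOKEN_PROBS = {
    "The capital of Freedonia is": {
        "Paris": 0.40,      # plausible-sounding but wrong
        "Fredville": 0.35,
        "unknown": 0.25,
    }
}

def sample_next_token(context: str) -> str:
    """Sample a continuation weighted by probability -- no truth check."""
    dist = NEXT_TOKEN_PROBS[context]
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return random.choices(tokens, weights=weights, k=1)[0]

# The model always produces *something* fluent; verifying it is not
# part of the generation process.
print(sample_next_token("The capital of Freedonia is"))
```

The point of the sketch: "hallucination" is just this sampling step landing on a fluent but false continuation.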
**Why it's hard to fix:**
- The model can't distinguish 'I know this' from 'I'm guessing this'
- Training data has noise — the model may have learned incorrect facts
- Models are incentivized to sound helpful, not to say 'I don't know'
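One partial workaround for the first point is to inspect the model's token-level probabilities where an API exposes them: a flat, high-entropy next-token distribution is weak evidence the model is guessing. A sketch of that heuristic, using invented distributions for illustration:

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a next-token distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical distributions, for illustration only:
confident = [0.95, 0.03, 0.02]        # mass concentrated on one token
guessing = [0.30, 0.25, 0.25, 0.20]   # mass spread across alternatives

print(entropy(confident))  # low entropy: model strongly prefers one token
print(entropy(guessing))   # high entropy: a rough "I'm guessing" signal
```

This is a heuristic, not a fix: a model can be confidently wrong, so low entropy does not guarantee truth.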
**Active mitigation strategies:**
1. **RAG**: Ground responses in retrieved documents. If the answer is in the context, the model usually uses it correctly.
2. **Self-consistency**: Sample multiple completions, pick the most consistent answer. Reduces random errors.
3. **Chain-of-thought**: Prompting the model to reason step by step can reduce hallucinations on factual and multi-step tasks.
4. **Citations**: Force the model to cite sources. Uncited claims are more likely hallucinated.
5. **Constitutional AI / RLHF**: Train models to refuse rather than guess.
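Strategy 2 above can be sketched as a majority vote over sampled answers. Here `sample_answer` is a hypothetical stand-in for one model call at nonzero temperature, with a simulated answer distribution; a real implementation would call an actual LLM.

```python
import random
from collections import Counter

def sample_answer(question: str) -> str:
    """Hypothetical stand-in for one sampled LLM completion.
    Simulated: the correct answer dominates, random errors are scattered."""
    return random.choice(["42", "42", "42", "41", "43"])

def self_consistent_answer(question: str, n_samples: int = 25) -> str:
    """Sample n completions and return the most common final answer."""
    votes = Counter(sample_answer(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(self_consistent_answer("What is 6 x 7?"))
```

The intuition: hallucinations from random sampling noise tend to disagree with each other, while the answer the model "knows" recurs across samples, so the vote filters out scattered errors (though not systematic ones the model repeats every time).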
Newer models hallucinate measurably less: GPT-4 hallucinates far less than GPT-3.5 on factuality benchmarks. But the problem is not solved. For high-stakes applications (medical, legal), human review is still essential.
**Key takeaway:** Hallucination happens because LLMs predict plausible text, not true text. RAG, chain-of-thought, and citations are the best defenses.