Catching Hallucination
Session 11.1 · ~5 min read
Hallucination Is Not Random
AI hallucination sounds like a psychiatric term. It is not. In production, hallucination means the model generated text that is factually wrong but presented with the same confidence as everything else. No hedging, no uncertainty markers, no red flags. The model states a false claim as though it were reciting the weather.
The first thing to understand: hallucination follows patterns. It is not evenly distributed across all output. Certain categories of claims are far more likely to be fabricated than others. Once you map those categories, you stop checking everything and start checking the right things.
AI Hallucination: Model output that contains factually incorrect information presented with no indication of uncertainty. Hallucination is not a bug in a single generation. It is a structural tendency that follows predictable patterns based on claim type and topic domain.
The High-Risk Claim Categories
Research from Lakera and benchmarking efforts like CCHall (ACL 2025) and Mu-SHROOM (SemEval 2025) consistently show that even the latest models fail in predictable areas. The categories below represent the zones where hallucination concentrates.
| Claim Category | Hallucination Risk | Example | Why It Happens |
|---|---|---|---|
| Specific statistics | Very High | "73% of marketers report..." | Model interpolates numbers from partial training data |
| Named citations | Very High | "According to a 2023 Harvard study..." | Model generates plausible-sounding but nonexistent sources |
| Dates and timelines | High | "Founded in 1987..." | Temporal facts are poorly anchored in model weights |
| Quotes and attributions | High | "As Warren Buffett said..." | Model reconstructs plausible quotes from context, not memory |
| Niche domain facts | High | "The API rate limit is 500 req/min" | Limited training data for specialized topics |
| Causal claims | Moderate | "This caused a 40% increase..." | Model conflates correlation patterns with causation |
| General knowledge | Low | "Water boils at 100°C at sea level" | Heavily reinforced across training data |
The pattern is clear: the more specific and verifiable a claim is, the more likely it is to be hallucinated. General statements are safe. Precise numbers, names, and dates are dangerous.
The Systematic Check Process
Checking every sentence in a 2,000-word article is not practical. Checking every verifiable claim in the high-risk categories is. The process: extract every verifiable claim, flag the ones that fall into high-risk categories, verify each flagged claim against an independent source, and record every confirmed failure in a log.
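The flagging step can be automated with simple pattern matching. A minimal sketch, using illustrative (not exhaustive) regex heuristics for the high-risk categories from the table above:

```python
import re

# Heuristic patterns for high-risk claim categories.
# These are illustrative starting points, not a complete detector.
HIGH_RISK_PATTERNS = {
    "statistic": re.compile(r"\b\d{1,3}(?:\.\d+)?\s*%"),          # "73%"
    "citation": re.compile(r"\baccording to\b|\bstudy\b", re.IGNORECASE),
    "date": re.compile(r"\b(?:19|20)\d{2}\b"),                     # "1987", "2023"
    "quote": re.compile(r"\bsaid\b|\bas .{1,40} put it\b", re.IGNORECASE),
}

def flag_claims(text: str) -> list[tuple[str, str]]:
    """Return (category, sentence) pairs for sentences matching a high-risk pattern."""
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for category, pattern in HIGH_RISK_PATTERNS.items():
            if pattern.search(sentence):
                flagged.append((category, sentence))
    return flagged
```

Every flagged sentence goes to verification; everything else is low-risk general prose. Tune the patterns to your own pipeline's failure modes as your log fills in.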
Building a Hallucination Log
Every pipeline hallucinates differently. The model you use, the topics you cover, and the prompts you write all affect where errors concentrate. A hallucination log tracks every confirmed hallucination across your production runs.
Over time, this log reveals your pipeline's specific failure modes. Maybe your setup hallucinates dates consistently but gets statistics right. Maybe it fabricates authors but nails technical specifications. The log turns general awareness into specific knowledge.
Structure your log as a simple table:
| Date | Session/Article | Hallucinated Claim | Category | Correct Information | How Detected |
|---|---|---|---|---|---|
| 2026-03-15 | Product review batch | "Released in Q2 2024" | Date | Released Q4 2024 | Manual check |
| 2026-03-15 | Product review batch | "According to TechCrunch..." | Citation | Article does not exist | Search API |
| 2026-03-16 | Industry analysis | "Market grew 23% YoY" | Statistic | Actual growth was 14% | Source verification |
After 50 logged entries, you will know exactly where your pipeline lies. That knowledge is worth more than any general advice about hallucination.
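Tallying the log is trivial to automate. A minimal sketch, assuming the log is kept as a CSV file whose header includes a `Category` column matching the table above:

```python
import csv
from collections import Counter

def category_counts(log_path: str) -> Counter:
    """Count confirmed hallucinations per claim category in a CSV hallucination log."""
    with open(log_path, newline="") as f:
        return Counter(row["Category"] for row in csv.DictReader(f))
```

After enough entries, `category_counts("hallucination_log.csv").most_common(3)` tells you which categories to check first on every future run.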
Cross-Model Validation
One effective technique is querying multiple independent models with identical prompts and comparing outputs. If Claude says a company was founded in 2015 and Gemini says 2017, you have a discrepancy that demands manual verification. Agreement between models does not guarantee accuracy, but disagreement reliably signals risk.
This is not foolproof. Models share training data, so they can hallucinate the same wrong answer. But for production workflows where you need to triage thousands of claims, cross-model validation catches the most obvious failures before human reviewers touch the content.
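The triage logic is simple to sketch. In the example below, the model clients are hypothetical callables (any wrapper that takes a prompt and returns an answer string will do); the point is the comparison, not the API:

```python
def cross_check(question: str, ask_fns: list) -> dict:
    """Send the same prompt to several independent models and flag disagreement.

    `ask_fns` is a list of callables (hypothetical model wrappers), each taking
    a prompt string and returning an answer string.
    """
    answers = [fn(question) for fn in ask_fns]
    normalized = {a.strip().lower() for a in answers}
    return {
        "answers": answers,
        # Agreement does not guarantee accuracy, but disagreement signals risk.
        "agree": len(normalized) == 1,
    }
```

Claims where `agree` is `False` go straight to the manual-verification queue; agreeing claims can be sampled at a lower rate.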
Further Reading
- Lakera: Guide to Hallucinations in Large Language Models (2026)
- Hallucination to Truth: A Review of Fact-Checking and Factuality Evaluation in LLMs, Artificial Intelligence Review, Springer (2025)
- Infomineo: Stop AI Hallucinations Detection, Prevention & Verification Guide (2025)
- A Hallucination Detection and Mitigation Framework for Faithful Text Summarization Using LLMs, Scientific Reports (2025)
Assignment
Take a 2,000-word AI-generated article on a topic you know well. Identify every verifiable factual claim: numbers, dates, names, events, statistics. Verify each one using search. Calculate the hallucination rate (confirmed false claims divided by total verifiable claims). Which claim categories had the highest error rate? Start your hallucination log with these entries.
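The rate calculation is one line, but guarding the zero case keeps batch scripts from crashing on articles with no verifiable claims:

```python
def hallucination_rate(false_claims: int, total_verifiable: int) -> float:
    """Confirmed false claims divided by total verifiable claims."""
    if total_verifiable == 0:
        raise ValueError("no verifiable claims to score")
    return false_claims / total_verifiable
```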