LLM Basics Lesson 3 of 3

Hallucination

What this lesson teaches

LLMs sometimes produce confident-sounding but completely wrong information. This is called "hallucination." Understanding why it happens is essential for working with AI safely.

Why LLMs hallucinate

Remember: LLMs predict the most likely next token. They don't have a concept of "true" or "false"—they only know what patterns look plausible based on training data.

Key insight: When an LLM doesn't know something, it doesn't say "I don't know." It generates what looks like a reasonable answer—because that's what a reasonable answer would look like in text.
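The mechanism above can be sketched with a toy next-token predictor. The probabilities below are invented purely for illustration; the point is that the model emits whatever continuation scores as most plausible, whether or not it is true.

```python
# Toy illustration (hypothetical probabilities): a next-token predictor
# scores continuations by plausibility alone; truth never enters the picture.

# Made-up probabilities for the prompt "The capital of Atlantis is"
next_token_probs = {
    "Poseidonis": 0.41,  # plausible-sounding, confidently wrong
    "Atlantica": 0.25,   # another plausible invention
    "Paris": 0.12,       # a real city, still wrong
    "unknown": 0.07,     # "I don't know" is a rare pattern in training text
}

# The model emits the highest-probability token, not the most truthful one.
best = max(next_token_probs, key=next_token_probs.get)
print(best)  # "Poseidonis" -- a confident answer to a question with no true answer
```

Note how "unknown" scores lowest: admissions of ignorance are rare in training text, so the model rarely produces them.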

Common hallucination types

  • Invented APIs: Functions or methods that don't exist but sound plausible
  • Fake citations: References to papers or documentation that were never written
  • Wrong file paths: Suggesting files or directories that aren't in your codebase
  • Confident errors: Stating incorrect facts with full confidence
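One cheap defense against the first type is to check that an unfamiliar name actually exists before trusting it. A minimal Python sketch (the method names here are chosen for illustration):

```python
# Check whether a suggested attribute or method really exists before using it.

def api_exists(obj, name: str) -> bool:
    """Return True if `obj` actually exposes an attribute called `name`."""
    return hasattr(obj, name)

print(api_exists(str, "upper"))    # True: a real string method
print(api_exists(str, "reverse"))  # False: sounds plausible, but Python
                                   # strings have no reverse() method
```

`str.reverse` is exactly the kind of invented API a model might suggest: lists have `.reverse()`, so the pattern looks right even though strings don't.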

How to spot hallucinations

Watch for these warning signs:

  • Very specific details you didn't provide (names, numbers, paths)
  • Claims about your codebase without having read the files
  • API calls or library functions you've never seen before
  • Overly confident explanations for things that seem uncertain

Rule of thumb: If the model gives you specific information you didn't provide—verify it. Especially file paths, function names, and external references.
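That verification can often be automated. Below is a rough sketch of checking two kinds of model-suggested specifics, file paths and function names; the example path and the `json.serialize` name are hypothetical suggestions, not real things.

```python
# A quick verification pass over model-suggested specifics.
import importlib
import os

def verify_path(path: str) -> bool:
    """Does this file or directory actually exist on disk?"""
    return os.path.exists(path)

def verify_function(module_name: str, func_name: str) -> bool:
    """Does this module really export a callable with this name?"""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return callable(getattr(module, func_name, None))

print(verify_function("json", "dumps"))            # True: json.dumps exists
print(verify_function("json", "serialize"))        # False: plausible, invented
print(verify_path("/definitely/not/a/real/path"))  # False: hallucinated path
```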

How to prevent hallucinations

  • Provide source material: Give the model actual code to work with, not descriptions
  • Ask it to verify: "Check if this file exists before modifying it"
  • Constrain the task: Smaller, specific tasks = less room for invention
  • Request citations: "Show me where in the code you found this"
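The "verify before modifying" idea can be made concrete as a guard in code. A minimal sketch, with hypothetical filenames, that refuses to touch a path unless the file really exists:

```python
# Guard-before-edit: only transform a file that actually exists,
# rather than silently creating one at a hallucinated path.
import tempfile
from pathlib import Path

def safe_modify(path: str, transform) -> bool:
    """Apply `transform` to a file's text only if the file exists."""
    p = Path(path)
    if not p.is_file():
        return False  # the path may have been guessed; don't create it
    p.write_text(transform(p.read_text()))
    return True

# Demo: a real temporary file vs. a path that was never created.
with tempfile.TemporaryDirectory() as d:
    real = Path(d) / "config.txt"
    real.write_text("debug = false")
    print(safe_modify(str(real), lambda s: s.replace("false", "true")))  # True
    print(safe_modify(str(Path(d) / "settings.txt"), str.upper))         # False
    print(real.read_text())  # debug = true
```

Returning `False` instead of creating the missing file keeps a guessed path from quietly becoming a real one.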

With Claude Code specifically, using tools like Read and Glob to actually examine your codebase reduces hallucination significantly—because the model is working with real data, not guessing.

Key takeaways

  • Hallucination = plausible-sounding but wrong output
  • LLMs don't know when they don't know—they just generate what seems likely
  • Always verify specific claims (file paths, APIs, facts)
  • Provide real context to reduce guessing