Beyond "I Don't Know": Teaching LLMs Epistemic Humility
In January 2025, researchers at Mount Sinai Hospital tested six leading language models on a simple but crucial medical task: identifying fabricated details embedded in patient vignettes. The results were alarming. Across 300 physician-validated cases, hallucination rates ranged from 50% to 82.7%. DeepSeek’s model hallucinated 82.7% of the time. Even the best performer, GPT-4o, failed about half the time. But here’s the truly dangerous part: when these models were wrong, they sounded more confident, not less. An MIT study from the same month found that AI models use phrases like “definitely,” “certainly,” and “without doubt” 34% more often when generating incorrect information than when providing factual answers. ...