Hallucination Analysis in Small Models

Evaluating the factual reliability of small models on non-Western knowledge using Wikidata-grounded QA

This project investigates where small language models hallucinate when answering factual questions about non-Western people, places, events, rivers, species, and cultural entities.

Current focus:

  • Generating factual question-answer pairs from Wikidata entity context
  • Stress-testing small models on underrepresented entities and locally important knowledge
  • Separating retrieval failure, knowledge gaps, and answer-generation hallucination in model outputs
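
The first and third items above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the entity dictionary is a hand-written placeholder shaped loosely like simplified Wikidata claims, the question templates are invented for illustration, and the names `QAPair`, `generate_qa_pairs`, and `triage` are hypothetical. Only the property IDs (P17 country, P403 mouth of the watercourse, P19 place of birth) are real Wikidata identifiers. The triage rules are one possible heuristic, assuming substring matching is an acceptable stand-in for answer grading.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed question templates keyed by real Wikidata property IDs.
PROPERTY_TEMPLATES = {
    "P17": "Which country is {label} located in?",          # country
    "P403": "What body of water does {label} flow into?",   # mouth of the watercourse
    "P19": "Where was {label} born?",                       # place of birth
}

# Assumed phrases that signal the model declined to answer.
ABSTAIN_MARKERS = ("i don't know", "i do not know", "not sure", "unknown")


@dataclass(frozen=True)
class QAPair:
    entity_id: str
    property_id: str
    question: str
    answer: str


def generate_qa_pairs(entity: dict) -> list:
    """Turn single-valued claims with a known template into QA pairs."""
    pairs = []
    for pid, values in entity["claims"].items():
        template = PROPERTY_TEMPLATES.get(pid)
        # Skip properties without a template and multi-valued claims,
        # since the latter have no single gold answer.
        if template is None or len(values) != 1:
            continue
        question = template.format(label=entity["label"])
        pairs.append(QAPair(entity["id"], pid, question, values[0]))
    return pairs


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace for loose string matching."""
    return " ".join(text.lower().split())


def triage(gold: str, answer: str, retrieved_context: Optional[str] = None) -> str:
    """Heuristically label one model answer (assumed rules, not a standard)."""
    gold_n, answer_n = normalize(gold), normalize(answer)
    if gold_n in answer_n:
        return "correct"
    # Context was supplied but never contained the gold answer.
    if retrieved_context is not None and gold_n not in normalize(retrieved_context):
        return "retrieval_failure"
    # The model abstained or returned nothing.
    if not answer_n or any(marker in answer_n for marker in ABSTAIN_MARKERS):
        return "knowledge_gap"
    # The model committed to a wrong answer.
    return "hallucination"


# Placeholder entity: the ID, label, and values are made up for illustration.
example_entity = {
    "id": "Q_PLACEHOLDER",
    "label": "Example River",
    "claims": {
        "P17": ["Exampleland", "Samplestan"],  # multi-valued, skipped
        "P403": ["Example Sea"],
        "P31": ["river"],                      # no template, skipped
    },
}

pairs = generate_qa_pairs(example_entity)
for pair in pairs:
    print(pair.question, "->", triage(pair.answer, "It flows into the Other Ocean."))
```

In this sketch a wrong answer is counted as a retrieval failure only when context was supplied and lacked the gold answer; an abstention without such context counts as a knowledge gap, and any other committed wrong answer counts as hallucination. A real evaluation would likely need alias matching and a stricter grader than substring comparison.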

Tech: Python, Wikidata API, OpenAI API, FastAPI, factual QA generation, small-model evaluation