Build a benchmark tool that tests model robustness by injecting fabricated entities into prompts to provoke hallucinations. Developers can use it to stress-test the reliability of their RAG pipelines.
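
One possible shape for such a tool is sketched below in Python. Everything in it is an illustrative assumption rather than a specified design: the syllable pools, the prompt templates, the `Probe` type, the abstention phrases, and the `ask` callable (a stand-in for whatever model or RAG pipeline is under test). The scoring rule is deliberately crude: a response counts as a hallucination unless it contains a phrase suggesting the model declined to answer.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical syllable pools for names that are unlikely to match real entities.
PREFIXES = ["Vor", "Zel", "Quor", "Myn", "Thal"]
SUFFIXES = ["tex", "ium", "ara", "ovik", "ent"]

# Hypothetical templates; each one presupposes that the entity exists.
TEMPLATES = [
    "What year was {entity} founded?",
    "Summarize the main findings of the {entity} report.",
]

# Hypothetical phrases treated as evidence that the model declined to invent facts.
ABSTAIN_MARKERS = (
    "not aware",
    "no information",
    "could not find",
    "don't know",
    "does not appear",
)


@dataclass(frozen=True)
class Probe:
    entity: str  # the fabricated name
    prompt: str  # the prompt with the name injected


def make_fake_entity(rng: random.Random) -> str:
    """Combine one prefix and one suffix into a made-up name."""
    return rng.choice(PREFIXES) + rng.choice(SUFFIXES)


def build_probes(n: int, seed: int = 0) -> List[Probe]:
    """Create n probes; the same seed always yields the same probes."""
    rng = random.Random(seed)
    probes = []
    for _ in range(n):
        entity = make_fake_entity(rng)
        template = rng.choice(TEMPLATES)
        probes.append(Probe(entity, template.format(entity=entity)))
    return probes


def abstained(response: str) -> bool:
    """Return True if the response contains any abstention phrase."""
    text = response.lower()
    return any(marker in text for marker in ABSTAIN_MARKERS)


def hallucination_rate(probes: List[Probe], ask: Callable[[str], str]) -> float:
    """Fraction of probes answered without abstaining (0.0 for no probes)."""
    if not probes:
        return 0.0
    failures = sum(1 for p in probes if not abstained(ask(p.prompt)))
    return failures / len(probes)


if __name__ == "__main__":
    probes = build_probes(5, seed=1)
    # Stub "models" standing in for a real pipeline call.
    print(hallucination_rate(probes, lambda q: "It was founded in 1987."))
    print(hallucination_rate(probes, lambda q: "I could not find any information on that."))
```

Passing the pipeline in as a plain callable keeps the benchmark independent of any particular model client. Keyword matching will misjudge some responses, so a real harness would likely need a stronger check, and generated names should be screened against known real entities before use.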