1vuio0pswjnm7
View original ↗Build a CLI evaluation framework that performs 'mass-testing' of search-augmented systems for factual consistency. Focus on scalable benchmarks for enterprise RAG deployments.
Suggested repo: fact-check-suite
"Automate the hunt for RAG hallucinations at scale."
Estimated effort: 60h