Shoaib Sadiq Salehmohamed, Jinal Prashant Thakkar, Hansika Aredla, Shaik Mohammed Omar, Shalmali Ayachit
Create a library for extracting hallucination signals directly from a model's hidden states, without needing external verification. Because these activations are already computed during generation, this allows for near-zero-cost, runtime hallucination monitoring.
Suggested repo: honestState
"Know when your model is lying, without asking it to check again."
Estimated effort: 80h
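
One way such a signal could be extracted is with a small linear probe trained on hidden-state vectors. The sketch below is only an illustration under stated assumptions: the `LinearProbe` class, its `score` and `fit` methods, and the `fake_state` helper are hypothetical names rather than part of any existing library, and the "hidden states" are synthetic random vectors standing in for real model activations. The proposal itself does not specify which extraction method the library would use.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LinearProbe:
    """Hypothetical logistic-regression probe over one hidden-state vector."""

    def __init__(self, dim: int):
        self.w = [0.0] * dim
        self.b = 0.0

    def score(self, h):
        """Return a value between 0 and 1; higher means 'more likely hallucinated'."""
        logit = sum(wi * hi for wi, hi in zip(self.w, h)) + self.b
        return sigmoid(logit)

    def fit(self, states, labels, lr=0.1, epochs=200):
        """Plain stochastic gradient descent on the log loss."""
        for _ in range(epochs):
            for h, y in zip(states, labels):
                err = self.score(h) - y  # gradient of the log loss w.r.t. the logit
                self.w = [wi - lr * err * hi for wi, hi in zip(self.w, h)]
                self.b -= lr * err
        return self


# Synthetic stand-in for hidden states (assumption: real activations would be
# captured from a model's forward pass and labeled as faithful or hallucinated).
rng = random.Random(0)
dim = 8


def fake_state(label: int):
    """Gaussian noise, shifted along the first coordinate according to the label."""
    h = [rng.gauss(0.0, 0.5) for _ in range(dim)]
    h[0] += 1.5 if label else -1.5
    return h


labels = [i % 2 for i in range(200)]
states = [fake_state(y) for y in labels]

probe = LinearProbe(dim).fit(states, labels)
accuracy = sum(
    (probe.score(h) > 0.5) == bool(y) for h, y in zip(states, labels)
) / len(labels)
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

At inference time, scoring a hidden state with a probe like this takes one dot product per vector, which is why the added cost is small compared with a second verification pass. Whether a linear probe separates real hallucinated outputs from faithful ones is an empirical question that this synthetic example does not answer.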