Jon-Paul Cacioli
View original ↗Implement the Metacognitive Monitoring Battery as an open-source evaluation suite. This would provide the community with a standardized way to measure if their agents actually know when they are hallucinating.
Suggested repo: metacheck
"Does your agent know it's lying? Test its metacognitive awareness."
Estimated effort: 30h