Zhonghao Zhan, Huichi Zhou, Zhenhao Li, Peiyuan Jing, Krinos Li, Hamed Haddadi
View original ↗Create a red-teaming tool that simulates malicious tool outputs (e.g., fake APIs, poisoned tool responses) to stress-test agent trust. Developers have no easy way to build 'skeptical' agents.
Suggested repo: SkepticalAgent
"Your agent trusts its tools blindly; build in the necessary skepticism."
Estimated effort: 60h