Zihan Wang, Chi Gui, Xing Jin, Qineng Wang, Licheng Liu, Kangrui Wang, Shiqi Chen, Linjie Li, Zhengyuan Yang, Pingyue Zhang, Yiping Lu, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li
Build a stability monitor for agentic RL that detects "template-based" reasoning collapse. Developers should create an observability tool that tracks how strongly the model's reasoning depends on its input, so that this silent failure mode is caught rather than missed.
Suggested repo: reason-guard
"Detect when your agent stops reasoning and starts memorizing."
Estimated effort: 25h
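
One possible starting point for the input-dependency check described above is sketched below. It is not the authors' method. It assumes a simple proxy: if reasoning traces produced for different inputs share almost all of their tokens, the agent is probably reusing a template. The function names, the Jaccard-overlap metric, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def token_set(text: str) -> frozenset[str]:
    """Lowercased whitespace tokens of a reasoning trace."""
    return frozenset(text.lower().split())


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard overlap of two token sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cross_input_similarity(traces: dict[str, str]) -> float:
    """Mean pairwise Jaccard overlap between traces from different inputs.

    `traces` maps an input identifier to the reasoning text the agent
    produced for that input. Values near 1.0 suggest the reasoning no
    longer depends on the input.
    """
    sets = [token_set(t) for t in traces.values()]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def is_template_collapse(traces: dict[str, str], threshold: float = 0.8) -> bool:
    """Flag collapse when cross-input overlap reaches the (assumed) threshold."""
    return cross_input_similarity(traces) >= threshold


# Hypothetical traces: one healthy batch, one collapsed batch.
healthy = {
    "state_a": "door is left so move left",
    "state_b": "key is above so move up",
}
collapsed = {
    "state_a": "I will think carefully and then act",
    "state_b": "I will think carefully and then act",
}

print(is_template_collapse(healthy))
print(is_template_collapse(collapsed))
```

Token overlap is a crude signal. A real monitor would likely track this score over training steps and pair it with a stronger dependency measure, but the logging shape (per-input traces in, one scalar out) carries over.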