Yeping Jin, Jiaming Hu, Ioannis Ch. Paschalidis
Develop a plug-and-play training loop that integrates distributionally robust optimization (DRO) into standard RLHF workflows, making LLMs more resilient to prompt variations. An illustrative sketch of one possible integration appears below.
Suggested repo: robust-rlhf
"Train models that actually hold up under pressure; stop prompt sensitivity in its tracks."
Estimated effort: 30h
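
A minimal sketch of what a "plug-and-play" DRO step could look like, under explicit assumptions: it is not the authors' implementation, but a Group-DRO-style reweighting (exponential-ascent weights over prompt groups, so groups with the worst current loss are upweighted toward a worst-case objective). All names here (`RobustLossWrapper`, `group_ids`, `eta`) are hypothetical, and the per-sample RLHF loss (e.g., a PPO policy loss) is assumed to be computed elsewhere in the existing loop.

```python
import torch


class RobustLossWrapper:
    """Hypothetical wrapper adding group-DRO reweighting to an RLHF loss.

    Maintains exponential-ascent weights over prompt groups; groups whose
    mean loss is currently high get upweighted, approximating a worst-case
    (distributionally robust) objective over the group simplex.
    """

    def __init__(self, num_groups: int, eta: float = 0.1):
        self.eta = eta  # step size for the adversarial group weights
        self.log_weights = torch.zeros(num_groups)  # uniform initialization

    def __call__(self, per_sample_loss: torch.Tensor,
                 group_ids: torch.Tensor) -> torch.Tensor:
        device = per_sample_loss.device
        self.log_weights = self.log_weights.to(device)
        num_groups = self.log_weights.numel()

        # Mean loss per prompt group present in this batch.
        group_loss = torch.zeros(num_groups, device=device)
        group_count = torch.zeros(num_groups, device=device)
        group_loss.index_add_(0, group_ids, per_sample_loss.detach())
        group_count.index_add_(0, group_ids,
                               torch.ones_like(per_sample_loss))
        seen = group_count > 0
        group_loss[seen] = group_loss[seen] / group_count[seen]

        # Exponential-ascent update on the adversary's group weights.
        self.log_weights[seen] += self.eta * group_loss[seen]
        weights = torch.softmax(self.log_weights, dim=0)

        # Robust loss: weight each sample's loss by its group's weight
        # (weights carry no gradient; only the policy loss does).
        batch_w = weights[group_ids]
        return (batch_w * per_sample_loss).sum() / batch_w.sum()
```

Usage inside an existing RLHF step would be a one-line swap, e.g. `loss = dro(ppo_policy_loss_per_sample, prompt_group_ids)` followed by the usual `loss.backward()` and optimizer step, where `prompt_group_ids` tags each sampled response with its prompt's group (by template, domain, or paraphrase cluster).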