Yangyue Wang, Harshvardhan Sikka, Yash Mathur, Tony Zhou, Jinu Nyachhyon, Pranav Guruprasad
Build a GUI perturbation toolkit that automatically mutates web UI screenshots to test the robustness of multimodal grounding agents. Browser-automation agents need this robustness to stay reliable when a page's appearance changes.
Suggested repo: gui-stress
"Is your browser agent actually robust or just memorizing DOMs?"
Estimated effort: 50h
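
As a rough illustration of the idea, here is a minimal sketch of a perturb-and-check loop. It is not the authors' design: every name in it (`Box`, `shift`, `jitter_brightness`, `robustness`, the two toy agents) is hypothetical, the screenshot is modelled as nested lists of RGB tuples rather than a real image format, and the only mutations are a translation and a brightness change. It assumes the target box stays fully inside the image after shifting.

```python
import random
from dataclasses import dataclass

Pixel = tuple[int, int, int]
Image = list[list[Pixel]]  # rows of RGB pixels (stand-in for a real screenshot)


@dataclass(frozen=True)
class Box:
    """Ground-truth target region; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def shift(image: Image, box: Box, dx: int, dy: int, fill: Pixel = (255, 255, 255)):
    """Translate the screenshot by (dx, dy) and move the target box with it."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                out[ny][nx] = image[y][x]
    return out, Box(box.left + dx, box.top + dy, box.right + dx, box.bottom + dy)


def jitter_brightness(image: Image, delta: int) -> Image:
    """Add delta to every channel, clamped to the 0..255 range."""
    def clamp(v: int) -> int:
        return max(0, min(255, v + delta))
    return [[(clamp(r), clamp(g), clamp(b)) for r, g, b in row] for row in image]


def robustness(agent, image: Image, box: Box, trials: int = 20,
               max_shift: int = 5, max_delta: int = 30, seed: int = 0) -> float:
    """Fraction of mutated screenshots on which the agent's click hits the moved target."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        mutated, moved = shift(image, box, dx, dy)
        mutated = jitter_brightness(mutated, rng.randint(-max_delta, max_delta))
        x, y = agent(mutated)
        hits += moved.contains(x, y)
    return hits / trials


def demo_screenshot():
    """A 40x30 white page with one blue 'button' as the grounding target."""
    box = Box(10, 10, 20, 15)
    image = [[(0, 0, 255) if box.contains(x, y) else (255, 255, 255)
              for x in range(40)] for y in range(30)]
    return image, box


def memorizing_agent(image: Image):
    """Toy agent that ignores the pixels and always clicks a remembered spot."""
    return 15, 12


def looking_agent(image: Image):
    """Toy agent that clicks the centroid of the blue-ish pixels it sees."""
    pts = [(x, y) for y, row in enumerate(image)
           for x, (r, g, b) in enumerate(row) if b - r > 100]
    xs, ys = zip(*pts)
    return sum(xs) // len(xs), sum(ys) // len(ys)


if __name__ == "__main__":
    image, box = demo_screenshot()
    print("memorizing:", robustness(memorizing_agent, image, box))
    print("looking:   ", robustness(looking_agent, image, box))
```

In this toy setup the agent that reads the pixels keeps hitting the target under every mutation, while the fixed-coordinate agent misses whenever the button moves out from under its remembered click. A real toolkit would swap the nested lists for actual screenshots and the toy agents for a multimodal grounding model.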