Timothy B. Higgins, Antonios Mamalakis, Chirag Agarwal
Build an evaluation framework for multimodal models that processes chaotic scientific data (weather). This lets researchers verify the quality of model outputs on physics-heavy tasks.
Suggested repo: synoptic-bench
"Can your VLM predict the weather or just write about it?"
Estimated effort: 50h