Bo Yu, Cheng Yang, Dongyang Hou, Chengfu Liu, Jiayao Liu, Chi Wang, Zhiming Zhang, Haifeng Li, Wentao Yang
Build an execution-aware benchmark for GIS-augmented agents. Focus on multi-step geospatial workflows where dynamic runtime feedback is required.
Suggested repo: geoEval
"Move beyond text-only benchmarks; evaluate spatial reasoning agents in live GIS environments."
Estimated effort: 70h
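
To make "execution-aware" concrete, here is a minimal sketch of one way such a harness could look. It is not the authors' design. The toy tools (`buffer_box`, `box_area`), the `run_episode` loop, and the scripted agent are all hypothetical stand-ins: a real benchmark would call an actual GIS backend and an actual agent. The point illustrated is only that each tool call is executed, its result or runtime error is fed back to the agent, and scoring uses the executed outcome rather than the generated text.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical toy "GIS tools" on axis-aligned boxes (xmin, ymin, xmax, ymax).
# A real harness would wrap calls into a live GIS environment instead.
def buffer_box(box, distance):
    if distance < 0:
        raise ValueError("distance must be non-negative")
    xmin, ymin, xmax, ymax = box
    return (xmin - distance, ymin - distance, xmax + distance, ymax + distance)

def box_area(box):
    xmin, ymin, xmax, ymax = box
    return (xmax - xmin) * (ymax - ymin)

TOOLS = {"buffer_box": buffer_box, "box_area": box_area}

@dataclass
class StepResult:
    tool: str
    ok: bool
    output: Any = None
    error: str = ""

def run_episode(agent, task_input, expected, max_steps=5):
    """Execute the agent's tool calls one by one, feeding results back to it."""
    history = []
    state = task_input
    for _ in range(max_steps):
        action = agent(state, history)
        if action is None:  # agent signals that it is done
            break
        tool, kwargs = action
        try:
            state = TOOLS[tool](state, **kwargs)
            history.append(StepResult(tool, True, output=state))
        except Exception as exc:
            # Runtime feedback: the error is recorded and shown to the agent
            # on its next turn; the state is left unchanged.
            history.append(StepResult(tool, False, error=f"{type(exc).__name__}: {exc}"))
    return {
        "success": state == expected,
        "steps": len(history),
        "errors": sum(not s.ok for s in history),
        "final": state,
    }

def scripted_agent(state, history):
    """Stand-in for an LLM agent: fails once, then repairs using the feedback."""
    if not history:
        return ("buffer_box", {"distance": -1})  # deliberately invalid first call
    last = history[-1]
    if not last.ok:
        return ("buffer_box", {"distance": 1})   # retry after seeing the error
    if last.tool == "buffer_box":
        return ("box_area", {})
    return None

report = run_episode(scripted_agent, (0, 0, 2, 2), expected=16)
print(report)
```

In this sketch the report records the failed first call alongside the successful outcome, so a benchmark built this way could score both final correctness and how many repair attempts the agent needed.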