arXiv · 9h ago

Benchmark for Assessing Olfactory Perception of Large Language Models

Eftychia Makri, Nikolaos Nakis, Laura Sisson, Gigi Minsky, Leandros Tassiulas, Vahid Satarifard, Nicholas A. Christakis


Analysis

Viral velocity: low

Implementation gap: YES

Novelty: 8/10

Category: paper

Topics: reasoning, multimodal, evaluation

Opportunity Brief

Develop an evaluation suite of odor reasoning tasks that tests how well LLMs are grounded in sensory experience. Such a suite would make it possible to benchmark models on a human sense that is not natively textual.

Suggested repo: olfact-eval

"Can your LLM actually smell? A benchmark for olfactory reasoning."

Estimated effort: 20h
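
One way the evaluation suite described in the brief might start is as a small multiple-choice harness. The sketch below is only an illustration under stated assumptions: the `OdorItem` schema, the `score` function, the `ask_model` callback, and the two toy items are all hypothetical and are not taken from the paper or its benchmark. A real suite would replace `always_a` with a call to an actual model and load items from the published dataset.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class OdorItem:
    """One multiple-choice question (hypothetical schema, not the paper's)."""
    prompt: str
    choices: Sequence[str]
    answer: str  # expected to be one of `choices`


def format_question(item: OdorItem) -> str:
    """Render an item as a lettered multiple-choice prompt."""
    lines = [item.prompt]
    for i, choice in enumerate(item.choices):
        lines.append(f"{chr(ord('A') + i)}. {choice}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def score(items: Sequence[OdorItem], ask_model: Callable[[str], str]) -> float:
    """Return the accuracy of `ask_model` over `items` (0.0 for an empty list)."""
    if not items:
        return 0.0
    correct = 0
    for item in items:
        reply = ask_model(format_question(item)).strip().upper()
        letter = reply[:1]  # only the first character is read as the choice
        idx = ord(letter) - ord("A") if letter else -1
        if 0 <= idx < len(item.choices) and item.choices[idx] == item.answer:
            correct += 1
    return correct / len(items)


# Illustrative placeholder items, NOT drawn from the paper's benchmark.
TOY_ITEMS = [
    OdorItem("Which descriptor best matches the smell of vanillin?",
             ["vanilla", "fishy", "sulfurous"], "vanilla"),
    OdorItem("Which descriptor best matches the smell of limonene?",
             ["smoky", "citrus", "musty"], "citrus"),
]


def always_a(prompt: str) -> str:
    """Stand-in for a real model call: always picks option A."""
    return "A"


if __name__ == "__main__":
    print(f"accuracy = {score(TOY_ITEMS, always_a):.2f}")
```

Passing the model as a plain callable keeps the harness independent of any particular LLM client, so the same items can be scored against different models by swapping one function.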