Hongjian Zou, Yue Ge, Qi Ding, Yixuan Liao, Xiaoxin Chen
Create a dataset preparation tool that optimizes knowledge density by ranking multimodal samples by information complexity. This helps developers build smaller, more efficient multimodal large language models (MLLMs).
Suggested repo: densityScale
"Quality over quantity for multimodal scaling."
Estimated effort: 50h
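
A minimal sketch of the ranking step, under loud assumptions: the card does not define "information complexity", so this example swaps in zlib compression ratio as a crude stand-in (data that compresses poorly is treated as denser). The names `Sample`, `density_score`, `rank_by_density`, `text_weight`, and `keep_fraction` are hypothetical and not taken from the original.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Sample:
    """One image-text pair. Field names are hypothetical."""
    sample_id: str
    caption: str
    image_bytes: bytes  # raw (uncompressed) pixel bytes are assumed


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by raw size; higher means less redundancy."""
    if not data:
        return 0.0
    return len(zlib.compress(data, 9)) / len(data)


def density_score(sample: Sample, text_weight: float = 0.5) -> float:
    """Weighted mix of text and image compression ratios.

    This is an assumed proxy for information complexity, not the
    metric used by the original authors.
    """
    text_score = compressed_ratio(sample.caption.encode("utf-8"))
    image_score = compressed_ratio(sample.image_bytes)
    return text_weight * text_score + (1.0 - text_weight) * image_score


def rank_by_density(samples: list[Sample], keep_fraction: float = 0.5) -> list[Sample]:
    """Sort samples from densest to least dense and keep the top fraction."""
    ranked = sorted(samples, key=density_score, reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]
```

Compression ratio is only a rough heuristic: very short captions carry fixed zlib overhead, and already-compressed image formats such as JPEG will all score near 1.0. A real tool would likely replace `density_score` with a model-based measure while keeping the same rank-and-truncate structure.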