Wentao Hu, Yanbo Zhai, Xiaohui Hu, Mingkuan Zhao, Shanhong Yu, Xue Liu, Kaidong Yu, Shuangyong Song, Xuelong Li
Build a plugin for existing Mixture-of-Experts (MoE) frameworks that implements counterfactual routing, specifically to address experts that stay dormant on long-tail queries. The goal is to improve response quality in sparse models.
Suggested repo: wake-moe
"Prevent hallucinations by waking your dormant MoE experts."
Estimated effort: 50h
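
A possible starting point, sketched in plain Python. The card does not define counterfactual routing, so this assumes it means asking "what if a dormant expert had been selected instead?" for a given token: the weakest expert in the factual top-k is swapped for the best-scoring dormant expert. Every name here (`dormant_experts`, `counterfactual_route`, `min_share`) is hypothetical and not taken from any existing MoE framework.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over router logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def dormant_experts(
    routing_log: Sequence[Sequence[int]], num_experts: int, min_share: float = 0.05
) -> List[int]:
    """Experts whose share of routed slots falls below min_share.

    Assumption: "dormant" means rarely selected; the threshold is illustrative.
    """
    counts = [0] * num_experts
    for chosen in routing_log:
        for expert in chosen:
            counts[expert] += 1
    total = sum(counts) or 1  # avoid division by zero on an empty log
    return [e for e in range(num_experts) if counts[e] / total < min_share]


def counterfactual_route(
    logits: Sequence[float], k: int, dormant: Sequence[int]
) -> Tuple[List[int], List[int]]:
    """Return (factual, counterfactual) expert sets for one token.

    The counterfactual set replaces the weakest factual expert with the
    highest-probability dormant expert that was not already selected.
    """
    probs = softmax(logits)
    factual = top_k(probs, k)
    candidates = [e for e in dormant if e not in factual]
    if not candidates:
        return factual, list(factual)
    wake = max(candidates, key=lambda e: probs[e])
    return factual, factual[:-1] + [wake]


# Usage: four experts, top-2 routing, expert 3 never selected so far.
log = [[0, 1], [0, 1], [0, 2], [1, 0]]
sleepers = dormant_experts(log, num_experts=4)
factual, counterfactual = counterfactual_route([2.0, 1.0, 0.5, 0.1], 2, sleepers)
print(sleepers, factual, counterfactual)
```

A real plugin would presumably compare model outputs under the factual and counterfactual expert sets and decide whether to keep the woken expert; that comparison is left out here because the card does not specify it.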