Lin Mu, Haiyang Wang, Li Ni, Lei Sang, Zhize Wu, Peiquan Jin, Yiwen Zhang
Build a communication-aware MoE-LoRA framework that prevents expert dominance during training. The library should let developers fine-tune large LLMs efficiently on commodity hardware without expert collapse (a minimal sketch of the collapse-prevention idea follows below this card).
Suggested repo: talklora
"Stable MoE-LoRA training with communication-aware routing."
Estimated effort: 40h
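To make the expert-collapse part of the pitch concrete, below is a minimal sketch, not the paper's method or API: a single MoE-LoRA linear layer whose router picks top-k LoRA experts per token and emits a Switch-Transformer-style load-balancing auxiliary loss that discourages any one expert from dominating. The communication-aware routing piece is not modeled here, and all names (`MoELoRALinear`, `num_experts`, `top_k`, `aux_loss`) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoELoRALinear(nn.Module):
    """Frozen base linear layer augmented with a mixture of LoRA experts.

    A router assigns each token to its top-k experts; a Switch-style
    auxiliary load-balancing loss discourages expert dominance (collapse).
    """

    def __init__(self, in_features, out_features, num_experts=4, rank=8,
                 top_k=2, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():          # base weights stay frozen
            p.requires_grad_(False)
        self.router = nn.Linear(in_features, num_experts)
        # Low-rank expert factors: A projects down, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(num_experts, in_features, rank) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(num_experts, rank, out_features))
        self.num_experts = num_experts
        self.top_k = top_k
        self.scaling = alpha / rank

    def forward(self, x):
        # x: (batch, seq, in_features) -> flatten to a stream of tokens
        tokens = x.reshape(-1, x.shape[-1])
        probs = F.softmax(self.router(tokens), dim=-1)      # (T, E)
        topk_p, topk_i = probs.topk(self.top_k, dim=-1)     # (T, k)
        topk_p = topk_p / topk_p.sum(dim=-1, keepdim=True)  # renormalize gates

        # LoRA update per token, mixed over its selected experts.
        delta = torch.zeros(tokens.shape[0], self.lora_B.shape[-1],
                            device=tokens.device, dtype=tokens.dtype)
        for slot in range(self.top_k):
            idx = topk_i[:, slot]                            # (T,)
            a = self.lora_A[idx]                             # (T, in, r)
            b = self.lora_B[idx]                             # (T, r, out)
            h = torch.bmm(tokens.unsqueeze(1), a)            # (T, 1, r)
            delta += topk_p[:, slot:slot + 1] * torch.bmm(h, b).squeeze(1)

        out = self.base(tokens) + self.scaling * delta

        # Balance loss: fraction of tokens routed to each expert times
        # that expert's mean router probability (Switch-Transformer recipe).
        with torch.no_grad():
            counts = torch.zeros(self.num_experts, device=tokens.device)
            counts.scatter_add_(0, topk_i.reshape(-1),
                                torch.ones_like(topk_i.reshape(-1),
                                                dtype=counts.dtype))
            load = counts / topk_i.numel()
        importance = probs.mean(dim=0)
        aux_loss = self.num_experts * torch.sum(load * importance)

        return out.reshape(*x.shape[:-1], -1), aux_loss
```

In training, the per-layer `aux_loss` terms would be summed and added to the task loss with a small coefficient (e.g. 0.01); this is the standard recipe for keeping routing balanced, and the framework described above would layer its communication-aware routing logic on top of a structure like this.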