Zhengqing Yuan, Hanchi Sun, Lichao Sun, Yanfang Ye
Implement the MegaTrain framework to enable training large models on consumer-grade hardware. This democratizes full-precision training of giant models.
Suggested repo: mega-trainer
"Train 100B+ parameter models on a single GPU using host-memory streaming."
Estimated effort: 200h
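
To make the "host-memory streaming" tagline concrete, the following back-of-envelope sketch estimates how much state would sit in host memory versus on the GPU when only a few layers are resident at a time. Everything here is an illustrative assumption rather than MegaTrain's actual design: "full precision" is taken to mean 32-bit floats, the optimizer is assumed to be Adam-style (weights, gradients, and two moment buffers per parameter), layers are assumed equal in size, and activations are ignored. The names `streaming_footprint`, `states_per_param`, and `resident_layers` are hypothetical.

```python
from dataclasses import dataclass

BYTES_FP32 = 4  # assumption: "full precision" means 32-bit floats


@dataclass
class Footprint:
    host_gb: float    # training state kept in host (CPU) memory, in GB
    device_gb: float  # peak weight bytes resident on the GPU, in GB


def streaming_footprint(
    n_params: int,
    n_layers: int,
    states_per_param: int = 4,  # assumption: weights + grads + 2 Adam moments
    resident_layers: int = 2,   # assumption: one layer computing, one prefetching
) -> Footprint:
    """Rough memory estimate for layer-by-layer streaming (activations ignored)."""
    # All training state lives in host memory.
    total_bytes = n_params * states_per_param * BYTES_FP32
    # Only the weights of the currently resident layers occupy GPU memory.
    per_layer_weight_bytes = n_params * BYTES_FP32 / n_layers
    device_bytes = per_layer_weight_bytes * resident_layers
    return Footprint(host_gb=total_bytes / 1e9, device_gb=device_bytes / 1e9)


# Hypothetical 100B-parameter model split into 100 equal layers.
fp = streaming_footprint(n_params=100_000_000_000, n_layers=100)
print(f"host: {fp.host_gb:.0f} GB, device (weights only): {fp.device_gb:.0f} GB")
```

Under these assumptions the full training state is about 1600 GB, far beyond any single GPU, while the weights resident on the device at any moment come to about 8 GB. That gap is the motivation for streaming from host memory; a real implementation would also need to budget for activations, gradients in flight, and transfer bandwidth.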