Vladimer Khasia
Implement the 'Ghost Backpropagation' technique to enable training with very long sequence lengths on consumer hardware. This is a high-value performance optimization for long-context models.
Suggested repo: ghost-torch
"Scale your models 10x with zero memory penalty."
Estimated effort: 150h
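
The listing does not spell out how 'Ghost Backpropagation' actually works, so the sketch below should not be read as its implementation. Instead, it illustrates the closest standard technique for the stated goal (long-sequence training in limited memory): activation (gradient) checkpointing in PyTorch, which recomputes intermediate activations during the backward pass rather than storing them. All names here (`CheckpointedEncoder`, the layer sizes, the sequence length) are hypothetical and not taken from the source.

```python
# Hedged sketch: activation checkpointing as one standard way to trade
# extra compute for lower activation memory on long sequences.
# This is NOT the 'Ghost Backpropagation' algorithm from the listing;
# the task source does not describe that technique.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedEncoder(nn.Module):
    """Transformer encoder stack whose per-layer activations are recomputed
    on the backward pass instead of being kept in memory."""

    def __init__(self, d_model: int = 512, n_heads: int = 8, n_layers: int = 12):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
             for _ in range(n_layers)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            # Only the layer input is stashed; the activations inside the
            # layer are rebuilt during backward, cutting peak memory at the
            # cost of a second forward pass per layer.
            x = checkpoint(layer, x, use_reentrant=False)
        return x


if __name__ == "__main__":
    model = CheckpointedEncoder()
    tokens = torch.randn(1, 4096, 512, requires_grad=True)  # long sequence
    loss = model(tokens).mean()
    loss.backward()  # activations are recomputed layer by layer here
    print(tokens.grad.shape)
```

Whatever the final technique looks like, a repo like the suggested ghost-torch would likely expose a similar drop-in wrapper API, so this checkpointing baseline is also a useful point of comparison for measuring any claimed memory savings.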