Reza Sedghi, Robin Schiewer, Anand Subramoney, David Kappel
Develop a lightweight, modular library for tree-structured routing in transformer MLP blocks. The implementation should let users swap standard dense layers for sparse, tree-routed counterparts to reduce compute during inference.
Suggested repo: tree-mlp
"Conditional computation for transformers without the router overhead."
Estimated effort: 40h
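
To make the idea of a tree-routed MLP layer concrete, here is a minimal sketch in plain Python. The brief does not specify the routing rule, so this assumes one possible design: hidden units are laid out as a complete binary tree, and the sign of each unit's own pre-activation picks the next child, so no separate router network is needed. The class name `TreeMLP`, its parameters, and the `tanh` activation are all hypothetical choices for illustration, not the authors' method or an existing API.

```python
import math
import random


class TreeMLP:
    """Hypothetical tree-routed MLP layer (illustrative assumption only).

    Hidden units form a complete binary tree stored in heap order.
    A forward pass evaluates one root-to-leaf path, i.e. `depth` units
    out of 2**depth - 1.
    """

    def __init__(self, d_model, depth, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.depth = depth
        self.n_nodes = 2 ** depth - 1
        scale = 1.0 / math.sqrt(d_model)
        # One input vector and one output vector per hidden unit (tree node).
        self.w_in = [[rng.gauss(0.0, scale) for _ in range(d_model)]
                     for _ in range(self.n_nodes)]
        self.w_out = [[rng.gauss(0.0, scale) for _ in range(d_model)]
                      for _ in range(self.n_nodes)]

    def forward(self, x):
        out = [0.0] * self.d_model
        path = []
        node = 0  # start at the root
        for _ in range(self.depth):
            path.append(node)
            pre = sum(w * xi for w, xi in zip(self.w_in[node], x))
            act = math.tanh(pre)  # assumed activation; any nonlinearity works
            for j in range(self.d_model):
                out[j] += act * self.w_out[node][j]
            # Assumed routing rule: the unit's own sign selects the child
            # (left child is 2*node + 1, right child is 2*node + 2).
            node = 2 * node + (2 if pre > 0.0 else 1)
        return out, path


layer = TreeMLP(d_model=8, depth=4)
y, path = layer.forward([0.5, -1.0, 0.25, 0.0, 1.5, -0.75, 0.1, 2.0])
print(len(path), "of", layer.n_nodes, "hidden units evaluated")
```

Under these assumptions the per-token cost grows with the tree depth rather than with the total number of hidden units. A real library would batch this over tokens and use tensor operations instead of Python lists.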