Longsheng Zhou, Yu Shen
Develop an end-to-end Python library that executes the Prune-Quantize-Distill pipeline in an order-agnostic, automated way. Such a tool would help developers shrink large models for CPU deployment.
Suggested repo: pqd-flow
"Optimize your models for the CPU: A unified pipeline for compression that actually improves wall-clock time."
Estimated effort: 50h
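
A minimal sketch of what the order-agnostic core of such a library might look like, under heavy simplifying assumptions: models are reduced to flat lists of float weights, and every name here (`prune`, `quantize`, `make_distill`, `run_pipeline`, `run_all_orders`) is hypothetical rather than an existing pqd-flow API. The `distill` stage in particular is only a placeholder that nudges surviving weights toward a teacher's weights; real knowledge distillation requires a training loop.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence, Tuple

Weights = List[float]
Stage = Callable[[Weights], Weights]


def prune(weights: Weights, sparsity: float = 0.5) -> Weights:
    """Magnitude pruning: zero the `sparsity` fraction of weights with smallest |w|."""
    k = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    drop = set(smallest)
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize(weights: Weights, bits: int = 8) -> Weights:
    """Symmetric uniform fake quantization: snap values to a grid, keep them as floats."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / (2 ** (bits - 1) - 1)
    return [round(w / scale) * scale for w in weights]


def make_distill(teacher: Weights, alpha: float = 0.5) -> Stage:
    """Placeholder for distillation (an assumption, not real training):
    move each non-zero student weight a fraction `alpha` toward the teacher's,
    leaving pruned (zero) weights untouched."""
    def distill(weights: Weights) -> Weights:
        return [w if w == 0.0 else w + alpha * (t - w)
                for w, t in zip(weights, teacher)]
    return distill


def run_pipeline(weights: Weights, stages: Dict[str, Stage],
                 order: Sequence[str]) -> Weights:
    """Apply the named stages in the given order without mutating the input."""
    out = list(weights)
    for name in order:
        out = stages[name](out)
    return out


def run_all_orders(weights: Weights,
                   stages: Dict[str, Stage]) -> Dict[Tuple[str, ...], Weights]:
    """Run every permutation of the stages so the orders can be compared."""
    return {order: run_pipeline(weights, stages, order)
            for order in permutations(stages)}


if __name__ == "__main__":
    student = [0.1, -0.8, 0.05, 0.4]
    teacher = [0.2, -1.0, 0.1, 0.5]
    stages = {"prune": prune, "quantize": quantize,
              "distill": make_distill(teacher)}
    for order, result in run_all_orders(student, stages).items():
        print(" -> ".join(order), result)
```

Keeping each stage as a plain function from weights to weights is what makes the order interchangeable. A fuller version would presumably score each ordering by measured CPU latency and accuracy rather than just printing the resulting weights.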