Build a universal quantization benchmark that tests llama.cpp across different backend settings and hardware. Automate comparison reports for common model architectures.
Suggested repo: quant-bench
"The ultimate benchmark tool for your localized LLM stack."
Estimated effort: 25h
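The core loop of such a bench could be sketched as follows. This is a minimal sketch, not the project's implementation: it assumes llama.cpp's `llama-bench` CLI is on the `PATH`, and the settings matrix, the JSON record keys (`model`, `ngl`, `tps`), and the `make_report` helper are all hypothetical placeholders.

```python
import json
import subprocess
from pathlib import Path

# Hypothetical settings matrix; real runs would sweep more llama.cpp
# parameters (threads, batch size, flash attention, etc.).
QUANT_MODELS = [Path("model-Q4_K_M.gguf"), Path("model-Q8_0.gguf")]
GPU_LAYER_SETTINGS = [0, 99]  # CPU-only vs. full GPU offload

def run_bench(model: Path, ngl: int) -> list[dict]:
    """Invoke llama-bench (from llama.cpp) and parse its JSON output."""
    proc = subprocess.run(
        ["llama-bench", "-m", str(model), "-ngl", str(ngl), "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)

def make_report(rows: list[dict]) -> str:
    """Render collected results as a markdown comparison table.

    Each row is assumed to carry 'model', 'ngl', and 'tps' keys,
    normalized from the raw llama-bench records by the caller.
    """
    header = "| model | ngl | tokens/s |\n|---|---|---|"
    body = [f"| {r['model']} | {r['ngl']} | {r['tps']:.1f} |" for r in rows]
    return "\n".join([header, *body])

if __name__ == "__main__":
    results: list[dict] = []
    for model in QUANT_MODELS:
        for ngl in GPU_LAYER_SETTINGS:
            results.extend(run_bench(model, ngl))
    print(make_report(results))
```

Keeping the report generation separate from the subprocess calls makes the table formatter testable without any hardware, which matters for a tool meant to run across many machines.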