Aryaman Arora, Zhengxuan Wu, Jacob Steinhardt, Sarah Schwettmann
Build an automated interpreter for LLM attribution graphs. This moves circuit tracing from manual inspection to an automated pipeline, letting researchers quickly find out *why* a model generated a specific output.
Suggested repo: trace-viz
"Explain LLM computations, automatically."
Estimated effort: 100h
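To make the idea concrete, here is a minimal sketch of one stage such a pipeline might contain. Everything here is hypothetical: the toy graph format (labeled feature nodes with weighted attribution edges), the path-scoring rule (product of edge weights), and the Dallas/Austin example are illustrative stand-ins, not the actual attribution-graph format or interpretation method.

```python
from dataclasses import dataclass

# Hypothetical toy attribution graph: nodes are labeled features,
# weighted directed edges are attribution scores between them.
@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    weight: float

def top_paths(edges, source, sink, k=3):
    """Enumerate all source->sink paths and rank them by the product
    of edge attribution weights (a stand-in for path influence)."""
    adj = {}
    for e in edges:
        adj.setdefault(e.src, []).append(e)
    paths = []
    def dfs(node, path, score):
        if node == sink:
            paths.append((score, path))
            return
        for e in adj.get(node, []):
            dfs(e.dst, path + [e.dst], score * e.weight)
    dfs(source, [source], 1.0)
    return sorted(paths, reverse=True)[:k]

def explain(edges, source, sink):
    """Turn the top-ranked path into a one-line explanation --
    the kind of output an auto-interpreter stage would emit."""
    (score, path), *_ = top_paths(edges, source, sink, k=1)
    return f"{sink} is driven mainly via {' -> '.join(path)} (score {score:.2f})"

# Illustrative example (feature labels are invented):
edges = [
    Edge("prompt:'Dallas'", "feat:Texas", 0.9),
    Edge("feat:Texas", "feat:state-capital", 0.8),
    Edge("prompt:'Dallas'", "feat:city", 0.4),
    Edge("feat:city", "feat:state-capital", 0.3),
    Edge("feat:state-capital", "out:'Austin'", 0.95),
]
print(explain(edges, "prompt:'Dallas'", "out:'Austin'"))
```

In a real pipeline, the ranked paths would be handed to an LLM that labels each feature and summarizes the dominant paths in natural language; the sketch above only covers the graph-side extraction and ranking.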