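Build an open-source controller that runs disaggregated LLM inference on standard Kubernetes clusters, splitting the prefill and decode phases into separately scheduled and scaled workloads. Current tools usually treat inference as a monolith, which wastes resources: prefill is largely compute-bound while decode is largely memory-bandwidth-bound, so a single deployment sized for one phase tends to over-provision the other.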
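As a rough illustration of what handling the two phases separately could mean for such a controller, the sketch below sizes a prefill pool and a decode pool independently. It borrows the proportional rule used by the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current × observed / target)). Everything else is an assumption made for the example, not part of the proposal: the `PoolState` schema, the `reconcile` function, the load signals (queued prompt tokens per pod for prefill, active sequences per pod for decode), and the replica bounds.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolState:
    """Observed state of one worker pool (hypothetical schema)."""
    replicas: int            # pods currently running in the pool
    load: float              # observed per-pod load signal
    target_load: float       # per-pod load the pool should sit at
    min_replicas: int = 1    # assumed lower bound
    max_replicas: int = 64   # assumed upper bound


def desired_replicas(pool: PoolState) -> int:
    """Proportional scaling: ceil(replicas * load / target), clamped to bounds."""
    if pool.target_load <= 0:
        raise ValueError("target_load must be positive")
    raw = math.ceil(pool.replicas * pool.load / pool.target_load)
    return max(pool.min_replicas, min(pool.max_replicas, raw))


def reconcile(prefill: PoolState, decode: PoolState) -> dict[str, int]:
    """Size each phase on its own signal instead of scaling one monolith."""
    return {
        "prefill": desired_replicas(prefill),
        "decode": desired_replicas(decode),
    }


if __name__ == "__main__":
    # Assumed signals: queued prompt tokens per pod for prefill,
    # active sequences per pod for decode.
    prefill = PoolState(replicas=4, load=12000.0, target_load=8000.0)
    decode = PoolState(replicas=4, load=16.0, target_load=32.0)
    print(reconcile(prefill, decode))  # {'prefill': 6, 'decode': 2}
```

In this made-up scenario the prefill pool is overloaded and grows from 4 to 6 pods while the decode pool is underused and shrinks from 4 to 2. A monolithic deployment, which has only one replica count to adjust, cannot make both moves at once. A real controller would also need request routing and KV-cache transfer between the pools, which this sketch leaves out.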