Data mixture optimization is the new frontier for midtraining multimodal LLMs. Develop a library that translates benchmark targets into specific training data distributions.
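
As one possible starting point, the core mapping from benchmark targets to a data distribution can be sketched as a weighted vote. This is only an illustration under stated assumptions, not a prescribed design. It assumes each benchmark target is expressed as a non-negative improvement gap, and that a hypothetical "affinity" table estimates how much each data source helps each benchmark. The function name `targets_to_mixture`, the benchmark names, the data source names, and all numbers are made up for the example.

```python
from typing import Dict


def targets_to_mixture(
    target_gaps: Dict[str, float],
    affinity: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Turn per-benchmark improvement gaps into normalized data-source weights.

    target_gaps: benchmark name -> desired improvement (assumed non-negative).
    affinity:    benchmark name -> {data source -> assumed helpfulness score}.
    Returns a dict of sampling weights over data sources that sums to 1.
    """
    scores: Dict[str, float] = {}
    for benchmark, gap in target_gaps.items():
        if gap < 0:
            raise ValueError(f"gap for {benchmark!r} must be non-negative")
        # Each benchmark votes for data sources in proportion to its gap.
        for source, helpfulness in affinity[benchmark].items():
            scores[source] = scores.get(source, 0.0) + gap * helpfulness

    total = sum(scores.values())
    if total <= 0:
        raise ValueError("no data source has a positive score")
    # Normalize so the weights form a probability distribution.
    return {source: score / total for source, score in scores.items()}


if __name__ == "__main__":
    # Hypothetical benchmarks and data sources, for illustration only.
    gaps = {"doc_vqa": 3.0, "chart_qa": 1.0}
    table = {
        "doc_vqa": {"ocr_docs": 1.0, "charts": 0.0},
        "chart_qa": {"ocr_docs": 0.5, "charts": 0.5},
    }
    print(targets_to_mixture(gaps, table))
    # prints {'ocr_docs': 0.875, 'charts': 0.125}
```

A linear vote like this ignores interactions between data sources and diminishing returns. A fuller library would likely estimate the affinity table from small proxy training runs and replace the normalization step with a proper optimizer.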