Create an inference-time scheduler that dynamically allocates token budgets based on task complexity. This is the key to solving the 'overthinking' problem in modern chain-of-thought models.