Create a 'token budget' middleware that monitors usage across multiple providers and alerts users before quotas hit. It should enable seamless switching to local models when limits are nearing.