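Develop a middleware that dynamically adapts how a local LLM's weights are loaded (for example, by choosing a lower-precision weight format when memory is tight) based on the RAM available at run time. The goal is to let users run large models on memory-constrained hardware.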
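One possible core of such a middleware is a function that maps available memory to a weight precision. The sketch below is a minimal illustration, not a definitive design: the names `available_ram_bytes` and `choose_precision`, the `BYTES_PER_PARAM` table, and the 20% headroom default are all assumptions introduced here, and the byte counts ignore per-format overhead as well as memory for activations and the KV cache.

```python
import os
from typing import Optional

# Approximate bytes per parameter for common weight precisions (assumed values).
# Real formats add overhead for scales and metadata, so treat these as lower bounds.
BYTES_PER_PARAM = {
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def available_ram_bytes() -> Optional[int]:
    """Return available physical memory in bytes, or None if it cannot be read.

    Uses os.sysconf, which works on Linux; other platforms may not expose
    SC_AVPHYS_PAGES, in which case None is returned.
    """
    try:
        pages = os.sysconf("SC_AVPHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        return None
    if pages < 0 or page_size < 0:
        return None
    return pages * page_size


def choose_precision(
    param_count: int, ram_bytes: int, headroom: float = 0.2
) -> Optional[str]:
    """Pick the highest precision whose weights fit in RAM after reserving headroom.

    Returns None when even the smallest listed precision does not fit.
    """
    budget = ram_bytes * (1.0 - headroom)
    # Try precisions from largest to smallest footprint.
    candidates = sorted(BYTES_PER_PARAM.items(), key=lambda kv: kv[1], reverse=True)
    for name, bytes_per_param in candidates:
        if param_count * bytes_per_param <= budget:
            return name
    return None


if __name__ == "__main__":
    ram = available_ram_bytes()
    if ram is None:
        print("Could not determine available RAM on this platform.")
    else:
        # 7 billion parameters is an arbitrary example size.
        print(choose_precision(7_000_000_000, ram))
```

A real middleware would call something like `choose_precision` before loading a model and pass the result to whatever loader the inference runtime provides. Keeping the decision in a pure function that takes `ram_bytes` as an argument makes the policy easy to test without depending on the host machine.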