Create open-source middleware that translates standard vision-language model (VLM) outputs into motor control signals for budget hardware. Focus on simplifying the stack so that anyone can build a vision-capable home robot.
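As a rough illustration of the mapping step, here is a minimal sketch in Python. It assumes, purely for illustration, that the model has been prompted to reply with a small JSON object such as {"action": "forward", "speed": 0.5} and that the robot uses a two-wheel differential drive. The names MotorCommand, ACTION_TABLE, and parse_vlm_output, the action vocabulary, and the default speed are all hypothetical and not part of any existing library.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MotorCommand:
    """Hypothetical normalized command for a two-wheel differential drive."""
    left: float   # left wheel duty cycle in [-1.0, 1.0]
    right: float  # right wheel duty cycle in [-1.0, 1.0]


STOP = MotorCommand(0.0, 0.0)

# Assumed action vocabulary: (left direction, right direction) per action.
ACTION_TABLE = {
    "forward": (1.0, 1.0),
    "backward": (-1.0, -1.0),
    "left": (-1.0, 1.0),    # spin in place, counter-clockwise
    "right": (1.0, -1.0),   # spin in place, clockwise
    "stop": (0.0, 0.0),
}


def parse_vlm_output(text: str) -> MotorCommand:
    """Map a model's text reply to a motor command, stopping on anything unexpected."""
    try:
        data = json.loads(text)
        left_dir, right_dir = ACTION_TABLE[data["action"]]
        speed = float(data.get("speed", 0.5))  # assumed default speed
    except (ValueError, KeyError, TypeError, AttributeError):
        # Malformed JSON, unknown action, or wrong types: fail safe.
        return STOP
    speed = max(0.0, min(1.0, speed))  # clamp to the valid duty-cycle range
    return MotorCommand(left_dir * speed, right_dir * speed)


if __name__ == "__main__":
    print(parse_vlm_output('{"action": "forward", "speed": 0.5}'))
    print(parse_vlm_output("I think the robot should go forward."))
```

The sketch returns a stop command for any reply it cannot parse, since free-form model text is unreliable and a stationary robot is the safer default. Turning the normalized duty cycles into actual PWM or serial signals depends on the chosen motor driver and is left out here.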