Create an open-source toolchain that automates generating videos with a LEGO-style aesthetic from text prompts, using Stable Video Diffusion or a similar model. Focus on consistent character generation so that each character keeps the same appearance across clips.
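
As a starting point for the consistency requirement, here is a minimal sketch of the planning step only. It does not call any model. It assumes that reusing one fixed character description and one fixed random seed for every clip helps keep a character recognizable, which is a common heuristic and not a guarantee. The names `Character`, `ClipSpec`, `plan_clips`, and `STYLE_SUFFIX` are hypothetical and chosen for illustration. Note also that the publicly released Stable Video Diffusion checkpoints are image-to-video, so a text-driven pipeline would likely need a text-to-image step first; the planned prompts below would feed that step.

```python
from dataclasses import dataclass
import hashlib

# Hypothetical style string appended to every prompt (an assumption, tune freely).
STYLE_SUFFIX = "LEGO-style minifigure, plastic brick world, studio lighting"


@dataclass(frozen=True)
class Character:
    """A fixed description that is reused verbatim in every clip prompt."""
    name: str
    description: str

    def seed(self) -> int:
        # Derive a stable 32-bit seed from the name so reruns are reproducible.
        digest = hashlib.sha256(self.name.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")


@dataclass(frozen=True)
class ClipSpec:
    """Everything a (hypothetical) generation backend would need for one clip."""
    prompt: str
    seed: int


def plan_clips(character: Character, scenes: list[str]) -> list[ClipSpec]:
    """Build one ClipSpec per scene, sharing the character text and seed."""
    shared_seed = character.seed()
    return [
        ClipSpec(
            prompt=f"{character.description}, {scene}, {STYLE_SUFFIX}",
            seed=shared_seed,
        )
        for scene in scenes
    ]


if __name__ == "__main__":
    hero = Character("Captain Brick", "a minifigure pilot with a red helmet and blue jacket")
    for clip in plan_clips(hero, ["walking through a brick city", "flying a small plane"]):
        print(clip.seed, clip.prompt)
```

Keeping the plan separate from generation means the model backend can be swapped without touching the character definitions. Stronger consistency techniques, such as conditioning every clip on a shared reference image, would plug in at the backend stage.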