Build an open-source schema-mapping tool that uses LLMs to normalize disparate datasets for retrieval-augmented generation (RAG) applications. Focus on a CLI tool that automatically generates Python pipelines from unstructured data.
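
As a rough illustration of the idea, the sketch below shows one way the pieces might fit together. It assumes a sample record has already been parsed into JSON, and it replaces the LLM call with a hypothetical `propose_mapping` stub that matches field names against a small synonym table. Every name here (`TARGET_SCHEMA`, `SYNONYMS`, `propose_mapping`, `render_pipeline`, the `--sample` flag) is an assumption made for illustration, not part of any existing tool.

```python
import argparse
import json

# Hypothetical target schema for RAG documents (an assumption for this sketch).
TARGET_SCHEMA = ["id", "title", "body"]

# Hypothetical synonym table standing in for a real LLM call.
SYNONYMS = {
    "id": {"id", "doc_id", "uuid"},
    "title": {"title", "headline", "name"},
    "body": {"body", "text", "content"},
}


def propose_mapping(source_fields):
    """Stub for the LLM step: map source field names onto TARGET_SCHEMA."""
    mapping = {}
    for target in TARGET_SCHEMA:
        for field in source_fields:
            if field.lower() in SYNONYMS[target]:
                mapping[target] = field
                break
    return mapping


def render_pipeline(mapping):
    """Generate Python source for a normalize(record) function."""
    lines = ["def normalize(record):", "    return {"]
    for target in TARGET_SCHEMA:
        source = mapping.get(target)
        # Unmapped target fields are emitted as None in the generated code.
        value = f"record.get({source!r})" if source else "None"
        lines.append(f"        {target!r}: {value},")
    lines.append("    }")
    return "\n".join(lines) + "\n"


def main(argv=None):
    """CLI entry point: read one sample record, print the generated pipeline."""
    parser = argparse.ArgumentParser(description="Sketch of a schema-mapping CLI")
    parser.add_argument("--sample", required=True, help="JSON file with one sample record")
    args = parser.parse_args(argv)
    with open(args.sample, encoding="utf-8") as fh:
        sample = json.load(fh)
    print(render_pipeline(propose_mapping(sample.keys())), end="")
```

Calling `main(["--sample", "record.json"])` on a file containing a single JSON object would print the source of a `normalize(record)` function. In a real tool, `propose_mapping` would presumably be replaced by a model call whose output is validated before any code is generated.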