Extend LiteParse with automatic metadata extraction (author, date, document type) so that RAG systems can filter and attribute retrieved chunks by these fields, improving retrieval quality. Focus on making it a drop-in 'local-first' alternative to expensive cloud parsing APIs.
Suggested repo: nanoParse
"Privacy-first PDF parsing: no cloud, no tokens, just text."
Estimated effort: 45h
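
To make the metadata-extraction idea concrete, here is a minimal sketch of what a purely local, heuristic extractor could look like. It is an illustration under stated assumptions, not LiteParse's actual API: the names `DocMetadata`, `extract_metadata`, and the keyword table are hypothetical, the input is assumed to be plain text already produced by the parser, and only ISO-style dates (YYYY-MM-DD) are recognized.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DocMetadata:
    """Hypothetical container for the three fields named in the proposal."""
    author: Optional[str]
    date: Optional[str]  # ISO 8601 string (YYYY-MM-DD), or None if not found
    doc_type: str        # one of the keys of _TYPE_KEYWORDS, or "unknown"


# Assumption: authors appear on their own line as "Author: X", "Authors: X" or "By X".
_AUTHOR_RE = re.compile(
    r"^[ \t]*(?:authors?[ \t]*:|by[ \t]+)[ \t]*(.+?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

# Assumption: only ISO-style dates are handled in this sketch.
_DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

# Assumption: a tiny illustrative keyword table; a real one would be larger.
_TYPE_KEYWORDS = {
    "invoice": ("invoice", "amount due", "bill to"),
    "paper": ("abstract", "references", "introduction"),
    "contract": ("agreement", "hereinafter", "party"),
}


def extract_metadata(text: str) -> DocMetadata:
    """Guess author, date and document type from already-parsed text."""
    head = text[:2000]  # author and date usually sit near the top

    author_match = _AUTHOR_RE.search(head)
    author = author_match.group(1) if author_match else None

    found_date = None
    for candidate in _DATE_RE.findall(head):
        try:
            # Rejects impossible dates such as 2024-13-40.
            found_date = date.fromisoformat(candidate).isoformat()
            break
        except ValueError:
            continue

    # Score each document type by how often its keywords occur.
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(keyword) for keyword in keywords)
        for doc_type, keywords in _TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    doc_type = best if scores[best] > 0 else "unknown"

    return DocMetadata(author=author, date=found_date, doc_type=doc_type)


sample = "Quarterly Study\nAuthor: Jane Doe\nDate: 2024-03-15\n\nAbstract\nWe study ...\nReferences\n"
print(extract_metadata(sample))
```

Everything here runs offline with the standard library only, which matches the local-first goal. The regex and keyword heuristics are deliberately simple and will misfire on some inputs (for example, a line starting with "By the way"); embedded PDF metadata, where present, would be a more reliable first source.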