Build a streamlined OCR-to-Markdown engine tuned for document structure extraction (headings, paragraphs, lists). It should run locally on commodity hardware, with no heavy GPU requirement.
Suggested repo: doc-flow
"Turn messy PDF layouts into structured data, instantly."
Estimated effort: 70h
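
To make the Markdown end of the engine concrete, here is a minimal sketch of the final emission step. It assumes an upstream OCR and layout stage has already produced typed text blocks. The `Block` type, its `kind` values, and `blocks_to_markdown` are hypothetical names chosen for illustration, not part of any existing library.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical output of an upstream OCR/layout stage."""
    kind: str       # assumed values: "heading", "paragraph", "list_item"
    text: str       # raw recognized text, may contain line breaks
    level: int = 1  # heading depth, only used when kind == "heading"


def blocks_to_markdown(blocks):
    """Render typed blocks as Markdown text."""
    lines = []
    prev_kind = None
    for block in blocks:
        # Collapse OCR line breaks and repeated whitespace into single spaces.
        text = " ".join(block.text.split())
        if not text:
            continue
        # Separate blocks with a blank line, except between adjacent list items.
        if lines and not (block.kind == "list_item" and prev_kind == "list_item"):
            lines.append("")
        if block.kind == "heading":
            level = min(max(block.level, 1), 6)  # Markdown allows # through ######
            lines.append("#" * level + " " + text)
        elif block.kind == "list_item":
            lines.append("- " + text)
        else:
            # Unknown kinds fall back to plain paragraphs.
            lines.append(text)
        prev_kind = block.kind
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    sample = [
        Block("heading", "Invoice  2024", level=1),
        Block("paragraph", "Total due\nwithin 30 days."),
        Block("list_item", "Item A"),
        Block("list_item", "Item B"),
    ]
    print(blocks_to_markdown(sample), end="")
```

Keeping this step free of OCR dependencies means it can be tested on its own, and the recognition stage can be swapped without touching the Markdown output.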