PDF-to-Markdown router. Per-page backend selection + confidence scoring for RAG ingestion.
Self-healing PDF extraction with per-page confidence scoring. Open-source LlamaParse alternative for RAG pipelines, MCP server for Claude Desktop, LangChain + LlamaIndex loaders. Ranked #2 on opendataloader-bench (0.900). The only PDF extractor that audits its own output. Catches blank pages, scrambled columns, broken tables — re-extracts them with a stronger backend. So your LLM gets clean data,…
Verification confirms publisher identity (repo ownership), not code safety. The security scan covers known CVEs and suspicious install scripts — it cannot prove the absence of malicious code.
Self-healing PDF extraction with per-page confidence scoring. Open-source LlamaParse alternative for RAG pipelines, MCP server for Claude Desktop, LangChain + LlamaIndex loaders. Ranked #2 on opendataloader-bench (0.900). The only PDF extractor that audits its own output. Catches blank pages, scrambled columns, broken tables — re-extracts them with a stronger backend. So your LLM gets clean data, not silent garbage. Routes each page to the best of 5 rule-based backends + BYOK LLM fallback…