io.github.NameetP/pdfmux

MCP · community
v1.6.4 · io.github.NameetP · Unknown · Updated 1mo ago · GitHub

PDF-to-Markdown router. Per-page backend selection + confidence scoring for RAG ingestion.

Automatically indexed from public sources. Not yet verified by the developer on Forge.
Last update: 1mo ago
Package
Author: io.github.NameetP
License: Unknown
Version: 1.6.4
Source: mcp-registry
Trust Status
B · 60/100 · Good
Listed in Forge index: +10/10
Publisher identity verified: +0/25
Publisher: run `forge publish` from the package repo to claim ownership
Ed25519 publish signature: +0/10
Included automatically when the publisher runs `forge publish`
Domain verification: +0/5
Publisher: host /.well-known/forge.json on the package homepage with { "publisher": "<github-login>" }
CVE scan · clean: +30/30
Static analysis · clean: +20/20
npm provenance (Sigstore): +0/5
Publish from GitHub Actions with the --provenance flag
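The domain verification step above can be sketched as a small script that writes the file into a site's static root. This is a minimal sketch: the function name `write_forge_manifest`, the `site_root` layout, and the example login are assumptions made for illustration. Only the path `/.well-known/forge.json` and the `{ "publisher": "<github-login>" }` shape come from the listing.

```python
import json
from pathlib import Path


def write_forge_manifest(site_root: Path, github_login: str) -> Path:
    """Write <site_root>/.well-known/forge.json and return its path.

    Hypothetical helper: Forge itself only asks that the file be served
    at /.well-known/forge.json on the package homepage.
    """
    target = site_root / ".well-known" / "forge.json"
    # Create the .well-known directory if the site does not have one yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    payload = {"publisher": github_login}
    target.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return target
```

Calling `write_forge_manifest(Path("public"), "<github-login>")` with the real login would place the file where a static host typically serves it, assuming `public` is the directory that host publishes.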
Status: Community-indexed
Publisher: Unverified
Signature: Unsigned
Domain
Provenance
Dependencies: Not audited
Tool surface
Security scan: ✓ Clean (v1.7.0 · 19d ago)
Evals: None
Indexed: Jun 13, 2026

Verification confirms publisher identity (repo ownership), not code safety. The security scan covers known CVEs and suspicious install scripts — it cannot prove the absence of malicious code.

About

Self-healing PDF extraction with per-page confidence scoring. Open-source LlamaParse alternative for RAG pipelines, with an MCP server for Claude Desktop and LangChain + LlamaIndex loaders. Ranked #2 on opendataloader-bench (0.900). The only PDF extractor that audits its own output: it catches blank pages, scrambled columns, and broken tables, then re-extracts them with a stronger backend, so your LLM gets clean data, not silent garbage. Routes each page to the best of 5 rule-based backends + BYOK LLM fallback…
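The per-page routing idea described above can be illustrated with a short sketch. This is not pdfmux's API: `PageResult`, `score_confidence`, `extract_page`, the 0.8 threshold, and the two toy backends are all hypothetical, and the scoring heuristic is a deliberately crude stand-in for whatever checks the real tool performs. It shows only the general pattern of trying a cheap backend first and escalating when the confidence score is low.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class PageResult:
    page: int
    text: str
    backend: str
    confidence: float


def score_confidence(text: str) -> float:
    """Toy heuristic (assumption): share of alphanumeric or whitespace
    characters in the stripped text; blank pages score 0.0."""
    stripped = text.strip()
    if not stripped:
        return 0.0
    ok = sum(ch.isalnum() or ch.isspace() for ch in stripped)
    return ok / len(stripped)


# A backend is a (name, extractor) pair; the extractor maps a page index to text.
Backend = Tuple[str, Callable[[int], str]]


def extract_page(page: int, backends: Sequence[Backend],
                 threshold: float = 0.8) -> PageResult:
    """Try backends in order, stop at the first result that meets the
    threshold, otherwise return the highest-scoring attempt."""
    best: Optional[PageResult] = None
    for name, extract in backends:
        text = extract(page)
        result = PageResult(page, text, name, score_confidence(text))
        if best is None or result.confidence > best.confidence:
            best = result
        if result.confidence >= threshold:
            break  # good enough, no need to escalate further
    if best is None:
        raise ValueError("at least one backend is required")
    return best


# Hypothetical backends: the fast one returns nothing for page 1.
fast: Backend = ("fast", lambda p: "" if p == 1 else "Plain body text")
strong: Backend = ("strong", lambda p: "Recovered table text")

results = [extract_page(p, [fast, strong]) for p in range(2)]
```

In this toy run, page 0 is accepted from the fast backend, while page 1 comes back blank and is re-extracted by the stronger one. Ordering backends from cheapest to most expensive keeps the costly path limited to pages that actually fail the check.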

Keywords
mcp