io.github.ArkNill/markgrab

MCP · community
v0.1.2 · io.github.ArkNill · License: Unknown · Updated 3mo ago · GitHub

Universal web content extraction — any URL to LLM-ready markdown. HTML, YouTube, PDF, DOCX.


Automatically indexed from public sources. Not yet verified by the developer on Forge.
Last update: 3mo ago
Package
Author: io.github.ArkNill
License: Unknown
Version: 0.1.2
Source: mcp-registry
Trust Status
B · 60/100 · Good
Listed in Forge index: +10/10
Publisher identity verified: +0/25
    Publisher: run `forge publish` from the package repo to claim ownership
Ed25519 publish signature: +0/10
    Included automatically when the publisher runs `forge publish`
Domain verification: +0/5
    Publisher: host /.well-known/forge.json on the package homepage with { "publisher": "<github-login>" }
CVE scan (clean): +30/30
Static analysis (clean): +20/20
npm provenance (Sigstore): +0/5
    Publish from GitHub Actions with the --provenance flag
Status: Community-indexed
Publisher: Unverified
Signature: Unsigned
Domain: (no value shown)
Provenance: (no value shown)
Dependencies: Not audited
Tool surface: (no value shown)
Security scan: ✓ Clean · v0.2.0 · 19d ago
Evals: None
Indexed: Jun 13, 2026

Verification confirms publisher identity (repo ownership), not code safety. The security scan covers known CVEs and suspicious install scripts — it cannot prove the absence of malicious code.

About

Universal web content extraction: any URL to LLM-ready markdown.

HTML: BeautifulSoup + content density filtering (removes nav, sidebar, ads)
YouTube: transcript extraction with timestamps
PDF: text extraction with page structure
DOCX: paragraph and heading extraction
Auto-fallback: tries lightweight httpx first, falls back to Playwright for JS-heavy pages
Async-first: built on httpx and Playwright async APIs

Optional extras for specific content types:

For HTML pages, if the initial httpx…
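
The auto-fallback behaviour described above can be illustrated with a minimal sketch. This is not markgrab's actual code: the function name `fetch_with_fallback`, the `MIN_TEXT_CHARS` threshold, and the rule "too little text means the page is JS-heavy" are all assumptions made for illustration. The two fetchers are passed in as plain async callables, standing in for an httpx-based fetch and a Playwright-based fetch, so the sketch runs without either library installed.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

# A fetcher takes a URL and returns the page text (hypothetical interface).
Fetcher = Callable[[str], Awaitable[str]]

# Hypothetical threshold: below this many characters we assume the page
# needs JavaScript rendering. The listing does not state the real rule.
MIN_TEXT_CHARS = 200


async def fetch_with_fallback(
    url: str,
    light_fetch: Fetcher,
    browser_fetch: Fetcher,
    min_chars: int = MIN_TEXT_CHARS,
) -> Tuple[str, str]:
    """Try the lightweight fetcher first, then fall back to the browser one.

    Returns (text, source) where source is "light" or "browser".
    """
    try:
        text = await light_fetch(url)
    except Exception:
        # Any failure of the cheap path is treated as "no usable content".
        text = ""

    if len(text.strip()) >= min_chars:
        return text, "light"

    # Too little content: assume a JS-heavy page and use the heavier fetcher.
    return await browser_fetch(url), "browser"


async def _demo() -> None:
    # Stub fetchers so the example runs offline.
    async def thin_page(url: str) -> str:
        return "<div id='root'></div>"

    async def rendered_page(url: str) -> str:
        return "Rendered article text. " * 20

    text, source = await fetch_with_fallback(
        "https://example.com", thin_page, rendered_page
    )
    print(source, len(text))


if __name__ == "__main__":
    asyncio.run(_demo())
```

Injecting the fetchers keeps the decision logic testable on its own; in a real setup the first callable would wrap an `httpx.AsyncClient` request and the second a Playwright page load.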

Keywords
mcp