
Go-Docs MCP

Install and Go — your AI reads any document

macOS · Linux · MIT License · Open Source · MCP Server

The Problem

Every document MCP server needs Node.js or Python. They only handle one format — and none of them do OCR, table extraction, or image reading.

The Solution

go install and you’re done. One binary, 12 tools. PDF, DOCX, Markdown, images, OCR — no runtime, no config.

Why go-docs-mcp?

The MCP ecosystem is drowning in Node and Python. A Go option stands out simply by being different. People who run lean infrastructure — self-hosters, DevOps, terminal users — actively prefer compiled binaries over interpreted runtimes. If you already have Go, this is the fastest path from zero to document-reading AI.

| Node / TS MCPs | Python MCPs | go-docs-mcp |
| --- | --- | --- |
| Requires Node.js | Requires Python + pip | Single binary, no runtime |
| 📄 Single format only | Single format only | PDF + TXT + MD + DOCX + CSV + images |
| 👁 No OCR | No OCR | OCR for scanned PDFs + images |
| 📊 Limited tables | Basic tables | Tables + outline + images + caching |
| 🔒 Security varies | Security varies | Read-only, directory-locked |
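
To make the "directory-locked" row concrete, here is a minimal sketch of how a server could confine every requested path to one base directory. It is an illustration, not go-docs-mcp's actual code: the function name `resolveInside`, the error value, and the `/srv/docs` directory are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errOutsideBase is a hypothetical error for requests that leave the base directory.
var errOutsideBase = errors.New("path escapes the documents directory")

// resolveInside joins a client-supplied path onto baseDir and rejects any
// result that lands outside it. The check is purely lexical: symlinks are
// not resolved, so a real server would need an extra step for those.
func resolveInside(baseDir, requested string) (string, error) {
	base, err := filepath.Abs(baseDir)
	if err != nil {
		return "", err
	}
	// Join also cleans "." and ".." segments.
	full := filepath.Join(base, requested)
	rel, err := filepath.Rel(base, full)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errOutsideBase
	}
	return full, nil
}

func main() {
	for _, p := range []string{"reports/q3.pdf", "../secrets.txt"} {
		full, err := resolveInside("/srv/docs", p)
		fmt.Printf("%q -> %q (err: %v)\n", p, full, err)
	}
}
```

Comparing the relative path against `..` avoids the classic prefix bug where `/srv/docs-private` would pass a naive `strings.HasPrefix(full, "/srv/docs")` test.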

Features

📄

Read Any Document

Extract full text from PDF, TXT, MD, CSV, DOCX, and images (PNG/JPG/TIFF) with page-level granularity. Your AI reads any document format from a single server.

🔍

Search

Full-text search within documents with contextual results. Find exactly what you need across hundreds of pages.
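
As a rough picture of what "contextual results" can mean, the sketch below returns each hit together with a few bytes of surrounding text. It assumes a simple case-sensitive substring search; the `match` type, `searchWithContext`, and the `radius` parameter are hypothetical names, and the real tool may rank or format results differently.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// match is one search hit with its surrounding text.
type match struct {
	Offset  int    // byte offset of the hit in the source text
	Context string // the hit plus up to radius bytes on each side
}

// searchWithContext finds every non-overlapping, case-sensitive occurrence
// of query in text and attaches a window of context to each one.
func searchWithContext(text, query string, radius int) []match {
	var hits []match
	if query == "" {
		return hits
	}
	from := 0
	for {
		i := strings.Index(text[from:], query)
		if i < 0 {
			break
		}
		start := from + i
		end := start + len(query)
		lo, hi := start-radius, end+radius
		if lo < 0 {
			lo = 0
		}
		if hi > len(text) {
			hi = len(text)
		}
		// Widen the window so a multi-byte UTF-8 character is never cut in half.
		for lo > 0 && !utf8.RuneStart(text[lo]) {
			lo--
		}
		for hi < len(text) && !utf8.RuneStart(text[hi]) {
			hi++
		}
		hits = append(hits, match{Offset: start, Context: text[lo:hi]})
		from = end
	}
	return hits
}

func main() {
	for _, h := range searchWithContext("alpha beta gamma beta", "beta", 3) {
		fmt.Printf("%d: %q\n", h.Offset, h.Context)
	}
}
```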

📷

Image Extraction

Extract embedded images from document pages. Diagrams, charts, photos — pulled out and ready for analysis.

🌐

URL Fetch

Fetch and read documents from URLs. Your AI grabs a document from the web, caches it locally, and reads it like any local file.
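
One plausible way to implement "fetch, cache locally, read" is to name the cached file after a hash of the URL, so the same URL always maps to the same local path. The sketch below assumes that scheme; `cachePath`, `fetchCached`, and the example URL are hypothetical, and go-docs-mcp may organize its cache differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"path"
	"path/filepath"
)

// cachePath maps a URL to a stable file name inside cacheDir: the SHA-256 of
// the URL plus the file extension taken from the URL path.
func cachePath(cacheDir, rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256([]byte(rawURL))
	return filepath.Join(cacheDir, hex.EncodeToString(sum[:])+path.Ext(u.Path)), nil
}

// fetchCached downloads rawURL unless a cached copy already exists, and
// returns the local file path either way.
func fetchCached(cacheDir, rawURL string) (string, error) {
	dst, err := cachePath(cacheDir, rawURL)
	if err != nil {
		return "", err
	}
	if _, err := os.Stat(dst); err == nil {
		return dst, nil // cache hit
	}
	resp, err := http.Get(rawURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("fetch %s: %s", rawURL, resp.Status)
	}
	if err := os.MkdirAll(cacheDir, 0o755); err != nil {
		return "", err
	}
	// Download to a temp file first so a failed transfer never leaves a
	// partial cache entry behind.
	tmp, err := os.CreateTemp(cacheDir, "download-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly after a successful rename
	if _, err := io.Copy(tmp, resp.Body); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}
	if err := os.Rename(tmp.Name(), dst); err != nil {
		return "", err
	}
	return dst, nil
}

func main() {
	// Only the path mapping is shown here; fetchCached would hit the network.
	p, err := cachePath(os.TempDir(), "https://example.com/manual.pdf")
	fmt.Println(p, err)
}
```

Keeping the original extension on the cached file lets the rest of the pipeline treat it like any other local document.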

🔒

Security

All processing happens locally. No documents leave your machine. No cloud APIs, no data exfiltration risk.

Fast Caching

Parsed documents are cached for instant repeat access. The first read extracts the content; subsequent reads are served from the cache with near-zero latency.
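
The idea can be sketched as an in-memory map keyed by file identity. This is an assumption about how such a cache might work, not a description of go-docs-mcp internals: `entryKey`, `textCache`, and `keyFor` are hypothetical names, and keying on modification time plus size is just one common way to notice that a file changed.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// entryKey identifies one version of a file. If the file changes on disk,
// its modification time or size changes and the old entry is never hit again.
type entryKey struct {
	path    string
	modUnix int64
	size    int64
}

// keyFor builds the cache key for a file from its current metadata.
func keyFor(path string) (entryKey, error) {
	info, err := os.Stat(path)
	if err != nil {
		return entryKey{}, err
	}
	return entryKey{path: path, modUnix: info.ModTime().UnixNano(), size: info.Size()}, nil
}

// textCache stores extracted text and is safe for concurrent use.
type textCache struct {
	mu      sync.Mutex
	entries map[entryKey]string
}

func newTextCache() *textCache {
	return &textCache{entries: make(map[entryKey]string)}
}

// get returns the cached text for key and calls extract only on a miss.
// Holding the lock during extraction keeps the sketch short but serializes
// concurrent extractions.
func (c *textCache) get(key entryKey, extract func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if text, ok := c.entries[key]; ok {
		return text, nil
	}
	text, err := extract()
	if err != nil {
		return "", err
	}
	c.entries[key] = text
	return text, nil
}

func main() {
	cache := newTextCache()
	calls := 0
	extract := func() (string, error) { calls++; return "extracted text", nil }
	key := entryKey{path: "report.pdf", modUnix: 1, size: 42}
	cache.get(key, extract)
	cache.get(key, extract)
	fmt.Println("extractions:", calls) // prints "extractions: 1"
}
```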

👁

OCR

Read scanned PDFs and image files (PNG, JPG, TIFF) via OCR. OCR runs automatically as a fallback for image-based documents, and you can force it when needed.

Architecture

go-docs-mcp is a single Go binary that communicates via stdio using the Model Context Protocol. It delegates to poppler-utils, tesseract, and pandoc for format-specific extraction.

# Data flow
🤖 AI (Claude, ChatGPT, etc.) → MCP (stdio) → go-docs-mcp (~10MB)
        ↓
📄 PDF (poppler)   📝 TXT/MD/CSV   🗎 DOCX (pandoc)   📷 Images (tesseract OCR)
        ↓
Parsed → cached → AI
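
The routing step in this data flow can be pictured as a lookup from file extension to extraction backend. The sketch below is an assumption about the shape of that step, not the project's source: `backendFor` is a hypothetical name, and "native" is a made-up label for formats the diagram shows without an external tool.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// backendFor reports which extractor the data flow would hand a file to,
// based only on its extension. The second result is false for unsupported
// formats.
func backendFor(name string) (string, bool) {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".pdf":
		return "poppler", true
	case ".txt", ".md", ".csv":
		return "native", true // hypothetical label: read directly, no external tool
	case ".docx":
		return "pandoc", true
	case ".png", ".jpg", ".tiff":
		return "tesseract", true
	}
	return "", false
}

func main() {
	for _, f := range []string{"Scan.PDF", "notes.md", "letter.docx", "photo.png", "tool.exe"} {
		backend, ok := backendFor(f)
		fmt.Printf("%-12s supported=%-5v backend=%s\n", f, ok, backend)
	}
}
```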

12 tools in 5 categories: Discovery (2), Reading (3), Search (1), Analysis (4), OCR (2).

12 MCP Tools

A comprehensive set of tools covering multi-format document reading, search, and extraction.

| Tool | Description |
| --- | --- |
| list_documents | List all documents in the configured directory with format detection |
| read_document | Read full text or specific pages from any supported document |
| search_document | Search within a document for text with contextual results |
| get_document_summary | Get a summary of the document structure and content |
| get_document_metadata | Extract title, author, dates, page count, and format info |
| get_document_outline | Extract document outline: headings, TOC, structure |
| extract_tables | Extract table structures from documents |
| extract_images | Extract embedded images from document pages |
| read_url | Fetch a document from a URL, cache locally, and read it |
| ocr_document | Force OCR on scanned PDFs or image-based documents |
| read_image | OCR standalone images (PNG, JPG, TIFF) |
| list_formats | Show supported formats and installed dependencies |
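
For a feel of what an AI client sends when it invokes one of these tools, the sketch below builds an MCP `tools/call` request as JSON-RPC 2.0. The envelope follows the MCP specification, but the argument names `path` and `query` are assumptions made for illustration; the real parameter names come from each tool's input schema, which this page does not list.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the params object of an MCP tools/call request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// request is a JSON-RPC 2.0 request envelope.
type request struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// newToolCall serializes a tools/call request for the given tool.
func newToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	// "path" and "query" are hypothetical argument names.
	msg, err := newToolCall(1, "search_document", map[string]any{
		"path":  "manual.pdf",
		"query": "warranty",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msg))
}
```

Over the stdio transport, a client would write a message like this as a single line to the server's standard input and read the response from its standard output.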

Quick Start

Install with one command, configure in 30 seconds.

# 1. Install
go install github.com/drolosoft/go-docs-mcp@latest

# 2. Add to your MCP config (Claude Desktop, etc.)
{
  "mcpServers": {
    "docs": {
      "command": "go-docs-mcp",
      "env": {
        "DOCS_MCP_DIR": "/path/to/your/documents"
      }
    }
  }
}

# 3. Restart your AI client and you're done

Requirements

Installation requires a Go toolchain. Format-specific dependencies (poppler-utils for PDF, tesseract for OCR, pandoc for DOCX) are optional; install only the ones you need.

Built With

A single Go binary with optional dependencies per format.

  • Go 1.25+ — Single binary, cross-platform compilation
  • MCP SDK (Go) — Model Context Protocol via stdio
  • poppler-utils — PDF text, image, and metadata extraction
  • tesseract + pandoc — OCR for images/scans, DOCX conversion

Ready to give your AI access to any document?

go-docs-mcp is free, open source, and installs in one command.
