Go-Docs MCP

Name: go-docs-mcp
Author: Drolosoft

Install and Go — your AI reads any document

Platforms: macOS, Linux
License: MIT (open source)
Type: MCP Server

The Problem

Most document MCP servers need Node.js or Python. They typically handle a single format, offer basic table extraction at best, and do no OCR or image reading.

The Solution

Run go install and you’re done. One binary, 13 tools. PDF, DOCX, Markdown, images, and OCR, with no Node.js or Python runtime and minimal config.

Why Go-Docs MCP?

The MCP ecosystem is drowning in Node and Python. A Go option stands out simply by being different. People who run lean infrastructure — self-hosters, DevOps, terminal users — actively prefer compiled binaries over interpreted runtimes. If you already have Go, this is the fastest path from zero to document-reading AI.

Aspect	Node / TS MCPs	Python MCPs	go-docs-mcp
⚡ Runtime	Requires Node.js	Requires Python + pip	Single binary, no runtime
📄 Formats	Single format only	Single format only	PDF + TXT + MD + DOCX + CSV + images
👁 OCR	No OCR	No OCR	OCR for scanned PDFs + images
📊 Extraction	Limited tables	Basic tables	Tables + outline + images + caching
🔒 Security	Security varies	Security varies	Read-only, directory-locked

Features

📄

Read Any Document

Extract full text from PDF, TXT, MD, CSV, DOCX, and images (PNG/JPG/TIFF) with page-level granularity. Your AI reads any document format from a single server.
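As a rough illustration of page-level granularity, the sketch below splits extracted text into pages. It assumes the text comes from poppler's pdftotext, which ends each page with a form feed character. The helper names splitPages and selectPages are hypothetical and are not taken from the go-docs-mcp source.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPages splits extracted text into pages on form feed characters.
// pdftotext ends every page with a form feed, so the trailing empty
// chunk after the last page is dropped.
func splitPages(raw string) []string {
	pages := strings.Split(raw, "\f")
	if n := len(pages); n > 0 && strings.TrimSpace(pages[n-1]) == "" {
		pages = pages[:n-1]
	}
	return pages
}

// selectPages returns pages first..last (1-based, inclusive).
func selectPages(pages []string, first, last int) ([]string, error) {
	if first < 1 || last < first || last > len(pages) {
		return nil, fmt.Errorf("page range %d-%d is outside 1-%d", first, last, len(pages))
	}
	return pages[first-1 : last], nil
}

func main() {
	raw := "Page one text\fPage two text\fPage three text\f"
	pages := splitPages(raw)
	middle, err := selectPages(pages, 2, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d pages, page 2: %q\n", len(pages), middle[0])
}
```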

🔍

Search

Full-text search within documents with contextual results. Find exactly what you need across hundreds of pages.
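One plausible shape for "contextual results" is a hit plus a window of surrounding text. The sketch below shows a case-insensitive search over per-page text. The Match type, the searchPages function, and the radius parameter are illustrative assumptions, not the server's actual search_document implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Match is a hypothetical result type: the 1-based page of a hit and a
// snippet of the text around it.
type Match struct {
	Page    int
	Snippet string
}

// indexRunes returns the first index >= from where needle occurs in hay, or -1.
func indexRunes(hay, needle []rune, from int) int {
	for i := from; i+len(needle) <= len(hay); i++ {
		found := true
		for j := range needle {
			if hay[i+j] != needle[j] {
				found = false
				break
			}
		}
		if found {
			return i
		}
	}
	return -1
}

// searchPages finds every case-insensitive occurrence of query and returns
// it with up to radius characters of context on each side.
func searchPages(pages []string, query string, radius int) []Match {
	var matches []Match
	needle := []rune(strings.ToLower(query))
	if len(needle) == 0 {
		return matches
	}
	if radius < 0 {
		radius = 0
	}
	for i, page := range pages {
		orig := []rune(page)
		lower := []rune(strings.ToLower(page))
		if len(lower) != len(orig) {
			orig = lower // defensive: keep offsets valid if lowering changed the length
		}
		for pos := indexRunes(lower, needle, 0); pos >= 0; pos = indexRunes(lower, needle, pos+len(needle)) {
			start := pos - radius
			if start < 0 {
				start = 0
			}
			end := pos + len(needle) + radius
			if end > len(orig) {
				end = len(orig)
			}
			matches = append(matches, Match{Page: i + 1, Snippet: string(orig[start:end])})
		}
	}
	return matches
}

func main() {
	pages := []string{"Invoice total: 42 EUR", "No match here", "The TOTAL is final"}
	for _, m := range searchPages(pages, "total", 8) {
		fmt.Printf("page %d: ...%s...\n", m.Page, m.Snippet)
	}
}
```

Working on runes rather than bytes keeps snippets from cutting a multi-byte character in half.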

📷

Image Extraction

Extract embedded images from document pages. Diagrams, charts, photos — pulled out and ready for analysis.

🌐

URL Fetch

Fetch and read documents from URLs. Your AI grabs a document from the web, caches it locally, and reads it like any local file.
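A fetched document needs a stable local name so that a repeat request for the same URL hits the cache. The sketch below derives one from a hash of the URL and keeps the file extension so format detection still works. The cachePathFor helper, the cache directory, and the naming scheme are all assumptions made for illustration. The source does not describe how read_url names its cache entries.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/url"
	"path"
	"path/filepath"
)

// cachePathFor maps a URL to a deterministic file path inside cacheDir.
// Only http and https URLs are accepted in this sketch.
func cachePathFor(cacheDir, rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	sum := sha256.Sum256([]byte(rawURL))
	// A short hash prefix plus the original extension, e.g. "<hex>.pdf".
	name := hex.EncodeToString(sum[:8]) + path.Ext(u.Path)
	return filepath.Join(cacheDir, name), nil
}

func main() {
	p, err := cachePathFor("/tmp/docs-cache", "https://example.com/papers/report.pdf")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("cache file:", p)
}
```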

🔒

Security

All processing happens locally. Documents never leave your machine: there are no cloud APIs, and access is read-only and locked to the configured directory.
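A directory lock usually comes down to rejecting any requested path that resolves outside the configured root. The sketch below shows one common way to do that with the standard library. The resolveInside function is hypothetical and is not taken from the go-docs-mcp source. It does not follow symlinks, which a production check would also need to handle (for example with filepath.EvalSymlinks).

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errOutsideRoot = errors.New("path escapes the documents directory")

// resolveInside joins requested onto root and rejects any result that
// lands outside root, such as paths built from ".." segments.
func resolveInside(root, requested string) (string, error) {
	absRoot, err := filepath.Abs(root)
	if err != nil {
		return "", err
	}
	candidate := filepath.Join(absRoot, requested) // Join also cleans ".." segments
	rel, err := filepath.Rel(absRoot, candidate)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errOutsideRoot
	}
	return candidate, nil
}

func main() {
	for _, name := range []string{"reports/q1.pdf", "../secrets.txt", "a/../../etc/passwd"} {
		p, err := resolveInside("/path/to/your/documents", name)
		if err != nil {
			fmt.Printf("%-22s rejected: %v\n", name, err)
			continue
		}
		fmt.Printf("%-22s ok: %s\n", name, p)
	}
}
```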

⚡

Fast Caching

Parsed documents are cached for fast repeat access. The first read runs the extraction; subsequent reads are served from the cache.
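The caching idea can be sketched as a map from file path to extracted text, invalidated when the file's modification time or size changes. This is a minimal in-memory illustration under that assumption. The Cache type and its Get method are hypothetical, and the source does not say whether the real cache lives in memory or on disk.

```go
package main

import (
	"fmt"
	"sync"
)

// entry stores extracted text along with the file stamp it was built from.
type entry struct {
	modUnixNano int64
	size        int64
	text        string
}

// Cache is a hypothetical in-memory cache of extracted document text.
type Cache struct {
	mu      sync.Mutex
	entries map[string]entry
}

func NewCache() *Cache {
	return &Cache{entries: make(map[string]entry)}
}

// Get returns the cached text for path when the stamp (modification time
// and size) is unchanged. Otherwise it calls extract and stores the result.
func (c *Cache) Get(path string, modUnixNano, size int64, extract func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[path]; ok && e.modUnixNano == modUnixNano && e.size == size {
		return e.text, nil
	}
	text, err := extract()
	if err != nil {
		return "", err
	}
	c.entries[path] = entry{modUnixNano: modUnixNano, size: size, text: text}
	return text, nil
}

func main() {
	cache := NewCache()
	calls := 0
	extract := func() (string, error) {
		calls++ // stands in for an expensive pdftotext or OCR run
		return "extracted text", nil
	}
	cache.Get("report.pdf", 1000, 2048, extract)
	cache.Get("report.pdf", 1000, 2048, extract) // served from cache
	cache.Get("report.pdf", 2000, 4096, extract) // file changed, extracted again
	fmt.Println("extract calls:", calls)
}
```

Holding the lock during extraction keeps the sketch short but serializes all reads; a real server would likely lock per document.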

👁

OCR

Read scanned PDFs and image files (PNG, JPG, TIFF) via OCR. OCR kicks in automatically for image-based documents, and you can force it when needed.

Architecture

go-docs-mcp is a single Go binary that communicates via stdio using the Model Context Protocol. It delegates to poppler-utils, tesseract, and pandoc for format-specific extraction.

# Data flow
🤖 AI (Claude, ChatGPT, etc.)  →  MCP (stdio)  →  go-docs-mcp (~10MB)
                                                         ↓
              📄 PDF (poppler)  📝 TXT/MD/CSV  🗎 DOCX (pandoc)  📷 Images (tesseract OCR)
                                                         ↓
                                               Parsed → cached → AI
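The delegation step in the data flow above can be pictured as a dispatch on file extension. The sketch below maps each format to the external command that would extract its text. The extractorCommand function and the exact flags are assumptions chosen for illustration; the source names only the tools (poppler-utils, tesseract, pandoc), not how they are invoked.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// extractorCommand returns the argv of an external tool that writes the
// document's text to stdout. A nil argv with a nil error means the file
// is plain text and can be read directly.
func extractorCommand(docPath string) ([]string, error) {
	switch strings.ToLower(filepath.Ext(docPath)) {
	case ".pdf":
		// pdftotext (poppler-utils): "-" sends the text to stdout.
		return []string{"pdftotext", "-layout", docPath, "-"}, nil
	case ".docx":
		return []string{"pandoc", "--from=docx", "--to=plain", docPath}, nil
	case ".png", ".jpg", ".jpeg", ".tif", ".tiff":
		// tesseract: an output base of "stdout" prints the OCR result.
		return []string{"tesseract", docPath, "stdout"}, nil
	case ".txt", ".md", ".csv":
		return nil, nil
	default:
		return nil, fmt.Errorf("unsupported format: %q", filepath.Ext(docPath))
	}
}

func main() {
	for _, doc := range []string{"report.pdf", "notes.md", "scan.PNG", "letter.docx", "movie.mp4"} {
		argv, err := extractorCommand(doc)
		switch {
		case err != nil:
			fmt.Printf("%-12s %v\n", doc, err)
		case argv == nil:
			fmt.Printf("%-12s read directly\n", doc)
		default:
			fmt.Printf("%-12s %s\n", doc, strings.Join(argv, " "))
		}
	}
}
```

The returned argv can be passed to exec.Command(argv[0], argv[1:]...) in a real caller.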

The 13 tools fall into 5 categories: Discovery, Reading, Search, Analysis, and OCR.

13 MCP Tools

A comprehensive set of tools covering multi-format document reading, search, and extraction.

Tool	Description
`list_documents`	List all documents in the configured directory with format detection
`read_document`	Read full text or specific pages from any supported document
`search_document`	Search within a document for text with contextual results
`get_document_summary`	Get a summary of the document structure and content
`get_document_metadata`	Extract title, author, dates, page count, and format info
`get_document_outline`	Extract document outline — headings, TOC, structure
`extract_tables`	Extract table structures from documents
`extract_images`	Extract embedded images from document pages
`read_url`	Fetch a document from a URL, cache locally, and read it
`ocr_document`	Force OCR on scanned PDFs or image-based documents
`read_image`	OCR standalone images (PNG, JPG, TIFF)
`list_formats`	Show supported formats and installed dependencies
`convert_to_markdown`	Convert any document to clean Markdown format

Quick Start

Install with one command, configure in 30 seconds.

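# 1. Install (the binary lands in $GOBIN, or $(go env GOPATH)/bin if GOBIN is unset; that directory must be on your PATH)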
go install github.com/drolosoft/go-docs-mcp@latest

# 2. Add to your MCP config (Claude Desktop, etc.)
{
  "mcpServers": {
    "docs": {
      "command": "go-docs-mcp",
      "env": {
        "DOCS_MCP_DIR": "/path/to/your/documents"
      }
    }
  }
}

# 3. Restart your AI client — done

Requirements

Installation requires a Go toolchain. Format-specific dependencies (poppler-utils, tesseract, pandoc) are optional; install only the ones you need.
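Because the format-specific dependencies are optional, it helps to check which ones are actually installed. The sketch below probes PATH for each external binary, roughly the kind of check a tool like list_formats could perform. The feature names, the toolForFeature map, and the detectFeatures function are assumptions, not the server's real logic.

```go
package main

import (
	"fmt"
	"os/exec"
)

// toolForFeature maps a capability to the external binary it relies on.
// pdftotext ships with poppler-utils.
var toolForFeature = map[string]string{
	"pdf":  "pdftotext",
	"ocr":  "tesseract",
	"docx": "pandoc",
}

// detectFeatures reports which capabilities are usable, given a function
// that says whether a binary is installed.
func detectFeatures(installed func(binary string) bool) map[string]bool {
	available := make(map[string]bool, len(toolForFeature))
	for feature, binary := range toolForFeature {
		available[feature] = installed(binary)
	}
	return available
}

// onPath reports whether binary can be found on the current PATH.
func onPath(binary string) bool {
	_, err := exec.LookPath(binary)
	return err == nil
}

func main() {
	for feature, ok := range detectFeatures(onPath) {
		fmt.Printf("%-5s available: %v\n", feature, ok)
	}
}
```

Passing the lookup as a function keeps the check easy to test without the tools installed.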

Built With

A single Go binary with optional dependencies per format.

Go 1.25+ — Single binary, cross-platform compilation
MCP SDK (Go) — Model Context Protocol via stdio
poppler-utils — PDF text, image, and metadata extraction
tesseract + pandoc — OCR for images/scans, DOCX conversion

I build tools I wish existed, then give them away.
If one of them saved you time, a coffee keeps the next one coming.

Ready to give your AI access to any document?

go-docs-mcp is free, open source, and installs in one command.
