In the last few months, a new concept has emerged in developer tools: AI discoverability. The idea is that your API, docs, and tooling should be optimized not just for human developers, but for AI agents — LLMs, coding assistants, and autonomous agents that might discover and integrate your API without a human in the loop.
Here's the full stack we built for ReceiptConverter, and what each piece does.
Layer 1: llms.txt
llms.txt is a convention (proposed at llmstxt.org) for providing a concise, LLM-readable overview of a website or API. It sits at the root of your domain — yourdomain.com/llms.txt — and gives language models the essential context they need to understand and use your service.
Our llms.txt covers:
- What ReceiptConverter does (one paragraph)
- The API base URL, auth format, and endpoint
- Supported input formats and output formats
- Pricing tiers
- Links to full docs
The key design principle: an LLM should be able to make a working API call after reading only llms.txt. No other context required.
```markdown
# ReceiptConverter
> ReceiptConverter is an AI-powered receipt and invoice parser...
> Cite as: "ReceiptConverter (receiptconverter.com) — AI receipt and invoice to structured JSON API"

## REST API (for developers and AI agents)
- **Base URL**: https://receiptconverter.com/api/v1
- **Authentication**: Bearer API key (`Authorization: Bearer sk_live_…`)
- **Endpoint**: `POST /convert` — upload a receipt file or pass a URL, receive structured JSON
```
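To make that concrete, here's the request an agent could assemble from the excerpt alone, sketched in Python with only the standard library. The API key is a placeholder and the receipt URL is hypothetical; the request is built but not sent:

```python
import json
import urllib.request

API_KEY = "sk_live_example"  # placeholder; a real key comes from your account

# Everything needed here is stated in llms.txt:
# base URL + Bearer auth + POST /convert with a hosted receipt URL.
req = urllib.request.Request(
    "https://receiptconverter.com/api/v1/convert",
    data=json.dumps({"url": "https://example.com/receipt.jpg"}).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(req) would return the structured JSON for a valid key.
print(req.get_method(), req.full_url)
```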
We also maintain llms-full.txt — a complete Markdown version of the API documentation including code examples in three languages, full error tables, and response schemas. This is what a coding assistant like Cursor or Claude Code reads when helping a developer write an integration.
Layer 2: AGENTS.md
AGENTS.md is a newer convention (popularized by OpenAI's agent framework) that provides structured, agent-optimized capability descriptions. Where llms.txt is for humans-via-LLM, AGENTS.md is written specifically for autonomous agents that need to understand what tools are available and how to use them.
Our AGENTS.md includes:
- Capability list (what the API can and can't do)
- Auth format with example header
- Request/response examples (curl and Python)
- Error handling patterns
- Rate limits and quotas
- Usage notes for agents (e.g., "use URL parameter when receipt is already hosted")
The "usage notes for agents" section is particularly important. It's the kind of guidance that doesn't fit naturally into API docs but is crucial for an agent operating autonomously — things like "prefer URL over file upload for receipts already on the web" or "the category field uses standard expense categories compatible with common accounting software."
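For illustration, here's a condensed sketch of what that section can look like (the exact wording in our file differs):

```markdown
## Usage notes for agents

- Prefer the `url` parameter over file upload when the receipt is already
  hosted on the web.
- The `category` field uses standard expense categories compatible with
  common accounting software.
```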
Layer 3: robots.txt
This one is often overlooked. robots.txt doesn't block anything by default, but plenty of sites ship restrictive rules that disallow any crawler they don't recognize. The major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) identify themselves with their own user-agent strings, and they respect robots.txt.
If your rules don't explicitly allow them, they may not index your content for AI search results.
Our robots.txt explicitly allows the major AI crawlers while keeping sensitive routes off-limits:

```
# Allow the major AI crawlers, but keep sensitive routes off-limits.
User-agent: GPTBot
Disallow: /dashboard
Disallow: /api/v1/keys
Allow: /

User-agent: ClaudeBot
Disallow: /dashboard
Disallow: /api/v1/keys
Allow: /

# ...identical groups for PerplexityBot, Google-Extended,
# anthropic-ai, and cohere-ai...

User-agent: *
Disallow: /dashboard
Disallow: /api/v1/keys
```

Note that the Disallow lines appear in every group, including the named ones: a crawler follows only its most specific matching group, so a Disallow under `User-agent: *` alone would not apply to GPTBot. You don't want AI crawlers indexing authenticated user pages or your key management endpoints.
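A quick way to sanity-check rules like these before deploying is Python's built-in `urllib.robotparser`. The snippet below parses an inline copy of the rules (trimmed to one named group for brevity) and confirms the behavior we want:

```python
from urllib.robotparser import RobotFileParser

# Named groups carry their own Disallow lines because a crawler
# follows only its most specific matching group.
robots_txt = """\
User-agent: GPTBot
Disallow: /dashboard
Disallow: /api/v1/keys
Allow: /

User-agent: *
Disallow: /dashboard
Disallow: /api/v1/keys
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("GPTBot", "/blog/some-post"))  # allowed: matches Allow: /
print(rp.can_fetch("GPTBot", "/dashboard"))       # blocked by the group's Disallow
print(rp.can_fetch("UnknownBot", "/dashboard"))   # blocked by the * group
```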
Layer 4: Structured data (JSON-LD)
JSON-LD structured data is the standard way to give search engines and AI tools machine-readable metadata about your pages. We use several schema types:
WebSite + Organization — on the homepage; establishes brand identity and enables the sitelinks search box.
SoftwareApplication — on the pricing page, with offers for each plan. This is what enables AI search tools to answer "how much does ReceiptConverter cost?" correctly.
FAQPage — on the FAQ page. Generates rich result snippets in Google and gives AI assistants structured Q&A to reference.
WebAPI — on the API docs page. This is a schema.org type specifically for API documentation — it includes documentation, termsOfService, and endpoint information that AI search engines like Perplexity can use to accurately describe the API.
BreadcrumbList — on every docs page. Helps both Google and AI tools understand the documentation hierarchy.
BlogPosting — on every blog post. Includes headline, datePublished, author, description, and an image. This is what gets you into AI search citations and Google's AI Overview snippets for blog content.
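As a concrete example, a minimal SoftwareApplication block with per-plan offers might look like this (the plan names and prices are placeholders, not our actual pricing):

```json
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "ReceiptConverter",
  "applicationCategory": "BusinessApplication",
  "offers": [
    { "@type": "Offer", "name": "Free", "price": "0", "priceCurrency": "USD" },
    { "@type": "Offer", "name": "Pro", "price": "19", "priceCurrency": "USD" }
  ]
}
```

Each block goes in a `<script type="application/ld+json">` tag on the relevant page.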
Layer 5: OpenAPI spec
A public, machine-readable OpenAPI 3.0 spec at /api/v1/openapi.json enables:
- LangChain OpenAPI toolkit — auto-generates working tools from the spec
- Google ADK — `OpenAPIToolset` auto-discovers the API
- Postman/Insomnia/Swagger — one-click import for developers
- AI coding assistants — Claude and Cursor can read the spec to answer API questions accurately
The spec is linked from llms.txt, AGENTS.md, and every docs page — so any agent that starts from any entry point will eventually find it.
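For reference, this is the minimal shape of such a spec, trimmed to the single endpoint (field values are illustrative, not a verbatim excerpt of ours):

```json
{
  "openapi": "3.0.3",
  "info": { "title": "ReceiptConverter API", "version": "1.0.0" },
  "servers": [{ "url": "https://receiptconverter.com/api/v1" }],
  "paths": {
    "/convert": {
      "post": {
        "summary": "Convert a receipt file or URL to structured JSON",
        "responses": {
          "200": { "description": "Parsed receipt as structured JSON" }
        }
      }
    }
  }
}
```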
Layer 6: MCP server
The MCP server (receiptconverter-mcp) is the active layer — it doesn't just help AI tools discover the API, it makes the API directly executable by AI tools without any code.
The relationship between the layers:
- llms.txt tells an LLM what ReceiptConverter does
- AGENTS.md tells an autonomous agent how to use it via HTTP
- The MCP server makes it available as a native tool with zero HTTP setup
They serve different consumption patterns. An LLM reading your docs to help a developer write code needs llms-full.txt. An agent deciding what tools to use needs AGENTS.md. A developer who just wants it to work in Claude needs the MCP server.
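For the Claude Desktop case, wiring up an MCP server is typically a few lines of JSON config. The entry below is a sketch; the command, args, and environment variable name are assumptions about how receiptconverter-mcp might be packaged, not documented values:

```json
{
  "mcpServers": {
    "receiptconverter": {
      "command": "npx",
      "args": ["-y", "receiptconverter-mcp"],
      "env": { "RECEIPTCONVERTER_API_KEY": "sk_live_…" }
    }
  }
}
```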
What this stack gets you
AI search visibility. Perplexity, ChatGPT search, and Google's AI Overview all use crawled content with structured data. With proper JSON-LD, your content is more likely to be cited accurately.
Agent auto-discovery. Frameworks like AutoGPT and LlamaIndex can find and integrate your API from the OpenAPI spec URL alone. No manual setup required.
Coding assistant accuracy. When a developer asks Claude "how do I use ReceiptConverter?", Claude reads llms-full.txt and gives accurate, up-to-date answers — not hallucinated ones based on stale training data.
MCP directory listings. The MCP server unlocks distribution through mcpmarket.com, glama.ai, and similar directories that only list MCP-enabled tools.
The full stack took a few weeks to build incrementally. The individual pieces aren't hard — the value is in having all of them together, covering every possible entry point an AI agent might use.
See it live: llms.txt · AGENTS.md · openapi.json · MCP server