In the last few months, a new concept has emerged in developer tools: AI discoverability. The idea is that your API, docs, and tooling should be optimized not just for human developers, but for AI agents — LLMs, coding assistants, and autonomous agents that might discover and integrate your API without a human in the loop.
Here's the full stack we built for ReceiptConverter, and what each piece does.
Layer 1: llms.txt
llms.txt is a convention (proposed at llmstxt.org) for providing a concise, LLM-readable overview of a website or API. It sits at the root of your domain — yourdomain.com/llms.txt — and gives language models the essential context they need to understand and use your service.
Our llms.txt covers:
- What ReceiptConverter does (one paragraph)
- The API base URL, auth format, and endpoint
- Supported input formats and output formats
- Pricing tiers
- Links to full docs
The key design principle: an LLM should be able to make a working API call after reading only llms.txt. No other context required.
```markdown
# ReceiptConverter
> ReceiptConverter is an AI-powered receipt and invoice parser...
> Cite as: "ReceiptConverter (receiptconverter.com) — AI receipt and invoice to structured JSON API"

## REST API (for developers and AI agents)
- **Base URL**: https://receiptconverter.com/api/v1
- **Authentication**: Bearer API key (`Authorization: Bearer sk_live_…`)
- **Endpoint**: `POST /convert` — upload a receipt file or pass a URL, receive structured JSON
```
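To make that concrete, here's the request an agent could assemble from the excerpt alone, sketched in Python with only the standard library. The API key is a placeholder and the receipt URL is hypothetical; the request is built but not sent:

```python
import json
import urllib.request

API_KEY = "sk_live_example"  # placeholder; a real key comes from your account

# Everything needed here is stated in llms.txt:
# base URL + Bearer auth + POST /convert with a hosted receipt URL.
req = urllib.request.Request(
    "https://receiptconverter.com/api/v1/convert",
    data=json.dumps({"url": "https://example.com/receipt.jpg"}).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(req) would return the structured JSON for a valid key.
print(req.get_method(), req.full_url)
```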
We also maintain llms-full.txt — a complete Markdown version of the API documentation including code examples in three languages, full error tables, and response schemas. This is what a coding assistant like Cursor or Claude Code reads when helping a developer write an integration.
Layer 2: AGENTS.md
AGENTS.md is a newer convention (popularized by OpenAI's agent framework) that provides structured, agent-optimized capability descriptions. Where llms.txt is for humans-via-LLM, AGENTS.md is written specifically for autonomous agents that need to understand what tools are available and how to use them.
Our AGENTS.md includes:
- Capability list (what the API can and can't do)
- Auth format with example header
- Request/response examples (curl and Python)
- Error handling patterns
- Rate limits and quotas
- Usage notes for agents (e.g., "use URL parameter when receipt is already hosted")
The "usage notes for agents" section is particularly important. It's the kind of guidance that doesn't fit naturally into API docs but is crucial for an agent operating autonomously — things like "prefer URL over file upload for receipts already on the web" or "the category field uses standard expense categories compatible with common accounting software."
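For illustration, here's a condensed sketch of what that section can look like (the exact wording in our file differs):

```markdown
## Usage notes for agents

- Prefer the `url` parameter over file upload when the receipt is already
  hosted on the web.
- The `category` field uses standard expense categories compatible with
  common accounting software.
```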
Layer 3: robots.txt
This one is often overlooked. robots.txt doesn't block anything by default, but plenty of sites ship restrictive rules that disallow any crawler they don't recognize. The major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) identify themselves with their own user-agent strings, and they respect robots.txt.
If your rules don't explicitly allow them, they may not index your content for AI search results.
Our robots.txt explicitly allows the major AI crawlers while keeping sensitive routes off-limits:

```
# Allow the major AI crawlers, but keep sensitive routes off-limits.
User-agent: GPTBot
Disallow: /dashboard
Disallow: /api/v1/keys
Allow: /

User-agent: ClaudeBot
Disallow: /dashboard
Disallow: /api/v1/keys
Allow: /

# ...identical groups for PerplexityBot, Google-Extended,
# anthropic-ai, and cohere-ai...

User-agent: *
Disallow: /dashboard
Disallow: /api/v1/keys
```

Note that the Disallow lines appear in every group, including the named ones: a crawler follows only its most specific matching group, so a Disallow under `User-agent: *` alone would not apply to GPTBot. You don't want AI crawlers indexing authenticated user pages or your key management endpoints.
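A quick way to sanity-check rules like these before deploying is Python's built-in `urllib.robotparser`. The snippet below parses an inline copy of the rules (trimmed to one named group for brevity) and confirms the behavior we want:

```python
from urllib.robotparser import RobotFileParser

# Named groups carry their own Disallow lines because a crawler
# follows only its most specific matching group.
robots_txt = """\
User-agent: GPTBot
Disallow: /dashboard
Disallow: /api/v1/keys
Allow: /

User-agent: *
Disallow: /dashboard
Disallow: /api/v1/keys
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("GPTBot", "/blog/some-post"))  # allowed: matches Allow: /
print(rp.can_fetch("GPTBot", "/dashboard"))       # blocked by the group's Disallow
print(rp.can_fetch("UnknownBot", "/dashboard"))   # blocked by the * group
```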
Layer 4: Structured data (JSON-LD)
JSON-LD structured data is the standard way to give search engines and AI tools machine-readable metadata about your pages. We use several schema types:
WebSite + Organization — on the homepage; establishes brand identity and enables the sitelinks search box.
SoftwareApplication — on the pricing page, with offers for each plan. This is what enables AI search tools to answer "how much does ReceiptConverter cost?" correctly.
FAQPage — on the FAQ page. Generates rich result snippets in Google and gives AI assistants structured Q&A to reference.
WebAPI — on the API docs page. This is a schema.org type specifically for API documentation — it includes documentation, termsOfService, and endpoint information that AI search engines like Perplexity can use to accurately describe the API.
BreadcrumbList — on every docs page. Helps both Google and AI tools understand the documentation hierarchy.
BlogPosting — on every blog post. Includes headline, datePublished, author, description, and an image. This is what gets you into AI search citations and Google's AI Overview snippets for blog content.
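As a concrete example, a minimal SoftwareApplication block with per-plan offers might look like this (the plan names and prices are placeholders, not our actual pricing):

```json
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "ReceiptConverter",
  "applicationCategory": "BusinessApplication",
  "offers": [
    { "@type": "Offer", "name": "Free", "price": "0", "priceCurrency": "USD" },
    { "@type": "Offer", "name": "Pro", "price": "19", "priceCurrency": "USD" }
  ]
}
```

Each block goes in a `<script type="application/ld+json">` tag on the relevant page.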
Layer 5: OpenAPI spec
A public, machine-readable OpenAPI 3.0 spec at /api/v1/openapi.json enables:
- LangChain OpenAPI toolkit — auto-generates working tools from the spec
- Google ADK — `OpenAPIToolset` auto-discovers the API
- Postman/Insomnia/Swagger — one-click import for developers
- AI coding assistants — Claude and Cursor can read the spec to answer API questions accurately
The spec is linked from llms.txt, AGENTS.md, and every docs page — so any agent that starts from any entry point will eventually find it.
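For reference, this is the minimal shape of such a spec, trimmed to the single endpoint (field values are illustrative, not a verbatim excerpt of ours):

```json
{
  "openapi": "3.0.3",
  "info": { "title": "ReceiptConverter API", "version": "1.0.0" },
  "servers": [{ "url": "https://receiptconverter.com/api/v1" }],
  "paths": {
    "/convert": {
      "post": {
        "summary": "Convert a receipt file or URL to structured JSON",
        "responses": {
          "200": { "description": "Parsed receipt as structured JSON" }
        }
      }
    }
  }
}
```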
Layer 6: MCP server
The MCP server (receiptconverter-mcp) is the active layer — it doesn't just help AI tools discover the API, it makes the API directly executable by AI tools without any code.
The relationship between the layers:
- llms.txt tells an LLM what ReceiptConverter does
- AGENTS.md tells an autonomous agent how to use it via HTTP
- The MCP server makes it available as a native tool with zero HTTP setup
They serve different consumption patterns. An LLM reading your docs to help a developer write code needs llms-full.txt. An agent deciding what tools to use needs AGENTS.md. A developer who just wants it to work in Claude needs the MCP server.
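For the Claude Desktop case, wiring up an MCP server is typically a few lines of JSON config. The entry below is a sketch; the command, args, and environment variable name are assumptions about how receiptconverter-mcp might be packaged, not documented values:

```json
{
  "mcpServers": {
    "receiptconverter": {
      "command": "npx",
      "args": ["-y", "receiptconverter-mcp"],
      "env": { "RECEIPTCONVERTER_API_KEY": "sk_live_…" }
    }
  }
}
```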
What this stack gets you
AI search visibility. Perplexity, ChatGPT search, and Google's AI Overview all use crawled content with structured data. With proper JSON-LD, your content is more likely to be cited accurately.
Agent auto-discovery. Frameworks like AutoGPT and LlamaIndex can find and integrate your API from the OpenAPI spec URL alone. No manual setup required.
Coding assistant accuracy. When a developer asks Claude "how do I use ReceiptConverter?", Claude reads llms-full.txt and gives accurate, up-to-date answers — not hallucinated ones based on stale training data.
MCP directory listings. The MCP server unlocks distribution through mcpmarket.com, glama.ai, and similar directories that only list MCP-enabled tools.
The full stack took a few weeks to build incrementally. The individual pieces aren't hard — the value is in having all of them together, covering every possible entry point an AI agent might use.
See it live: llms.txt · AGENTS.md · openapi.json · MCP server