
How AI Agents Discover and Use APIs: The 2026 Landscape

A practical overview of how AI agents find, understand, and integrate external APIs in 2026 — llms.txt, OpenAPI, MCP, AGENTS.md, tool registries, and what's coming next.

March 4, 2026 · 6 min read

A year ago, if you asked how AI agents discover external APIs, the honest answer was: they don't, really. You told the agent what to use.

That's changing fast. In 2026, there's a real (if early) infrastructure for API discovery, comprehension, and invocation by autonomous agents. Here's how it works.


The discovery stack

Think of API discovery as a stack with multiple layers. Each layer serves a different kind of agent and a different level of autonomy.

Layer 1: Training data (passive)

The oldest and most basic layer. LLMs like GPT-4o and Claude have been trained on documentation, GitHub READMEs, blog posts, and Stack Overflow. They "know" about popular APIs from training data.

Limitation: Training data goes stale. An agent relying on it might call endpoints that no longer exist, send the wrong auth format, or miss features shipped after the training cutoff.

What you can do: Publish clear, comprehensive docs that get indexed. Keep your API stable enough that training data remains useful.

Layer 2: llms.txt (active crawling)

llms.txt is a file at your domain root that provides a concise, LLM-readable overview of your service. It's analogous to robots.txt but for AI tools — a standard place for an agent to look when it encounters your domain.

# YourAPI
> One-line description of what your API does

## Authentication
Bearer token: Authorization: Bearer sk_live_...

## Endpoint
POST https://api.yourdomain.com/v1/endpoint

When a user asks their AI assistant "how do I use YourAPI?", the assistant can fetch llms.txt for current, accurate information rather than relying on training data.
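The format is simple enough that an agent-side parser fits in a few lines. A minimal sketch — the title/summary/section rules here follow the example above, not any official grammar:

```python
def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt file into its top-level pieces.

    Returns {"_title": ..., "_summary": ..., "<Section>": body}.
    A toy parser for illustration, not an official format library.
    """
    sections: dict[str, str] = {}
    current = None
    body: list[str] = []
    for line in text.splitlines():
        if line.startswith("# ") and "_title" not in sections:
            sections["_title"] = line[2:].strip()      # first H1 = name
        elif line.startswith("> ") and "_summary" not in sections:
            sections["_summary"] = line[2:].strip()    # blockquote = summary
        elif line.startswith("## "):
            if current:
                sections[current] = "\n".join(body).strip()
            current = line[3:].strip()                 # new H2 section
            body = []
        elif current is not None:
            body.append(line)
    if current:
        sections[current] = "\n".join(body).strip()
    return sections
```

An assistant that fetches llms.txt can then pull just the "Authentication" or "Endpoint" section into its context instead of the whole page.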

Layer 3: OpenAPI spec (machine-readable contract)

A public OpenAPI 3.0 spec at a stable URL (/api/v1/openapi.json) is the current standard for machine-readable API description. Multiple agent frameworks consume it directly:

  • LangChain — OpenAPIToolkit auto-generates tools from the spec
  • Google ADK — OpenAPIToolset imports the spec and exposes each endpoint as a callable tool
  • AutoGPT — loads OpenAPI specs to discover available actions
  • Swagger/Postman — one-click import for human developers using AI-assisted tools

The OpenAPI spec is the layer where you go from "an agent knows your API exists" to "an agent can call your API correctly."
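The mapping those toolkits perform is essentially "flatten each path + method into a named tool." A toy version of that step (real toolkits also resolve $refs, request schemas, and auth):

```python
def tools_from_openapi(spec: dict) -> list[dict]:
    """Flatten an OpenAPI 3.x spec dict into agent-callable tool definitions.

    Simplified sketch: one tool per (path, HTTP method) operation,
    named by operationId when present.
    """
    http_methods = {"get", "post", "put", "patch", "delete"}
    tools = []
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            if method.lower() not in http_methods:
                continue  # skip non-operation keys like "parameters"
            name = op.get("operationId", f"{method}_{path}").replace("/", "_")
            tools.append({
                "name": name,
                "description": op.get("summary", ""),
                "method": method.upper(),
                "path": path,
                "parameters": [p["name"] for p in op.get("parameters", [])],
            })
    return tools
```

Each resulting dict is what an agent framework presents to the model as a callable tool, alongside instructions for filling in the parameters.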

Layer 4: AGENTS.md (agent-optimized docs)

AGENTS.md is a newer convention — a Markdown file at your domain root (/AGENTS.md) written specifically for autonomous agents. Where OpenAPI describes the technical contract, AGENTS.md describes capabilities, usage patterns, and agent-specific guidance.

A good AGENTS.md answers: "If I'm an agent that just discovered this API, what do I need to know to use it correctly?" That includes things like:

  • What this API is for (in agent-friendly terms)
  • Auth format with a literal example
  • The most important endpoint with a complete request/response example
  • Common error codes and what they mean
  • Usage notes: edge cases, limitations, recommended patterns
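Here is a skeleton of what such a file might look like. Every concrete value below (endpoint, limits, error semantics) is a placeholder to be replaced with your API's real details:

```markdown
# YourAPI — Agent Guide

## What this is
One-paragraph description, phrased as a capability ("converts X into Y").

## Auth
Authorization: Bearer sk_live_...   (literal header, required on every request)

## Primary endpoint
POST https://api.yourdomain.com/v1/endpoint
Request:  {"url": "https://example.com/input.pdf"}
Response: {"status": "ok", "data": {...}}

## Errors
- 401 — missing or invalid bearer token
- 429 — rate limited; honor the Retry-After header

## Usage notes
- Maximum input size (state it explicitly, e.g. 10 MB)
- Whether requests are idempotent and safe to retry on timeout
```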

Layer 5: MCP server (direct invocation)

The MCP (Model Context Protocol) layer is the most direct. An MCP server is a small process, typically run locally by the AI client, that exposes your API as a native tool agents can call without writing any HTTP code.

From the agent's perspective:

  • With MCP: Tool convert_receipt is in the tool list. Call it with {url: "..."}. Get back JSON.
  • Without MCP: Research the API, construct the HTTP request, handle auth, parse the response, handle errors.

MCP removes every step between "agent knows the API exists" and "agent calls the API."
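Under the hood, MCP frames tool use as JSON-RPC 2.0 messages. A simplified request/response pair for the convert_receipt example above — actual sessions also include initialization and capability negotiation, so treat this as an illustration of the shape, not a complete transcript:

```json
{"jsonrpc": "2.0", "id": 1, "method": "tools/call",
 "params": {"name": "convert_receipt",
            "arguments": {"url": "https://example.com/receipt.jpg"}}}

{"jsonrpc": "2.0", "id": 1,
 "result": {"content": [{"type": "text",
                         "text": "{\"merchant\": \"...\", \"total\": \"...\"}"}]}}
```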

Layer 6: Tool directories (discovery index)

MCP directories (mcpmarket.com, glama.ai/mcp/servers) and broader AI tool indexes are the emerging discovery layer — analogous to npm for packages or Product Hunt for apps.

An agent (or a user setting up an agent) can browse these directories to find tools for specific capabilities. They're still early, but they're growing fast and are already consulted by developers building agent setups.


How a sophisticated agent uses this stack

Here's how a well-designed agent might discover and integrate a new API in 2026:

  1. Check training data — does the agent already know this API?
  2. Fetch llms.txt — is there a current summary at domain.com/llms.txt?
  3. Fetch openapi.json — load the machine-readable spec
  4. Check AGENTS.md — any agent-specific guidance?
  5. Check MCP directories — is there an MCP server available?

If an MCP server exists, the agent recommends it to the user (or installs it automatically in a sufficiently autonomous setup). If not, it uses the OpenAPI spec or llms.txt to construct API calls directly.


Where things are heading

Automatic MCP server generation. Tools are emerging that can take an OpenAPI spec and generate an MCP server automatically. If this matures, the gap between "has an API" and "has an MCP server" will shrink to zero.

Agent-to-agent API discovery. Agents will increasingly discover tools not by consulting a directory, but by asking other agents. "What receipt OCR tools do you know about?" → agent consults its knowledge + real-time search → recommends ReceiptConverter.

Verified tool directories. The current MCP directories are open submission. Expect more curation, reviews, and verification — similar to how npm's ecosystem evolved.

Pricing-aware agents. Agents will increasingly factor in cost when selecting tools. Exposing pricing information in llms.txt and structured data means agents can make cost-optimal decisions autonomously.


For API builders: the practical checklist

If you want your API to be discoverable and usable by AI agents today:

  • llms.txt at your domain root — concise, current, LLM-readable
  • llms-full.txt — complete documentation in Markdown
  • AGENTS.md — agent-specific capabilities and usage patterns
  • robots.txt — explicitly allow AI crawlers (GPTBot, ClaudeBot, etc.)
  • OpenAPI spec — public, machine-readable, at a stable URL
  • JSON-LD structured data — WebSite, SoftwareApplication, FAQPage, BlogPosting
  • MCP server — published to npm, listed in directories

ReceiptConverter has all of these. They're documented at receiptconverter.com/docs.


The AI discoverability stack isn't finished — it's being built in real time. But the foundations are solid enough to build on today, and the early movers will have a meaningful advantage as autonomous agents become the default interface to software.


ReceiptConverter's AI discoverability stack: llms.txt · AGENTS.md · openapi.json · MCP server

Try it on your own receipts

Free to start. No account, no credit card.

Try free →