
Building an AI Expense Agent That Scans Receipts Automatically

February 15, 2026 · 4 min read

Expense reporting no longer has to be manual. The tools to automate it end to end, from inbox to ledger, have been available for a while. What's changed in 2026 is that large language models make it much easier to build agents that interpret receipts instead of just pattern-matching them.

In this tutorial, we'll build an autonomous expense agent in Python that:

  1. Watches a Gmail inbox for receipt-like emails
  2. Downloads any image/PDF attachments
  3. Parses them using the ReceiptConverter API
  4. Optionally refines the expense category using an LLM
  5. Logs structured data to a Google Sheet

You can run this locally or deploy it as a cron job.


Prerequisites

  • A ReceiptConverter API key (sk_live_...)
  • A Google Cloud project with Gmail API + Sheets API enabled
  • Python 3.11+
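  • pip install google-auth google-auth-oauthlib google-api-python-client requests
  • Optional, for the LLM step: pip install anthropic, plus an ANTHROPIC_API_KEY environment variable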
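
The final step loads credentials from a token.json file, which has to be created once through Google's OAuth consent flow. A minimal sketch of that one-time setup, assuming you have downloaded an OAuth client file named credentials.json from the Cloud console. The file names, the create_token helper, and the exact scope choice are assumptions, not part of the original agent:

```python
# One-time setup sketch: writes token.json for the agent to load later.
# Assumption: gmail.modify is enough to read messages and clear the UNREAD
# label, and the spreadsheets scope is enough to append rows.
SCOPES = [
    "https://www.googleapis.com/auth/gmail.modify",
    "https://www.googleapis.com/auth/spreadsheets",
]

def create_token(client_secrets: str = "credentials.json",
                 token_path: str = "token.json") -> None:
    """Run the browser-based consent flow and save the resulting credentials."""
    # Imported here so the module can be loaded without the OAuth extras installed.
    from google_auth_oauthlib.flow import InstalledAppFlow

    flow = InstalledAppFlow.from_client_secrets_file(client_secrets, SCOPES)
    creds = flow.run_local_server(port=0)  # opens a browser window for consent
    with open(token_path, "w") as f:
        f.write(creds.to_json())
```

Run create_token() once on a machine with a browser, then copy token.json next to the agent script.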

Architecture

Gmail inbox
    ↓
Gmail API (watch for unread with attachments)
    ↓
ReceiptConverter API (parse → JSON)
    ↓
GPT-4o / Claude (optional re-categorization)
    ↓
Google Sheets API (append row)

Step 1 — Watch Gmail for receipt emails

from googleapiclient.discovery import build
from google.oauth2.credentials import Credentials

def get_receipt_emails(service):
    """Fetch unread emails with PDF or image attachments."""
    results = service.users().messages().list(
        userId="me",
        q="is:unread has:attachment (filename:pdf OR filename:jpg OR filename:png)",
        maxResults=20,
    ).execute()
    return results.get("messages", [])
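
The query above returns at most 20 messages per call. If a backlog builds up, the Gmail list response includes a nextPageToken that can be passed back as pageToken to fetch the next page. A minimal sketch of that loop, assuming the same service object. The iter_receipt_emails name and the max_pages cap are introduced here for illustration:

```python
from typing import Any, Iterator

RECEIPT_QUERY = (
    "is:unread has:attachment (filename:pdf OR filename:jpg OR filename:png)"
)

def iter_receipt_emails(service: Any, page_size: int = 20,
                        max_pages: int = 10) -> Iterator[dict]:
    """Yield message stubs page by page, following nextPageToken."""
    page_token = None
    for _ in range(max_pages):  # hard cap so one run cannot loop forever
        kwargs = {"userId": "me", "q": RECEIPT_QUERY, "maxResults": page_size}
        if page_token:
            kwargs["pageToken"] = page_token
        response = service.users().messages().list(**kwargs).execute()
        yield from response.get("messages", [])
        page_token = response.get("nextPageToken")
        if not page_token:
            break  # no further pages
```

Because it is a generator, the agent can stop early (for example after a fixed number of receipts) without fetching pages it will not use.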

Step 2 — Download the attachment

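import base64, os

def download_attachment(service, msg_id, attachment_id, filename):
    """Download one attachment to /tmp and return the local path."""
    att = service.users().messages().attachments().get(
        userId="me", messageId=msg_id, id=attachment_id
    ).execute()
    # Gmail returns attachment bytes as URL-safe base64.
    data = base64.urlsafe_b64decode(att["data"])
    # Keep only the base name so a crafted filename cannot escape /tmp.
    path = os.path.join("/tmp", os.path.basename(filename))
    with open(path, "wb") as f:
        f.write(data)
    return path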
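
Attachment filenames are chosen by the sender, so two receipts can share a name and unusual characters can show up. A minimal sketch of a helper that builds a sanitized, collision-resistant local path. The safe_attachment_path name and the message-ID prefix are assumptions made for illustration:

```python
import os
import re
import tempfile
from typing import Optional

def safe_attachment_path(filename: str, msg_id: str,
                         directory: Optional[str] = None) -> str:
    """Return a path inside `directory` that is safe to write to."""
    # Drop directory components, treating backslashes as separators too.
    base = os.path.basename(filename.replace("\\", "/"))
    # Replace anything outside a conservative character set.
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".") or "attachment"
    directory = directory or tempfile.gettempdir()
    # Prefix with the message ID so identical filenames do not overwrite each other.
    return os.path.join(directory, f"{msg_id}_{base}")
```

Inside download_attachment, the path would then come from safe_attachment_path(filename, msg_id) rather than from the raw filename.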

Step 3 — Parse with ReceiptConverter

import requests, os

API_KEY = os.environ["RECEIPTCONVERTER_API_KEY"]

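def parse_receipt(file_path: str) -> dict | None:
    """Upload a receipt file and return the parsed fields, or None on failure."""
    try:
        with open(file_path, "rb") as f:
            res = requests.post(
                "https://receiptconverter.com/api/v1/convert",
                headers={"Authorization": f"Bearer {API_KEY}"},
                files={"file": f},
                timeout=30,
            )
    except requests.RequestException:
        # Timeouts and connection errors are treated as a failed parse
        # so one bad request does not crash the whole run.
        return None
    if not res.ok:
        return None
    return res.json().get("data")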
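
A parse call can fail for transient reasons such as a slow network. A minimal sketch of a generic retry wrapper with exponential backoff, assuming the wrapped function signals failure by returning None, as parse_receipt does. The with_retries name and its defaults are hypothetical, and which ReceiptConverter errors are actually worth retrying is not something this sketch knows:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def with_retries(call: Callable[[], Optional[T]], attempts: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call `call` until it returns a non-None value or attempts run out."""
    for attempt in range(attempts):
        result = call()
        if result is not None:
            return result
        if attempt < attempts - 1:
            # Wait 1x, 2x, 4x ... the base delay between attempts.
            sleep(base_delay * 2 ** attempt)
    return None
```

Usage would look like data = with_retries(lambda: parse_receipt(path)). Because None covers every failure, this also retries permanent errors such as a rejected file, so keep attempts small.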

Step 4 — Log to Google Sheets

def append_to_sheet(sheets_service, spreadsheet_id, data: dict):
    row = [
        data.get("date", ""),
        data.get("vendor", ""),
        data.get("total", ""),
        data.get("currency", "USD"),
        data.get("category", ""),
        data.get("payment_method", ""),
    ]
    sheets_service.spreadsheets().values().append(
        spreadsheetId=spreadsheet_id,
        range="Expenses!A:F",
        valueInputOption="USER_ENTERED",
        body={"values": [row]},
    ).execute()
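
The same receipt can arrive twice, for example as an original and a forwarded copy, and each copy would be appended as its own row. One way to detect that is a short fingerprint built from the fields that identify a purchase. A minimal sketch, assuming the date, vendor, total, and currency keys used above. The receipt_fingerprint helper is hypothetical:

```python
import hashlib

def receipt_fingerprint(data: dict) -> str:
    """Return a short, stable hash of the fields that identify a receipt."""
    keys = ("date", "vendor", "total", "currency")
    # Normalize case and whitespace so trivial differences do not matter.
    parts = [str(data.get(k, "")).strip().lower() for k in keys]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]
```

One option is to write the fingerprint to an extra column and skip any receipt whose fingerprint is already in the sheet. That would mean widening the append range beyond A:F.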

Step 5 — Put it together

def run_agent():
    creds = Credentials.from_authorized_user_file("token.json")
    gmail  = build("gmail",  "v1", credentials=creds)
    sheets = build("sheets", "v4", credentials=creds)

    SHEET_ID = os.environ["GOOGLE_SHEET_ID"]
    messages = get_receipt_emails(gmail)

    for msg_meta in messages:
        msg = gmail.users().messages().get(userId="me", id=msg_meta["id"]).execute()
        parts = msg.get("payload", {}).get("parts", [])

        for part in parts:
            if part.get("filename") and part.get("body", {}).get("attachmentId"):
                path = download_attachment(
                    gmail, msg_meta["id"],
                    part["body"]["attachmentId"],
                    part["filename"]
                )
                data = parse_receipt(path)
                if data:
                    append_to_sheet(sheets, SHEET_ID, data)
                    # Print the parsed currency rather than assuming dollars.
                    print(f"✓ {data.get('vendor')}: {data.get('total')} {data.get('currency', 'USD')}")

        # Mark as read
        gmail.users().messages().modify(
            userId="me", id=msg_meta["id"],
            body={"removeLabelIds": ["UNREAD"]}
        ).execute()

if __name__ == "__main__":
    run_agent()
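
The loop in run_agent only inspects the top-level parts of a message. Gmail payloads are trees: a multipart message can nest further parts inside each part, so an attachment may sit one or two levels down. A minimal sketch of a walker that finds attachments at any depth. The iter_attachment_parts name is introduced here for illustration:

```python
from typing import Iterator

def iter_attachment_parts(payload: dict) -> Iterator[dict]:
    """Yield every part with a filename and attachmentId, at any nesting depth."""
    stack = [payload]
    while stack:
        part = stack.pop()
        if part.get("filename") and part.get("body", {}).get("attachmentId"):
            yield part
        # Push children in reverse so they are visited in document order.
        stack.extend(reversed(part.get("parts", [])))
```

In run_agent, the inner loop could then read for part in iter_attachment_parts(msg.get("payload", {})): and drop the filename check, since the walker already applies it.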

Optional — Add LLM re-categorization

The ReceiptConverter API already returns a category field, but you can refine it with a quick LLM call before the row is logged:

import anthropic

claude = anthropic.Anthropic()

def refine_category(vendor: str, items: list) -> str:
    item_names = ", ".join(i["name"] for i in items[:5])
    msg = claude.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=50,
        messages=[{
            "role": "user",
            "content": f"Categorize this expense for accounting: vendor={vendor}, items={item_names}. Reply with ONE category only."
        }]
    )
    return msg.content[0].text.strip()
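
An LLM does not always reply with a bare label. It may add punctuation, a prefix, or a category your books do not use. A minimal sketch that maps the raw reply onto a fixed list, assuming you maintain your own chart of categories. ALLOWED_CATEGORIES and normalize_category are hypothetical names, and the categories shown are placeholders:

```python
# Placeholder list: replace with the categories your ledger actually uses.
ALLOWED_CATEGORIES = ("Meals", "Travel", "Software", "Office Supplies", "Other")

def normalize_category(raw: str, allowed=ALLOWED_CATEGORIES,
                       fallback: str = "Other") -> str:
    """Map a free-text model reply onto one of the allowed categories."""
    cleaned = raw.strip().strip(".\"'").lower()
    # Prefer an exact match on the cleaned reply.
    for category in allowed:
        if cleaned == category.lower():
            return category
    # Otherwise accept a reply that merely contains a known category.
    for category in allowed:
        if category.lower() in cleaned:
            return category
    return fallback
```

Wrapping the call as normalize_category(refine_category(vendor, items)) keeps unexpected labels out of the sheet.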

Run as a cron job

# Run every 30 minutes
*/30 * * * * cd /path/to/agent && python agent.py >> /var/log/expense-agent.log 2>&1
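
If a run ever takes longer than 30 minutes, cron will start a second copy while the first is still working, and both may process the same unread messages. A minimal sketch of a lock-file guard, assuming a POSIX system where the standard-library fcntl module is available. The single_instance helper and the lock path are hypothetical:

```python
import fcntl
from contextlib import contextmanager

@contextmanager
def single_instance(lock_path: str):
    """Yield True if this process holds the lock, False if another run has it."""
    handle = open(lock_path, "w")
    try:
        try:
            # Non-blocking exclusive lock: fails at once if already held.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        yield True
    finally:
        handle.close()  # closing the file releases the lock
```

The entry point would then become with single_instance("/tmp/expense-agent.lock") as acquired: and call run_agent() only when acquired is true.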

Or deploy it with a scheduled trigger on a platform such as Vercel, Fly.io, or AWS Lambda.


The full agent is ~100 lines of Python and requires no infrastructure beyond a free Google Cloud project. The ReceiptConverter API handles the hard part — understanding messy receipt images — while the rest is just data plumbing.

Get your API key at receiptconverter.com/dashboard and check the Python guide for more code examples.
