Paper receipts are the last mile of expense data. Everything else in a modern finance stack is structured, searchable, and automatable. Receipts are photos of thermal paper.

If you're building anything that touches expenses (a reimbursement workflow, an accounting integration, a spend analytics dashboard), you eventually need to solve the receipt parsing problem. Doing it yourself means training models, handling edge cases, and dealing with a hundred different receipt formats in a hundred different languages. That's a lot of infrastructure for something that isn't your core product.

The alternative: use an API that handles the extraction and returns clean JSON.

What the JSON output contains

When Receipt Converter processes a receipt, the extracted data follows a consistent structure:

{
  "vendor": "Whole Foods Market",
  "vendor_address": "945 N Michigan Ave, Chicago, IL 60611",
  "vendor_phone": "(312) 587-0648",
  "date": "2026-02-19",
  "time": "10:42",
  "items": [
    {
      "name": "Organic Oat Milk",
      "quantity": 2,
      "unit_price": 2.99,
      "total_price": 5.98
    },
    {
      "name": "Sourdough Loaf",
      "quantity": 1,
      "unit_price": 4.49,
      "total_price": 4.49
    },
    {
      "name": "Avocado",
      "quantity": 3,
      "unit_price": 1.99,
      "total_price": 5.97
    },
    {
      "name": "Cold Brew Coffee",
      "quantity": 1,
      "unit_price": 12.00,
      "total_price": 12.00
    }
  ],
  "subtotal": 28.44,
  "tax": 1.42,
  "tax_rate": "5%",
  "tip": null,
  "total": 29.86,
  "payment_method": "Visa ****4821",
  "currency": "USD",
  "receipt_number": "WF-20260219-4821"
}

Every field is typed and consistently named across all receipts. When data isn't on the receipt, the field appears as null rather than being omitted, so your parsing code doesn't need to handle missing keys.
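Because every key is always present, downstream code can index fields directly and only needs to treat null values specially. As a minimal sketch (the helper names `money` and `check_totals` are illustrative, not part of Receipt Converter), a sanity check on the extracted amounts might look like this:

```python
from decimal import Decimal


def money(value):
    """Convert a JSON number to Decimal; a null (None) field counts as zero."""
    return Decimal(str(value)) if value is not None else Decimal("0")


def check_totals(receipt: dict, tolerance=Decimal("0.01")) -> list[str]:
    """Return a list of inconsistencies found in the extracted amounts."""
    problems = []

    # Line items should add up to the subtotal
    items_sum = sum((money(i["total_price"]) for i in receipt["items"]), Decimal("0"))
    if abs(items_sum - money(receipt["subtotal"])) > tolerance:
        problems.append(f"line items sum to {items_sum}, subtotal is {receipt['subtotal']}")

    # subtotal + tax + tip should equal the total; tip may be null
    expected = money(receipt["subtotal"]) + money(receipt["tax"]) + money(receipt["tip"])
    if abs(expected - money(receipt["total"])) > tolerance:
        problems.append(f"subtotal + tax + tip is {expected}, total is {receipt['total']}")

    return problems
```

An empty list means the numbers agree. A non-empty list is a reasonable signal to route the receipt to manual review rather than reject it outright, since some receipts include discounts or fees that this sketch does not model.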

What this looks like in practice

Receipt as printed:

WHOLE FOODS MARKET
Order #WF-20260219-4821
Organic Oat Milk   2x    5.98
Sourdough Loaf           4.49
Avocado            3x    5.97
Cold Brew Coffee        12.00
Subtotal:               28.44
Tax (5% GST):            1.42
Total:                  29.86
Visa ****4821           29.86
Feb 19, 2026 · 10:42 AM

Extracted rows:

Date   | Item             | Qty | Amount | Category
Feb 19 | Organic Oat Milk | 2   | $5.98  | Groceries
Feb 19 | Sourdough Loaf   | 1   | $4.49  | Groceries
Feb 19 | Avocado          | 3   | $5.97  | Groceries
Feb 19 | Cold Brew Coffee | 1   | $12.00 | Groceries

4 items · Feb 19, 2026 · Total: $29.86

How to get JSON output from the UI

If you're evaluating the extraction quality before building an integration, you can test it manually:

Go to Receipt Converter and upload a receipt photo
After extraction, click Export and choose JSON
Download and inspect the output

The JSON from the UI is identical to what you'd get from an API call. Use this to verify the extraction quality on your specific receipt types before building anything.

Building your own extraction pipeline

If you need to process receipts programmatically, there are three options:

Option 1: Use the Receipt Converter API (coming soon)

A direct REST API is on the roadmap. Submit a multipart form with the receipt image, get back structured JSON. No model training, no infrastructure.

Option 2: Build on top of GPT-4 Vision

Receipt Converter is built on GPT-4 Vision with a carefully engineered system prompt that handles edge cases: multi-tax receipts, foreign currencies, handwritten amounts, poor image quality. You can replicate this approach:

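import base64
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def extract_receipt(image_path: str) -> dict:
    # Encode the image as base64 so it can be sent inline as a data URL
    with open(image_path, "rb") as f:
        image_data = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        # MIME type is hardcoded; adjust it for PNG and other formats
                        "url": f"data:image/jpeg;base64,{image_data}",
                        "detail": "auto"
                    }
                },
                {
                    "type": "text",
                    "text": "Extract all receipt data as JSON with fields: vendor, date, items (name, quantity, unit_price, total_price), subtotal, tax, tax_rate, tip, total, currency, payment_method."
                }
            ]
        }],
        # JSON mode: ask the model to reply with a single JSON object
        response_format={"type": "json_object"}
    )

    # The reply content is a JSON string; parse it into a dict
    return json.loads(response.choices[0].message.content)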

The key gotchas when building this yourself:

Force JSON output mode with response_format
Handle not_a_receipt cases (someone uploads a non-receipt image)
Resize images before sending: large images slow down processing and cost more tokens
Add retry logic for API timeouts
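Two of these gotchas can be sketched with the standard library alone. The `not_a_receipt` flag below is an assumption: it only exists if your prompt tells the model to return `{"not_a_receipt": true}` for non-receipt images. The helper names `ensure_receipt` and `with_retries` are likewise illustrative.

```python
import random
import time


class NotAReceiptError(ValueError):
    """Raised when the model reports that the image is not a receipt."""


def ensure_receipt(data: dict) -> dict:
    # Assumption: the prompt asks the model to return {"not_a_receipt": true}
    # for non-receipt images. That flag is a convention you define yourself.
    if data.get("not_a_receipt"):
        raise NotAReceiptError("image does not look like a receipt")
    return data


def with_retries(call, max_attempts=4, base_delay=1.0,
                 retry_on=(TimeoutError,), sleep=time.sleep):
    """Run call() and retry on the given exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            sleep(delay + random.uniform(0, delay * 0.25))  # add jitter
```

In practice you would wrap the extraction call, for example `with_retries(lambda: ensure_receipt(extract_receipt("receipt.jpg")))`, and pass whichever exception class your API client raises on timeout via `retry_on`. A `NotAReceiptError` is deliberately not retried, since sending the same image again will not change the answer.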

Option 3: Use an OCR service + post-processing

AWS Textract, Google Document AI, and Azure Form Recognizer can extract text from receipts. The challenge is that raw OCR output is unstructured, so you still need to parse it into the fields you need. Accuracy on complex receipts (multiple tax rates, discounts, handwritten fields) is lower than vision model approaches.

Common receipt parsing edge cases

If you're building a parser, these are the cases that will catch you:

Multi-tax receipts. Some jurisdictions charge different tax rates on different items (e.g., prepared food vs. packaged goods). A single receipt might have multiple tax lines. Your schema needs to handle arrays of tax entries, not just a single tax field.
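One way to model this, as a sketch only: keep a list of tax lines and derive the tax total from it. The `taxes` array and its `label`, `rate`, and `amount` keys are a hypothetical extension of the schema shown earlier, not something Receipt Converter is documented to return.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass
class TaxLine:
    label: str                  # e.g. "Prepared food"
    amount: Decimal
    rate: Optional[str] = None  # kept as printed, e.g. "5%"


@dataclass
class ReceiptTotals:
    subtotal: Decimal
    total: Decimal
    taxes: list[TaxLine] = field(default_factory=list)

    @property
    def tax_total(self) -> Decimal:
        return sum((t.amount for t in self.taxes), Decimal("0"))


def taxes_from_json(data: dict) -> list[TaxLine]:
    """Accept a hypothetical `taxes` array or the single tax/tax_rate fields."""
    if data.get("taxes"):
        return [
            TaxLine(t["label"], Decimal(str(t["amount"])), t.get("rate"))
            for t in data["taxes"]
        ]
    if data.get("tax") is not None:
        return [TaxLine("tax", Decimal(str(data["tax"])), data.get("tax_rate"))]
    return []
```

Treating the single-tax case as a list of one keeps the rest of the code on a single path, whichever shape the extraction returns.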

Tip lines. Restaurant receipts often have a pre-filled tip or a tip line. The pre-tip total and post-tip total both appear. Make sure you're capturing the right total for expense purposes.

Quantity formats. "2x" versus "2" versus "qty: 2" versus the item listed twice. Vision models handle these better than rule-based parsers.
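To show why rule-based parsing gets fiddly, here is a minimal sketch that covers only quantities written before the item name, plus items listed more than once. The pattern and helper names are illustrative, and real receipts will have formats this does not catch.

```python
import re
from collections import Counter

# Leading quantity forms: "2x Oat Milk", "2 x Oat Milk", "2 Oat Milk", "qty: 2 Oat Milk"
_LEADING_QTY = re.compile(
    r"^\s*(?:qty[:\s]\s*)?(?P<qty>\d+)\s*x?\s+(?P<name>\D.*)$",
    re.IGNORECASE,
)


def parse_item_line(line: str) -> tuple[str, int]:
    """Return (item name, quantity); the quantity defaults to 1."""
    match = _LEADING_QTY.match(line)
    if match:
        return match.group("name").strip(), int(match.group("qty"))
    return line.strip(), 1


def merge_items(lines: list[str]) -> dict[str, int]:
    """Sum quantities per item so an item listed twice counts as quantity 2."""
    counts = Counter()
    for line in lines:
        name, qty = parse_item_line(line)
        counts[name] += qty
    return dict(counts)
```

Even this small pattern needs care: a product name that starts with a digit, such as "7UP Soda", must not be read as a quantity of 7. Each new receipt layout tends to add another special case.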

Handwritten amounts. Especially on restaurant receipts where a server writes in the tip. Legibility varies, but vision models generally handle clear handwriting.

Foreign currency and dates. € vs EUR, 14/03/2026 vs March 14, 2026 vs 2026-03-14. Normalize these in your post-processing.
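A normalization pass along these lines can be sketched with the standard library. Two assumptions are baked in and should be adjusted for your market: slash-separated dates are read as day-first, and a bare "$" is mapped to USD even though other currencies use the same symbol.

```python
from datetime import datetime

# Formats are tried in order; "%d/%m/%Y" assumes day-first dates
_DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y", "%b %d, %Y")

# Assumption: "$" means USD; in practice it is ambiguous (CAD, AUD, ...)
_CURRENCY_SYMBOLS = {"€": "EUR", "$": "USD", "£": "GBP"}


def normalize_date(raw: str) -> str:
    """Return the date as ISO 8601 (YYYY-MM-DD) or raise ValueError."""
    for fmt in _DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_currency(raw: str) -> str:
    """Map a symbol or a lowercase code to an uppercase currency code."""
    raw = raw.strip()
    return _CURRENCY_SYMBOLS.get(raw, raw.upper())
```

Raising on an unrecognized date, rather than guessing, keeps bad values out of the database. The vendor address or currency on the receipt can help decide between day-first and month-first when a date is ambiguous.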

Test on real receipts from your target market

If your users are primarily in one country or industry, test on receipts from that context. Restaurant receipts look different from retail receipts, which in turn look different from hotel folios. The more specific your test set, the more confident you can be in your extraction quality.

Storing and querying the structured data

Once you have JSON, you can store it in any database and query it like any other structured data:

-- Find all receipts over $100 from the last 30 days
SELECT vendor, date, total, currency
FROM receipts
WHERE total > 100
  AND date >= CURRENT_DATE - INTERVAL '30 days'
ORDER BY total DESC;

-- Sum expenses by vendor for a date range
SELECT vendor, SUM(total) as total_spend, COUNT(*) as receipt_count
FROM receipts
WHERE date BETWEEN '2026-01-01' AND '2026-03-31'
GROUP BY vendor
ORDER BY total_spend DESC;

The consistent JSON structure makes this straightforward. Vendor names from the same merchant might have minor variations ("Whole Foods" vs "WHOLE FOODS MKT"), so you'll want a normalization pass if exact grouping matters.
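As a rough sketch of that normalization pass, the example below loads parsed receipts into SQLite with a derived `vendor_key` column and groups on it. The table layout, the `vendor_key` helper, and its small stop-word list are all assumptions for illustration; a production system would more likely use a curated merchant mapping.

```python
import re
import sqlite3


def vendor_key(name: str) -> str:
    """Crude normalization: lowercase, strip punctuation, drop filler words."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    stop = {"mkt", "market", "inc", "llc", "store"}
    return " ".join(w for w in words if w not in stop)


def load_receipts(conn: sqlite3.Connection, receipts: list[dict]) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS receipts (
               vendor TEXT, vendor_key TEXT, date TEXT, total REAL, currency TEXT)"""
    )
    conn.executemany(
        "INSERT INTO receipts VALUES (?, ?, ?, ?, ?)",
        [(r["vendor"], vendor_key(r["vendor"]), r["date"], r["total"], r["currency"])
         for r in receipts],
    )


def spend_by_vendor(conn: sqlite3.Connection, start: str, end: str) -> list[tuple]:
    # ISO dates stored as TEXT compare correctly as strings
    return conn.execute(
        """SELECT vendor_key, SUM(total) AS total_spend, COUNT(*) AS receipt_count
           FROM receipts
           WHERE date BETWEEN ? AND ?
           GROUP BY vendor_key
           ORDER BY total_spend DESC""",
        (start, end),
    ).fetchall()
```

Keeping the original `vendor` string alongside the key means the normalization rules can change later without losing what was printed on the receipt.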

If you're not building a system but just need JSON output for occasional use, the manual export from the UI works well. The batch processing workflow covers how to handle larger volumes without writing any code.

Test the JSON extraction on your receipts. Try Receipt Converter free →