OCR bank statement converter — scanned PDFs to clean data

If your bank statement is a scan, a photograph, or a 'fake PDF' (image wrapped in a PDF), normal text-extraction tools return nothing. Drop it here — we read the pixels, extract every transaction, and verify the totals before you download.

No signup · EU-hosted · uploads auto-delete

  • Auto-reconciled. Opening + transactions = closing. Flags any extraction error before you import.
  • Locale-aware. USD, EUR, GBP, INR, AUD, CAD. Comma decimals, DD/MM/YYYY, Money-In/Out columns: handled natively.
  • Auto-delete uploads. Source PDFs purged within an hour. We keep only the data you can already see and export.
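
To make "locale-aware" concrete, here is a minimal Python sketch of the kind of normalisation involved: comma decimals, DD/MM/YYYY dates, and separate Money-In/Out columns collapsed into one signed amount. The function names and parameters are hypothetical and chosen for illustration; this is not the converter's actual code.

```python
from datetime import date, datetime
from decimal import Decimal


def parse_amount(text: str, decimal_sep: str = ".") -> Decimal:
    """Parse a printed amount such as '1,234.56' or '1.234,56' into a Decimal."""
    cleaned = text.strip().replace(" ", "")
    # Treat a leading/trailing minus or accounting-style parentheses as negative.
    negative = cleaned.startswith(("-", "(")) or cleaned.endswith("-")
    cleaned = cleaned.strip("-()")
    if decimal_sep == ",":
        # Comma-decimal locale: dots group thousands, the comma marks decimals.
        cleaned = cleaned.replace(".", "").replace(",", ".")
    else:
        # Dot-decimal locale: commas only group thousands.
        cleaned = cleaned.replace(",", "")
    value = Decimal(cleaned)
    return -value if negative else value


def parse_date(text: str, day_first: bool = True) -> date:
    """Parse DD/MM/YYYY (default) or MM/DD/YYYY into a date."""
    fmt = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return datetime.strptime(text.strip(), fmt).date()


def signed_amount(money_in: str, money_out: str, decimal_sep: str = ".") -> Decimal:
    """Collapse separate Money-In / Money-Out cells into one signed amount."""
    if money_in.strip():
        return parse_amount(money_in, decimal_sep)
    if money_out.strip():
        return -parse_amount(money_out, decimal_sep)
    return Decimal("0")
```

Amounts are kept as `Decimal` rather than `float` so that later balance checks are exact to the cent.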

Why scanned bank statements are harder

A digital PDF has a text layer underneath what you see — copying text from it gives you the original characters. A scanned PDF has no text layer; it's just an image wrapped in a PDF container. Most "PDF to Excel" tools open it, find no text, and return an empty file or fail outright.

Optical character recognition (OCR) reads the pixels and converts them back into text. Traditional OCR is famously brittle on bank statements — it confuses 0 with O, misreads small fonts, splits multi-line transaction descriptions arbitrarily, and has no awareness of column structure.

We use a vision-based approach that understands the document structure, not just isolated characters — and then we verify the result with the same balance-chain reconciliation we use for digital PDFs.

What you can convert

  • Scanned PDFs. A bank statement scanned at the branch counter, on a flatbed scanner, or via a mobile scanning app — all work.
  • Photographed statements. A picture of a paper statement taken on a phone, saved as PDF.
  • Mixed PDFs. Some pages digital, some scanned (common with consolidated multi-month statements). We detect per page.
  • Image-only "flat" PDFs. PDFs that print to a flat image instead of preserving the original text layer.
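
Per-page detection for mixed PDFs can be pictured roughly as follows. This Python sketch assumes you already have the text-layer content of each page as a string (from whatever PDF text extractor you use) and labels a page as scanned when that text is empty or nearly so. The `classify_pages` name and the 20-character threshold are assumptions for illustration, not the converter's real logic.

```python
from typing import List

# Assumed cut-off: pages with fewer visible characters are treated as image-only.
MIN_TEXT_CHARS = 20


def classify_pages(page_texts: List[str], min_chars: int = MIN_TEXT_CHARS) -> List[str]:
    """Label each page 'digital' or 'scanned' by how much text its text layer holds."""
    labels = []
    for text in page_texts:
        visible = "".join(text.split())  # drop all whitespace before counting
        labels.append("digital" if len(visible) >= min_chars else "scanned")
    return labels


# Usage: the first page has a text layer, the other two do not.
pages = ["01/02/2024 CARD PAYMENT 12.50 1,234.56", "", "  \n "]
print(classify_pages(pages))
```

Only the pages labelled scanned would then need OCR; digital pages can be read from their text layer directly.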

Reconciliation catches what OCR alone can't

Even the best vision system makes the occasional mistake on a small font or smudged ink. What protects you isn't the OCR — it's the verification layer that comes after.

We check:

  • Opening balance + sum of every amount = closing balance, within rounding tolerance.
  • Each row's printed running balance = previous balance + this row's amount.

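The statement-level check tells you whether anything is off; the row-level check tells you where. The first row whose printed balance does not follow from the previous balance is the row that was misread. The results page flags it visually so you only review the actual problem rows, not the whole statement.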
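
The two checks above can be sketched in a few lines of Python. The `Row` type, the function names, and the one-cent tolerance are assumptions made for this example; the source only says "within rounding tolerance".

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass
class Row:
    amount: Decimal   # signed: money in is positive, money out is negative
    balance: Decimal  # running balance printed on the statement for this row


TOLERANCE = Decimal("0.01")  # assumed rounding tolerance


def totals_match(opening: Decimal, closing: Decimal, rows: List[Row],
                 tol: Decimal = TOLERANCE) -> bool:
    """Check 1: opening balance + sum of every amount = closing balance."""
    total = sum((r.amount for r in rows), Decimal("0"))
    return abs(opening + total - closing) <= tol


def first_broken_row(opening: Decimal, rows: List[Row],
                     tol: Decimal = TOLERANCE) -> Optional[int]:
    """Check 2: return the index of the first row whose printed balance
    does not equal previous balance + amount, or None if the chain holds."""
    previous = opening
    for i, row in enumerate(rows):
        if abs(previous + row.amount - row.balance) > tol:
            return i
        previous = row.balance
    return None


# Usage: the second row's amount was misread as 280.00 instead of 200.00.
rows = [
    Row(Decimal("-50.00"), Decimal("950.00")),
    Row(Decimal("280.00"), Decimal("1150.00")),
    Row(Decimal("-30.00"), Decimal("1120.00")),
]
print(totals_match(Decimal("1000.00"), Decimal("1120.00"), rows))  # False
print(first_broken_row(Decimal("1000.00"), rows))                  # 1
```

Because the row-level check compares against the printed balance of the previous row, one misread amount flags only that row rather than every row after it.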

Tips to get the cleanest result

  • Scan at 300 DPI or higher. Below 200 DPI, small fonts (like cents columns) lose detail.
  • Crop or deskew if you can. Slight rotations are fine; major skew degrades accuracy.
  • Skip flash if you're photographing — uniform lighting beats brightness.
  • Save as PDF (not JPG). An image wrapped in a PDF is fine; we read it the same way as any other scan.
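
If you are unsure what resolution a scan actually has, you can estimate it from the image's pixel width and the PDF page width. PDF page sizes are measured in points, with 72 points to the inch. The Python sketch below applies the 300 and 200 DPI thresholds from the tips above; the function names and the quality labels are made up for this example.

```python
def effective_dpi(pixel_width: int, page_width_pt: float) -> float:
    """Estimate scan resolution: pixels across divided by page width in inches."""
    page_width_in = page_width_pt / 72.0  # 72 PDF points per inch
    return pixel_width / page_width_in


def scan_quality(dpi: float) -> str:
    """Map a resolution to a rough label using the thresholds from the tips."""
    if dpi >= 300:
        return "good"
    if dpi >= 200:
        return "usable"
    return "low"  # small fonts such as cents columns may lose detail


# Usage: a US Letter page is 612 points (8.5 inches) wide.
dpi = effective_dpi(pixel_width=2550, page_width_pt=612)
print(dpi, scan_quality(dpi))
```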

FAQ

Can you OCR scanned bank statements for free?
Yes. 7 pages per day, no signup, no credit card. Scanned PDFs use the same free-tier allowance as digital ones.

How accurate is OCR on scanned bank statements?
Accuracy depends heavily on scan quality. A clean scan at 300 DPI or higher typically extracts above 98% of transactions correctly. Lower-quality scans do worse, but the balance-chain reconciliation catches misread rows whether the source was digital or scanned, so you always know what's verified vs. flagged.

What about scans from old paper statements?
They work, as long as the text is legible to the human eye. We handle typewriter-era statements, dot-matrix printouts, and modern laser-printed statements the same way.

Do you support multi-page scanned statements?
Yes, up to 100 pages per file (25 MB cap). Page-break artefacts like 'balance carried forward' lines are recognised and skipped, so they don't end up in the transaction list.

What if my scan is upside-down or rotated?
Slight rotation (a few degrees) is handled automatically. 90° / 180° rotations are also detected. If a scan is severely skewed or unreadable to humans, the conversion will likely fail loudly rather than silently mis-extract. If that happens, let us know and we'll dig in.
