Is the PDF uploaded anywhere?

No. The validator uses pdf.js running in your browser. The file is never sent to our servers or anyone else's. You can verify this in the network tab — no upload requests fire while the tool runs.

How accurate is the digital-vs-scanned classification?

High for typical statements. We sample the first few pages and compute the average number of extractable text characters per page. An average under 80 chars/page strongly suggests image-only content (a normal statement page has 300–800 chars). The edge cases (scans with an overlaid text layer, image-heavy digital PDFs) are rare, and the converter handles both paths anyway.
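As a rough illustration of that rule, here is a minimal Python sketch. It is not the validator's actual code (which runs on pdf.js in the browser); the function name and the "unknown" fallback are assumptions, and only the 80 chars/page cutoff comes from the text above.

```python
from statistics import mean

SCANNED_THRESHOLD = 80  # chars/page cutoff quoted above


def classify_text_layer(chars_per_page: list[int]) -> str:
    """Label a PDF from the text-character counts of its sampled pages."""
    if not chars_per_page:
        return "unknown"  # assumption: nothing sampled, nothing to say
    average = mean(chars_per_page)
    return "scanned" if average < SCANNED_THRESHOLD else "digital"


print(classify_text_layer([412, 655, 380]))  # digital
print(classify_text_layer([0, 12, 3]))       # scanned
```

The input is one character count per sampled page, so the same function works however many pages are sampled.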

Why does it sometimes say 'looks like Chase' when it isn't Chase?

The bank-guess feature looks for brand names in the extracted text. If your statement mentions another bank in a transaction description (e.g. a transfer to/from a different account), the heuristic can trigger. Treat the guess as a hint, not a fact.
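To see why this happens, consider a minimal keyword-matching sketch in Python. The bank list, the function name, and the word-boundary matching are all assumptions made for illustration; the real heuristic may differ.

```python
import re
from typing import Optional

# Hypothetical keyword list; the validator's real list is not shown here.
KNOWN_BANKS = ["HDFC", "Lloyds", "BNP Paribas", "Chase"]


def guess_bank(text: str) -> Optional[str]:
    """Return the first known bank name found as a whole word, else None."""
    for name in KNOWN_BANKS:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return name
    return None


# A statement from another bank that mentions Chase in one transaction line.
sample = "Statement of account\n03 Mar TRANSFER TO CHASE SAVINGS 250.00"
print(guess_bank(sample))  # Chase
```

Word boundaries keep "PURCHASE" from matching "Chase", but nothing in a plain keyword search can tell a letterhead from a transaction description, which is why the result is only a hint.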

What if the PDF is encrypted?

The validator reports it as encrypted and stops there — we can't inspect the contents until it's unlocked. Use our PDF password remover at /tools/unlock-pdf to get an unencrypted copy first.
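For the curious, an encrypted PDF declares an /Encrypt entry in its trailer dictionary, so a very crude check can be sketched in a few lines of Python. This is an illustrative heuristic only, with a hypothetical function name; a real parser such as pdf.js reads the trailer properly, and a raw byte search can give false positives if the same string appears elsewhere in the file.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Crude heuristic: encrypted PDFs reference an /Encrypt dictionary."""
    return b"/Encrypt" in pdf_bytes


locked = b"trailer\n<< /Size 12 /Root 1 0 R /Encrypt 11 0 R >>"
plain = b"trailer\n<< /Size 12 /Root 1 0 R >>"
print(looks_encrypted(locked))  # True
print(looks_encrypted(plain))   # False
```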

Can this detect a fake or tampered statement?

Not directly — that requires forensic PDF analysis (font consistency, object stream patterns, modification timestamps). The validator is a format triage tool, not a forensics tool. If you need fraud detection, please reach out via /contact and we'll discuss what's possible.

Does it work on credit-card statements?

Yes — the heuristics work on any PDF financial document. Page count, encryption, and text-layer detection are format-agnostic.

What is the best PDF bank statement converter?

The best converter depends on your accounting software. For QuickBooks, look for one that exports a native .qbo file (Web Connect format) — that skips column mapping entirely. For Xero, a tool that produces a Xero-template CSV with merged Money-In/Out columns will save the most time. StatementEdge handles both natively, plus Sage, Excel, CSV, and generic OFX, with automatic balance-chain reconciliation that flags any extraction error before you import.

How do I convert a PDF bank statement to QuickBooks?

The most reliable path is to convert the PDF directly into a QuickBooks Web Connect file (.qbo) instead of going through CSV. CSV import in QuickBooks Online requires manual column mapping and breaks on UK/EU number formats. A .qbo file imports natively with no mapping. Upload your PDF to a converter that supports .qbo output (such as StatementEdge), download the .qbo, then in QuickBooks Online go to Banking → Upload from file → File upload.

How do I convert a bank statement PDF to Excel?

Use a converter that respects your bank's locale (US, UK, EU, etc.) and emits an Excel file with ISO dates and signed amount columns. Generic OCR tools often misread comma decimals as thousands separators or split Money-In/Money-Out columns into two separate amount fields. Specialised converters like StatementEdge normalise these automatically before export.
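The normalisation step can be sketched as follows. This is a simplified Python illustration, assuming the decimal separator for the statement's locale is already known; the function names are hypothetical and real converters handle more cases (currency symbols, trailing minus signs, CR/DR markers).

```python
from decimal import Decimal


def parse_amount(raw: str, decimal_sep: str) -> Decimal:
    """Parse an amount string given the decimal separator ('.' or ',')."""
    thousands_sep = "," if decimal_sep == "." else "."
    cleaned = raw.strip().replace(" ", "").replace(thousands_sep, "")
    return Decimal(cleaned.replace(decimal_sep, "."))


def signed_amount(money_in: str, money_out: str, decimal_sep: str) -> Decimal:
    """Merge separate Money-In/Money-Out cells into one signed amount."""
    credit = parse_amount(money_in, decimal_sep) if money_in.strip() else Decimal("0")
    debit = parse_amount(money_out, decimal_sep) if money_out.strip() else Decimal("0")
    return credit - debit


print(parse_amount("1.234,56", ","))      # 1234.56
print(parse_amount("1,234.56", "."))      # 1234.56
print(signed_amount("", "45,00", ","))    # -45.00
```

Using Decimal rather than float keeps cent values exact, which matters once balances are added up and compared.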

Are PDF bank statement converters safe to use?

Safety depends on three things: where the file is processed, whether it's retained, and whether your data is used for AI training. StatementEdge processes files in the EU (Frankfurt), auto-deletes the source PDF within 1 hour by default, and uses AI providers with no-training settings enabled. Look for these guarantees on any converter you're considering — many tools store statements indefinitely and may be in jurisdictions without GDPR protection.

What's the difference between QBO, QFX, and OFX file formats?

All three are based on the Open Financial Exchange (OFX) SGML format from the late 1990s. .ofx is the generic format, accepted by GnuCash, Moneydance, Beancount, and many open-source tools. .qfx is Quicken's flavour, identical to OFX 1.0.2 with Quicken-specific headers. .qbo is Intuit's QuickBooks Desktop and Online flavour, which adds an INTU.BID tag identifying the originating financial institution. For QuickBooks use .qbo; for Quicken use .qfx; for anything else use .ofx.

How do I convert scanned (image) bank statements?

Scanned PDFs don't have a text layer, so traditional CSV-export tools fail on them. You need a converter with vision capability — one that can read pixels, not just embedded text. StatementEdge detects scanned pages automatically and routes them through a vision model with the same balance-chain verification as digital PDFs, so you still get a reconciled output even from an image-only source.
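The idea behind a balance-chain check is simple: each row's printed balance should equal the previous balance plus that row's amount. A minimal Python sketch, assuming each extracted row is an (amount, printed balance) pair; this layout and the function name are assumptions, not StatementEdge's actual implementation:

```python
from decimal import Decimal
from typing import Optional


def first_broken_row(opening: Decimal,
                     rows: list[tuple[Decimal, Decimal]]) -> Optional[int]:
    """Return the index of the first row whose balance breaks the chain."""
    running = opening
    for index, (amount, printed_balance) in enumerate(rows):
        running += amount
        if running != printed_balance:
            return index
    return None  # the whole chain reconciles


D = Decimal
good = [(D("-45.00"), D("955.00")), (D("200.00"), D("1155.00")),
        (D("-18.50"), D("1136.50"))]
# Same rows, but the last amount was misread as 13.50 instead of 18.50.
bad = good[:2] + [(D("-13.50"), D("1136.50"))]

print(first_broken_row(D("1000.00"), good))  # None
print(first_broken_row(D("1000.00"), bad))   # 2
```

A single misread digit in either the amount or the balance breaks the chain at that row, which is what makes the check useful on OCR output.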

Can I batch-convert multiple bank statements at once?

Yes. StatementEdge accepts multiple PDFs in a single drag-and-drop or file selection — each becomes its own conversion job and the results page shows per-file status with the reconciliation verdict for each. For programmatic batch processing, the REST API is available on every plan (free tier included) and exposes the same conversion endpoint.

Free PDF statement validator — digital vs scanned, encrypted, page count

Why pre-flight a PDF at all?

Bank statement PDFs look uniform from the outside but differ wildly in how they're built. A "PDF" can be: a text-based document with a full text layer (the easy case), a scanned image with no extractable text, an AES-encrypted file requiring a password, or any combination of the three. The conversion path that works for one fails on another.

Most people only discover what kind of PDF they have after dragging it into a converter and getting confused output: empty rows (scanned PDF run through a text extractor), a wall of 0.00 values (locale mismatch), or a flat "couldn't read" error (encryption). This validator tells you up front, so you can pick the right tool first.

Once you know what you have, drop it into our bank statement converter — it auto-detects the format and switches between text extraction and AI vision-based OCR transparently. For encrypted PDFs, use our free PDF password remover first.

What the validator checks

Encryption. If the PDF requires a password to open, we report it immediately. No further inspection is possible until it's unlocked.
Page count. Useful for quota planning: at 7 free pages a day, a 60-page statement takes 9 days on the free tier (60 ÷ 7, rounded up), but is a 1-minute job on our €19 Starter plan with 500 pages a month.
Text layer presence. We sample the first few pages and measure the average number of extractable text characters per page. 80 or more chars per page means digital; fewer means probably scanned.
Bank guess. We look for known bank brand keywords (HDFC, Lloyds, BNP Paribas, Chase, etc.) in the extracted text. This is a hint, not authoritative — white-label or business statements often don't carry the bank brand on every page.
Size hints. Anything over 60 pages prompts a warning about the free tier; anything over 50 MB is rejected outright (almost certainly a high-DPI scan).
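The page-count and size rules above reduce to a little arithmetic. A Python sketch, using the limits quoted in this list (7 free pages a day, a 60-page warning, a 50 MB cap); the function names and return labels are made up for illustration:

```python
import math

FREE_PAGES_PER_DAY = 7
PAGE_WARNING_LIMIT = 60
MAX_FILE_MB = 50


def free_tier_days(page_count: int) -> int:
    """Days needed to convert a statement on the free daily quota."""
    return math.ceil(page_count / FREE_PAGES_PER_DAY)


def size_hint(page_count: int, file_mb: float) -> str:
    """Apply the size rules: reject oversized files, warn on long ones."""
    if file_mb > MAX_FILE_MB:
        return "rejected"
    if page_count > PAGE_WARNING_LIMIT:
        return "warn"
    return "ok"


print(free_tier_days(60))   # 9
print(size_hint(72, 4.2))   # warn
print(size_hint(12, 63.0))  # rejected
```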

Digital vs scanned — how to tell, and why it matters

A digital PDF is one where you can drag your mouse and highlight text. A scanned PDF is a picture of a statement — open it in any viewer and try selecting text; nothing happens. The validator does the equivalent check programmatically by asking pdf.js for the text content of each page.

The distinction matters because:

Digital PDFs convert in 5–15 seconds with near-perfect accuracy. Text extraction is deterministic; there's nothing to guess.
Scanned PDFs need vision-based OCR — much slower (20–60 seconds), and accuracy depends on scan quality. Our converter uses Gemini's vision model with a reconciliation pass to catch OCR digit errors before they ship.
Mixed PDFs (e.g. a digital cover page + scanned transaction list) get treated as scanned overall, because the high-value rows are in the scanned section.
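The mixed-PDF rule can be sketched as a per-page check rather than an average. This is a hypothetical Python illustration that reuses the 80 chars/page cutoff as the per-page threshold; the function name and exact logic are assumptions, not the converter's real routing code.

```python
def classify_pages(chars_per_page: list[int], threshold: int = 80) -> str:
    """Treat the document as scanned if any page lacks a usable text layer."""
    if not chars_per_page:
        return "unknown"
    if any(count < threshold for count in chars_per_page):
        return "scanned"  # covers fully scanned and mixed documents
    return "digital"


# Digital cover page followed by two scanned transaction pages.
print(classify_pages([420, 5, 0]))     # scanned
print(classify_pages([420, 510, 390]))  # digital
```

With these example numbers, the average is about 142 chars/page, so an average-only check would label the first document digital; looking at pages individually is what catches the scanned section.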

When the validator's hints can mislead you

A few edge cases the simple heuristics miss:

Scans with an overlaid text layer. Some bank PDFs are scanned and then processed by software that overlays an invisible text layer on the page image. The validator will call this "digital". That is fine for conversion (the text layer is usable), but the text may be misaligned with what you see on the page.
Image-heavy digital PDFs. Statements with embedded logos and charts can push the average chars/page down. If the validator says "scanned" but you can highlight text in a viewer, ignore the hint and treat it as digital.
Bank guess collisions. "Chase" appears in non-Chase statements when transactions reference a Chase account elsewhere. The guess is heuristic, not authoritative.

What to do next

Digital PDF, English/EU locale: drop it straight into our main converter for output in 5–15 seconds.
Scanned PDF: same place — our pipeline switches to vision-based OCR automatically. Read how we handle scanned statements.
Encrypted PDF: unlock with our free PDF password remover first, then convert.
Long statement (60+ pages): check the plan estimator — free tier won't cover it.
Converted output: run our reconciliation checker to confirm the totals match.

Statement Format Validator — know what you're dealing with