How Bank2XL works
A three-step extraction pipeline (drop → extract → verify), followed by export.
Short version: drop a PDF, our AI extracts every transaction with original column labels and currency symbols preserved, and we automatically reconcile the totals against the statement's reported opening and closing balances. If the numbers don't match, we tell you instead of silently handing you bad data.
1. Drop the PDF
Open the Bank2XL extension popup (or web converter) and drag your bank statement PDF onto the drop zone. The file is sent over TLS to our API and held in memory only. It is never written to durable storage.
- Supported: any PDF, including scanned and image-based statements.
- Size limit: 20 MB (covers virtually all consumer statements).
- One file at a time in the popup; batch upload is on the roadmap.
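The limits above could be checked before a file ever leaves the browser or client. The following is a minimal sketch, not Bank2XL's actual code: `validate_upload` is a hypothetical helper, "20 MB" is assumed to mean 20 × 1024 × 1024 bytes, and the `%PDF-` header check is a general PDF convention rather than something the service is confirmed to enforce.

```python
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" interpreted as 20 MiB


def validate_upload(data: bytes) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append("file exceeds 20 MB limit")
    # A well-formed PDF starts with the "%PDF-" header bytes.
    if not data.startswith(b"%PDF-"):
        problems.append("file does not look like a PDF")
    return problems
```

Returning a list of problems rather than raising on the first one lets a UI show every issue at once.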
2. AI extraction
Our extractor pre-routes each document by whether it has a real text layer or is image-only:
- Text-layer PDFs — extracted with PyMuPDF for layout-aware parsing, then validated by a Gemini 3.1 vision model.
- Scanned PDFs and image PDFs — rasterized page-by-page and run through Datalab Chandra OCR + Gemini for table reconstruction.
In parallel, a "judge" model identifies the bank, country, and document structure. If it disagrees with the pre-route choice, we cancel and retry on the right path. This catches edge cases like rasterized statements that have a fake text layer.
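The routing decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production logic: `pre_route`, `final_route`, and the `MIN_CHARS_PER_PAGE` threshold are hypothetical. The per-page strings could come from PyMuPDF's `page.get_text()`; they are passed in directly here so the sketch has no dependencies.

```python
MIN_CHARS_PER_PAGE = 50  # hypothetical threshold for "has a real text layer"


def pre_route(page_texts: list[str]) -> str:
    """Return "text" if most pages carry a usable text layer, else "ocr"."""
    if not page_texts:
        return "ocr"
    pages_with_text = sum(
        1 for text in page_texts if len(text.strip()) >= MIN_CHARS_PER_PAGE
    )
    # Require a strict majority of pages to have extractable text.
    return "text" if pages_with_text * 2 > len(page_texts) else "ocr"


def final_route(pre: str, judge: str) -> tuple[str, bool]:
    """Return (route_to_use, retried). The judge's verdict wins on disagreement."""
    if judge != pre:
        return judge, True  # cancel the pre-routed run and retry on the other path
    return pre, False
```

A character-count heuristic alone would send a rasterized statement with a fake text layer down the "text" path, which is why the judge's verdict is allowed to override it.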
What we extract
- Every transaction row — date, description, amount, polarity (debit / credit), balance after.
- Every metadata field on the cover page — account number, holder, period, sort code, IBAN, BIC, branch, anything labeled on the PDF.
- The original column headers, verbatim. "Débito" stays "Débito"; we don't translate.
- Source page numbers, so you can jump from a transaction back to the PDF.
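One way to picture the extracted data is as a small set of records. The shape below is a sketch only: the class and field names (`Transaction`, `Statement`, `source_page`, and so on) are assumptions chosen to mirror the list above, not the actual output schema, and the values are made up for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    date: str                         # kept as printed on the statement
    description: str
    amount: Decimal                   # Decimal avoids float rounding on money
    polarity: str                     # "debit" or "credit"
    balance_after: Optional[Decimal]  # None if the statement has no running balance
    source_page: int                  # 1-based page in the source PDF


@dataclass
class Statement:
    metadata: dict[str, str]          # cover-page fields, keyed by their printed label
    column_headers: list[str]         # verbatim, untranslated
    transactions: list[Transaction]


# Illustrative values only.
tx = Transaction("01/02/2024", "Café", Decimal("3.50"), "debit", Decimal("96.50"), 1)
stmt = Statement(
    metadata={"Holder": "Example Holder"},
    column_headers=["Data", "Descrição", "Débito"],
    transactions=[tx],
)
```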
3. Balance verification
This is the step legacy converters skip. For every account in the statement, we compute:
opening_balance + sum(credits) − sum(debits) = closing_balance ?
The result determines the badge color on the result page:
| Status | Meaning |
| --- | --- |
| reconciled | Totals match within 0.5%. Trust the row. |
| no_balance | No opening / closing balance found in the PDF. Sums shown but not verified. |
| insufficient_data | No transactions extracted (header-only document). |
| incomplete_source | The PDF says more pages should exist than were uploaded. Re-export. |
| mismatch | Totals differ by more than 0.5%. Investigate. |
| tx_extraction_incomplete | The statement's own Activity Summary reports more transactions than we extracted. |
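The balance check and four of the statuses above can be sketched as below. Treat it as an illustration under assumptions: the function name `reconcile` is hypothetical, the 0.5% tolerance is assumed to be relative to the reported closing balance, and the order of the checks is a guess. The `incomplete_source` and `tx_extraction_incomplete` statuses need page-count and activity-summary information and are left out.

```python
from decimal import Decimal
from typing import Optional

TOLERANCE = Decimal("0.005")  # 0.5%, assumed relative to the closing balance


def reconcile(
    opening: Optional[Decimal],
    closing: Optional[Decimal],
    credits: list[Decimal],
    debits: list[Decimal],
) -> str:
    """Classify one account as reconciled, mismatch, no_balance, or insufficient_data."""
    if not credits and not debits:
        return "insufficient_data"  # nothing was extracted
    if opening is None or closing is None:
        return "no_balance"         # sums exist but cannot be verified
    expected = opening + sum(credits, Decimal("0")) - sum(debits, Decimal("0"))
    delta = abs(expected - closing)
    return "reconciled" if delta <= TOLERANCE * abs(closing) else "mismatch"
```

For example, with an opening balance of 1000.00, credits of 200.00, and debits of 50.00, the expected closing balance is 1150.00. A reported closing balance of 1150.00 reconciles, while 1050.00 leaves a delta of 100.00, far above 0.5% of 1050.00 (5.25), so it is a mismatch.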
4. Output
- Excel (.xlsx) — multi-sheet workbook with Account info, Transactions, and a Validation sheet showing the reconciliation status and any deltas.
- CSV — a single flat file ready for QuickBooks, Xero, or any spreadsheet tool.
- QBO (QuickBooks Online .qbo) — on the roadmap. Until then, the CSV imports cleanly via QuickBooks's "Upload bank transactions" flow.
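As a rough picture of the flat CSV output, the sketch below writes rows under their original, untranslated headers using Python's standard `csv` module. The helper name `to_csv` and the sample row are hypothetical; the real export may include more columns or different quoting.

```python
import csv
import io


def to_csv(headers: list[str], rows: list[list[str]]) -> str:
    """Serialize headers and rows to CSV text, quoting fields only where needed."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow(headers)  # headers are written verbatim, e.g. "Débito"
    writer.writerows(rows)
    return buf.getvalue()


out = to_csv(
    ["Data", "Descrição", "Débito"],
    [["01/02/2024", "Café, Lisboa", "3,50"]],
)
```

Fields that contain commas, such as the description and the decimal-comma amount here, are quoted automatically, so the file reads back into the same three columns.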
What we don't do
- We don't categorize transactions (groceries / dining / etc.). Bank2XL extracts; you categorize in your accounting tool.
- We don't store the PDF. It's processed in memory and discarded as soon as the result is returned. See Data retention.
- We don't train AI on your data. See Security.