Quality Assurance · May 17, 2026 · 9 min read

Lending-grade QA: tiered routing replaces a single threshold

85% overall is a demo bar. For a bank issuing millions in loans, we replaced the single threshold with hard gates, three routing outcomes, and per-field confidence.

The v1 QA framework treated a missing decimal point the same as a snake_case violation: both cost you points against a single overall score. For a chatbot demo that is fine. For a lender deciding whether to extend a seven-figure mortgage, it is not.

The v2 framework — now live — replaces the single approve/reject decision with three routing outcomes. auto_pass is written to the subject database automatically. review_queue is held for a human operator. hard_reject is refused outright and the document never reaches the database.

Routing is determined by deterministic hard gates first, then by tiered field-level confidence. Hard gates are code, not LLM judgments: identifier validation (checksum or format rules for NRIC Malaysia, NIK Indonesia, NPWP, PAN, SSN), numeric reconciliation (the sum of line items must equal the stated total), and date envelope checks (every date must fall inside the document's stated reporting period).
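As a rough illustration of what "code, not LLM judgments" can look like, here is a minimal sketch of the reconciliation and date envelope gates. The `Extraction` shape, the function names, and the zero tolerance are assumptions made for this example, not the production schema. Identifier checks are left out because the per-country rules are not spelled out here.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class Extraction:
    """Hypothetical shape of an extracted document (illustration only)."""
    line_items: list[Decimal]
    stated_total: Decimal
    period_start: date
    period_end: date
    dates: list[date]


def reconciles(doc: Extraction, tolerance: Decimal = Decimal("0")) -> bool:
    """True when the line items sum to the stated total within the tolerance."""
    computed = sum(doc.line_items, Decimal("0"))
    return abs(computed - doc.stated_total) <= tolerance


def dates_in_envelope(doc: Extraction) -> bool:
    """True when every extracted date lies inside the reporting period (inclusive)."""
    return all(doc.period_start <= d <= doc.period_end for d in doc.dates)


def hard_gate_failures(doc: Extraction) -> list[str]:
    """Return the names of the gates that failed; an empty list means all passed."""
    failures = []
    if not reconciles(doc):
        failures.append("numeric_reconciliation")
    if not dates_in_envelope(doc):
        failures.append("date_envelope")
    return failures
```

Using `Decimal` rather than `float` keeps a misplaced decimal point from hiding behind rounding error. In this sketch, any non-empty failure list would presumably map to hard_reject, though that mapping is an assumption here.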

Above the gates sit per-source thresholds. CCRIS, because mistakes there move loan decisions, requires 95% overall and 99.5% on critical fields. SSM company registry data — lower stakes, mostly identity confirmation — sits at 90% / 97%. The thresholds are tunable per source type and recalibrated weekly against a human-labeled gold set.
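One way to picture the per-source thresholds is as a small lookup table consulted after the hard gates have passed. The sketch below uses the CCRIS and SSM numbers quoted above. The dictionary keys, the function name, and the field names in the usage lines are hypothetical. It also assumes that "on critical fields" means every critical field must individually clear the bar, and that a miss goes to review_queue rather than hard_reject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    overall: float   # minimum overall confidence for auto_pass
    critical: float  # minimum confidence required of each critical field


# Values from the text; the keys are illustrative source-type labels.
THRESHOLDS = {
    "ccris": Thresholds(overall=0.95, critical=0.995),
    "ssm": Thresholds(overall=0.90, critical=0.97),
}


def route_by_confidence(source: str, overall: float,
                        critical_fields: dict[str, float]) -> str:
    """Route a document that has already passed the hard gates.

    Assumption: falling short of either threshold sends the document
    to a human reviewer instead of rejecting it.
    """
    t = THRESHOLDS[source]
    # Note: with no critical fields, all() is vacuously True.
    critical_ok = all(conf >= t.critical for conf in critical_fields.values())
    if overall >= t.overall and critical_ok:
        return "auto_pass"
    return "review_queue"
```

For example, `route_by_confidence("ccris", 0.97, {"outstanding_balance": 0.98})` lands in review_queue because the one critical field is below 99.5%, while the same scores would pass for `"ssm"`.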

Finally, a third model — an independent verifier — runs in parallel with the maker. Disagreement on a critical field between maker and verifier routes the document to review_queue regardless of the architect's score. Belt and suspenders, on purpose.
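The override can be sketched as a final step applied to whatever route the earlier stages produced. Everything named here is hypothetical: the function names, the exact-match comparison on already-normalized string values, and the choice to treat a field missing from either side as a disagreement. The sketch also assumes a hard_reject is never softened by the verifier.

```python
def disagreeing_critical_fields(maker: dict[str, str],
                                verifier: dict[str, str],
                                critical: set[str]) -> list[str]:
    """Critical fields where maker and verifier differ, or where either omits the field."""
    return sorted(
        f for f in critical
        if f not in maker or f not in verifier or maker[f] != verifier[f]
    )


def apply_verifier(route: str,
                   maker: dict[str, str],
                   verifier: dict[str, str],
                   critical: set[str]) -> str:
    """Downgrade to review_queue on any critical-field disagreement."""
    if route == "hard_reject":
        # Assumption: a rejected document stays rejected.
        return route
    if disagreeing_critical_fields(maker, verifier, critical):
        return "review_queue"
    return route
```

Disagreement on a non-critical field leaves the route untouched in this sketch, which matches the rule as stated: only critical fields trigger the override.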