Government and statutory PDFs — credit, tax, employment, company registry — turned into structured JSON over a single REST endpoint. Built for the teams that consume this data: banks, credit reporting agencies, fintechs, researchers, and internal tooling.
{
  "applicant": {
    "name": "Rohan Mehta",
    "pan": "ABCDE1234F",
    "dob": "1989-04-12"
  },
  "credit": {
    "score": 782,
    "bureau": "CIBIL",
    "report_date": "2026-05-08",
    "accounts": [
      { "lender": "HDFC", "type": "credit_card", "limit": 500000, "utilization": 0.18 },
      { "lender": "SBI", "type": "home_loan", "outstanding": 4280000, "emi": 38200 }
    ],
    "defaults": []
  },
  "employment": { "employer": "Acme Pvt Ltd", "income": 2400000, "tenure_months": 54 }
}
every JSON graded by an architect model before it touches the database
maker model rewrites until the architect approves — or the doc is rejected
same REST endpoint for every source, every country, every consumer
Stop building scrapers. Vertitos ingests the PDFs issued by statutory authorities and regulators, and returns clean JSON — each source typed, namespaced, and traceable to the original document.
Malaysia: CCRIS (BNM), EPF/KWSP, LHDN, SSM. Statutory and regulator-issued PDFs, parsed to source-specific schemas.
Indonesia: SLIK (OJK), BPJS, DJP. Statutory and regulator-issued PDFs, parsed to source-specific schemas.
Every request hashed and logged. Field-level confidence scores. Replay any call.
Handles 200-page reports, nested account tables, scanned filings, the lot.
Validated outputs. If a field is missing, you know — not a hallucinated value.
REST + bearer key. No SDK lock-in, no queues to manage.
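As a rough sketch of what "REST + bearer key" could look like from the caller's side, using only the Python standard library. The URL, the `source` query parameter, and the choice to send the raw PDF as the request body are placeholders invented for illustration, not taken from the Vertitos API reference; only the bearer-key scheme comes from the text above.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; substitute the real URL from your account.
API_URL = "https://api.example.com/v1/documents"


def build_request(pdf_bytes: bytes, api_key: str, source: str) -> urllib.request.Request:
    """Build a POST request carrying the raw PDF and a bearer key."""
    # The `source` query parameter is an assumption made for this sketch.
    query = urllib.parse.urlencode({"source": source})
    return urllib.request.Request(
        url=f"{API_URL}?{query}",
        data=pdf_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/pdf",
        },
    )


def extract(pdf_bytes: bytes, api_key: str, source: str) -> dict:
    """Send the PDF and decode the JSON body of the response."""
    request = build_request(pdf_bytes, api_key, source)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the headers easy to inspect in a unit test without sending anything.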
Every PDF runs through deterministic hard gates (identifier regex, totals, date envelopes) before any LLM scores it. A Maker extracts. An independent Verifier re-reads the PDF in parallel. An Architect adjudicates against 10 KPIs with per-source thresholds. Each document lands in exactly one of three buckets.
Hard gates: Deterministic regex, checksum, sum-of-parts, and date-window checks. Failures short-circuit before any model spend.
Maker: Gemini 2.5 Flash extracts. Cheap, fast, structured output.
Verifier: An independent model re-reads the same PDF. Critical-field disagreement forces review.
Architect: Gemini 2.5 Pro scores 10 KPIs, applies per-source thresholds, and routes the document.
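The deterministic gates in the first step might look something like the sketch below. It is an illustration, not the production checks: the document shape (`pan`, `line_items`, `total`, `report_date`, `period_start`, `period_end`) and the function names are hypothetical, and the PAN pattern is used only because a PAN appears in the sample output above.

```python
import re
from datetime import date

# Illustrative pattern for an Indian PAN: five letters, four digits, one letter.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def identifier_gate(pan: str) -> bool:
    """Identifier regex: the whole string must match the pattern."""
    return PAN_PATTERN.fullmatch(pan) is not None


def sum_of_parts_gate(line_items: list[int], stated_total: int) -> bool:
    """Sum-of-parts: line items must add up to the stated total."""
    return sum(line_items) == stated_total


def date_window_gate(value: date, period_start: date, period_end: date) -> bool:
    """Date window: the value must fall inside the reporting period."""
    return period_start <= value <= period_end


def run_hard_gates(doc: dict) -> list[str]:
    """Return the names of failed gates; an empty list means all passed."""
    failures = []
    if not identifier_gate(doc["pan"]):
        failures.append("identifier_regex")
    if not sum_of_parts_gate(doc["line_items"], doc["total"]):
        failures.append("sum_of_parts")
    if not date_window_gate(doc["report_date"], doc["period_start"], doc["period_end"]):
        failures.append("date_window")
    return failures
```

Because these checks are plain functions with no model call, a non-empty result can stop the pipeline before any LLM is invoked.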
auto_pass: All hard gates green, KPIs above source threshold, Verifier agrees on critical fields. Writes straight to the database.
review_queue: Gates pass but KPIs sit between source minimum and target, or Verifier disagrees on a non-critical field. Operator adjudicates in the review queue.
hard_reject: A hard gate fails, no strong identifier is captured, or a critical KPI scores below the floor. JSON withheld. Audit row written.
Critical KPIs are non-negotiable: any failure routes the document to hard reject, regardless of overall score.
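One way to read the three-bucket routing as code is sketched below. The `Assessment` fields, the `route` function, and the idea of a single `overall_target` number are assumptions made for illustration. The text above does not say what happens when the overall score falls below the source minimum without a critical failure; this sketch sends that case to review, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    gates_passed: bool
    has_strong_identifier: bool
    critical_score: float  # lowest score among the critical KPIs, 0.0 to 1.0
    overall_score: float   # overall KPI score, 0.0 to 1.0
    verifier_disagrees_critical: bool
    verifier_disagrees_noncritical: bool


def route(a: Assessment, critical_min: float, overall_target: float) -> str:
    """Return exactly one of: hard_reject, review_queue, auto_pass."""
    # Critical failures win regardless of the overall score.
    if (
        not a.gates_passed
        or not a.has_strong_identifier
        or a.critical_score < critical_min
    ):
        return "hard_reject"
    # Any Verifier disagreement sends the document to an operator.
    if a.verifier_disagrees_critical or a.verifier_disagrees_noncritical:
        return "review_queue"
    # Scores short of the target are reviewed rather than written (assumption
    # for the below-minimum case, which the text leaves unspecified).
    if a.overall_score < overall_target:
        return "review_queue"
    return "auto_pass"
```

Checking the reject conditions first mirrors the rule that a critical failure overrides everything else.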
schema_valid (critical): Parses as JSON. Required top-level keys (identifiers, subject_kind, display_name) all present.
field_completeness: Percentage of visible PDF fields actually captured in the output.
identifier_validity (critical): IDs match the regex for their type: NRIC, NIK, NPWP, SSM, PAN, passport.
type_fidelity: Numbers as numbers, dates as ISO YYYY-MM-DD, booleans as booleans. No stringified anything.
key_hygiene: snake_case keys, no duplicates, consistent nesting depth across records.
source_alignment: Fields match what the source type should contain (CCRIS facilities, EPF contribution history, SSM directors, etc.).
no_hallucination (critical): Every value in the JSON is traceable back to text in the PDF.
unit_normalization: Amounts carry currency codes; all units consistent within the document.
internal_consistency (critical): Totals equal the sum of line items. Dates fall inside the document's reporting period.
identifier_presence (critical): At least one strong identifier captured so the record can be matched to a subject.
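Two of these KPIs are mechanical enough to sketch directly. The snippet below shows one possible reading of `schema_valid` (parses as JSON, required top-level keys present) and the snake_case part of `key_hygiene`. The function names and the exact snake_case pattern are assumptions; the real scoring is done by the Architect model and may differ.

```python
import json
import re

REQUIRED_KEYS = {"identifiers", "subject_kind", "display_name"}
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def schema_valid(raw: str) -> bool:
    """True if `raw` parses as a JSON object with all required top-level keys."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(doc, dict) and REQUIRED_KEYS.issubset(doc)


def walk(node, path=""):
    """Yield (path, key) for every key found in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            yield child, key
            yield from walk(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from walk(item, f"{path}[{index}]")


def key_hygiene_violations(doc) -> list[str]:
    """Return the paths of keys that are not snake_case."""
    return [path for path, key in walk(doc) if not SNAKE_CASE.fullmatch(key)]
```

Returning the offending paths rather than a bare boolean gives a reviewer something concrete to look at when a document lands in the review queue.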
Mistakes on CCRIS move loan decisions. Mistakes on SSM rarely do. Thresholds reflect that asymmetry and are recalibrated weekly against a human-labeled gold set.
| Source | Issuer | Overall min | Critical min |
|---|---|---|---|
| CCRIS | BNM (MY) | 95% | 99.5% |
| SLIK | OJK (ID) | 95% | 99.5% |
| EPF / KWSP | KWSP (MY) | 92% | 97% |
| LHDN | LHDN (MY) | 92% | 97% |
| DJP | DJP (ID) | 92% | 97% |
| SSM | SSM (MY) | 90% | 97% |
| BPJS | BPJS (ID) | 90% | 97% |
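For a consumer who wants to mirror these minimums locally, the table could be encoded as a small lookup. The dictionary below is a snapshot of the values shown above (the text says thresholds are recalibrated weekly, so hard-coded numbers will drift), and `meets_minimums` is a hypothetical helper, not part of the API.

```python
# source -> (overall minimum, critical minimum), as fractions of 1.0
THRESHOLDS = {
    "CCRIS": (0.95, 0.995),
    "SLIK": (0.95, 0.995),
    "EPF/KWSP": (0.92, 0.97),
    "LHDN": (0.92, 0.97),
    "DJP": (0.92, 0.97),
    "SSM": (0.90, 0.97),
    "BPJS": (0.90, 0.97),
}


def meets_minimums(source: str, overall: float, critical: float) -> bool:
    """True if both scores reach the minimums listed for `source`."""
    overall_min, critical_min = THRESHOLDS[source]
    return overall >= overall_min and critical >= critical_min
```

An unknown source raises `KeyError`, which is deliberate in this sketch: silently applying a default threshold to an unlisted source would hide a configuration gap.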
Free tier includes 100 documents/month. No credit card. Production-grade from request one.
Get your API key