BUSINESS CHALLENGES
Our Client
We serve a leading international insurance and financial services company that operates in Asia, Canada, and the United States and has over 1.5 million customers. In the Vietnam market, their network of 80 offices provides financial advice, insurance, wealth management, and asset management services for individuals, groups, and institutions.
Project Challenges
Limited OCR Capture Capacity
The client’s current OCR engine is built to capture ID cards only, yet the onboarding process now accepts various identity document types (ID cards, passports, birth certificates, military ID cards, etc.). As a result, many submitted documents cannot be processed by the OCR engine, which increases the human workforce needed for verification.
The current OCR capture capacity cannot process various document types
Project Objective
- Shorten the document and data processing time to under 1 minute per document
- Facilitate an end-to-end automatic approval process while ensuring data accuracy at the highest level.
Project Scope
Build a straight-through process for customer and agency onboarding by enhancing the OCR engine’s extraction capacity
- Document types:
  - Identity documents (ID cards, passports, birth certificates, military ID cards, etc.)
  - Application forms
- Languages: English and Vietnamese
- Service time: 24/7
- Committed accuracy rate: 95%
SOLUTION
Data Extraction Solution
The quality of the input data plays a significant role in determining the output quality. Therefore, DIGI-TEXX has developed a three-step data extraction process that requires no human verification.
DIGI-TEXX applies Image Quality Enhancement technology in the pre-processing step to transform the images and make them more suitable for OCR engines and machine vision algorithms in later processing stages.
This technology identifies the key features and details of each image, then adjusts them using professional digital image processing techniques such as:
- Remove image background noise
- Adjust skew and rotation
- Crop the excess areas
- Tune the brightness, sharpness, and other color settings
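To make two of the steps above concrete (cropping excess areas and tuning brightness), here is a minimal sketch in plain Python. It is an illustration only, not DIGI-TEXX's implementation: the image is assumed to be a small grayscale grid of 0 to 255 values, and the function names, the white background value, and the tolerance are all hypothetical. A production pipeline would normally rely on a dedicated image processing library.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels: 0 (black) to 255 (white)


def crop_to_content(img: Image, background: int = 255, tolerance: int = 10) -> Image:
    """Drop border rows and columns that hold only near-background pixels."""
    def is_content(px: int) -> bool:
        return abs(px - background) > tolerance

    rows = [r for r, row in enumerate(img) if any(is_content(p) for p in row)]
    cols = [c for c in range(len(img[0])) if any(is_content(row[c]) for row in img)]
    if not rows or not cols:
        return img  # only background found; leave the image unchanged
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def stretch_contrast(img: Image) -> Image:
    """Linearly rescale so the darkest pixel maps to 0 and the brightest to 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [row[:] for row in img]  # flat image; nothing to stretch
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


# A tiny hypothetical "scan": white margins around a 2x2 block of content.
page = [
    [255, 255, 255, 255, 255],
    [255, 100, 180, 255, 255],
    [255, 140, 100, 255, 255],
    [255, 255, 255, 255, 255],
]
cropped = crop_to_content(page)       # [[100, 180], [140, 100]]
enhanced = stretch_contrast(cropped)  # [[0, 255], [127, 0]]
```

Cropping before stretching keeps the white margins from dominating the brightness range, so the remaining content uses the full 0 to 255 scale.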
The processed documents are then sent to DIGI-XTRACT, a document processing solution built by DIGI-TEXX’s software development team.
DIGI-XTRACT is powered by Machine Learning (ML) and Deep Learning (DL) technology, which extends the OCR capture capacity to more document types such as birth certificates, passports, military IDs, and bank statements.
Auto QC then runs quality control using a scoring scheme that combines:
- Common rules such as the format of ID cards, Postal Code, Age, Gender, Date/Time, etc.
- Business rules based on the client’s business domain
- Data Field Relationships such as [age, gender, disease], [title, salary, business], [hospital, treatment, age, gender], etc.
- Image Quality Analytics: clear/unclear, blurred, skewed, flipped, distorted, low resolution.
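The combination of rule types listed above could be sketched as a weighted score. This is a simplified illustration under stated assumptions, not the actual Auto QC logic: the weights, the approval threshold, the 12-digit ID format, the minimum-age business rule, the field names, and the precomputed `image_quality` value are all hypothetical.

```python
import re
from datetime import date

# Hypothetical weights for the four rule groups; they sum to 1.0.
WEIGHTS = {"common": 0.3, "business": 0.2, "relationship": 0.2, "image": 0.3}


def age_on(dob: date, as_of: date) -> int:
    """Age in whole years on the reference date."""
    return as_of.year - dob.year - ((as_of.month, as_of.day) < (dob.month, dob.day))


def common_rule_score(rec: dict) -> float:
    """Common format rules (assumed here: 12-digit ID, known gender value)."""
    checks = [
        re.fullmatch(r"\d{12}", rec.get("id_number", "")) is not None,
        rec.get("gender") in {"male", "female"},
    ]
    return sum(checks) / len(checks)


def business_rule_score(rec: dict, as_of: date) -> float:
    """Hypothetical business rule: the applicant must be at least 18."""
    return 1.0 if age_on(rec["date_of_birth"], as_of) >= 18 else 0.0


def relationship_score(rec: dict, as_of: date) -> float:
    """Field relationship: the stated age must agree with the date of birth."""
    return 1.0 if rec.get("age") == age_on(rec["date_of_birth"], as_of) else 0.0


def qc_score(rec: dict, as_of: date) -> float:
    parts = {
        "common": common_rule_score(rec),
        "business": business_rule_score(rec, as_of),
        "relationship": relationship_score(rec, as_of),
        "image": rec.get("image_quality", 0.0),  # assumed to come from an upstream step
    }
    return sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)


def auto_approve(rec: dict, as_of: date, threshold: float = 0.9) -> bool:
    return qc_score(rec, as_of) >= threshold


as_of = date(2025, 1, 1)
good = {"id_number": "012345678901", "gender": "female",
        "date_of_birth": date(1990, 5, 17), "age": 34, "image_quality": 0.95}
bad = dict(good, age=40)  # stated age contradicts the date of birth

print(auto_approve(good, as_of))  # True
print(auto_approve(bad, as_of))   # False
```

Records that fall below the threshold would be the ones flagged for review; where that threshold sits is a business decision and is shown here only as a placeholder.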
BUSINESS OUTCOME
- Processing time per document was shortened from 3 minutes to 5 seconds.
- Field-level accuracy rate increased from 60% to 97%.
- The client’s document processing capacity increased from 95,000 pages/month to 3 million pages/month.
- The data output quality no longer depends on human verification.
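For readers unfamiliar with the term, accuracy "on field level" is commonly measured as the share of individual extracted fields that match a verified reference value, rather than the share of whole documents that are fully correct. The sketch below assumes that common definition; the source does not specify how the 97% figure was calculated, and the function name and sample data are hypothetical.

```python
from typing import Dict, List


def field_level_accuracy(extracted: List[Dict[str, str]],
                         reference: List[Dict[str, str]]) -> float:
    """Fraction of reference fields whose extracted value matches exactly."""
    total = 0
    correct = 0
    for doc_pred, doc_true in zip(extracted, reference):
        for field, expected in doc_true.items():
            total += 1
            if doc_pred.get(field) == expected:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical sample: two documents, two fields each, one field misread.
reference = [
    {"name": "NGUYEN VAN A", "id_number": "012345678901"},
    {"name": "TRAN THI B", "id_number": "098765432109"},
]
extracted = [
    {"name": "NGUYEN VAN A", "id_number": "012345678901"},
    {"name": "TRAN THI B", "id_number": "098765432100"},
]
accuracy = field_level_accuracy(extracted, reference)  # 3 of 4 fields match: 0.75
```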