DIGI-XTRACT

A fully automated data extraction solution that can eliminate the need for human intervention.

A Document Processing solution built on Machine Learning and Deep Learning technology that performs document classification, data extraction, and quality control across various document types.

DIGI-XTRACT supports multiple languages and can also be customized for special document types according to the client’s business languages and requirements. The service can be hosted securely and remotely at DIGI-TEXX’s Data Center or deployed at the client’s premises using state-of-the-art technologies.

Supported form types: 

  • Structured forms: Application, ID/passport, birth certificate, bank statement, financial statement, payslip, invoice, receipt, tax invoice, bill, purchase order, quotation, delivery note, etc.
  • Semi-structured forms: Patient medical histories, field inspection notes, labor contracts, incident reports, work permits, confirmation letters, etc.
  • Unstructured forms: Handwritten letters, personal notes, personal journals, handwritten research, handwritten historical documents, examination sheets, catalogs, land register books, construction and architectural drawings, etc.

DIGI-XTRACT CORE COMPONENTS

The following functions can be used individually without the need to deploy the entire solution.

AUTO CLASSIFY

DIGI-XTRACT recognizes and classifies various document types automatically, using the Auto Classify component. It can accurately detect document types based on vectorization. The system then routes the document to the Auto Extract component to extract data for an optimized accuracy rate.

The Auto Classify function also classifies the quality of the input images. DIGI-XTRACT then routes the classified images to the Image Quality Enhancement function.
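As an illustration only, vectorization-based classification can be sketched as follows. This is not DIGI-XTRACT's implementation: the bag-of-words vectors, the cosine similarity measure, the reference texts, and the names `vectorize`, `cosine`, `REFERENCES`, and `classify` are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Turn text into a sparse bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical reference texts, one per document type (assumed for this sketch).
REFERENCES = {
    "invoice": vectorize("invoice number bill to total amount due tax"),
    "payslip": vectorize("payslip employee salary gross net deductions pay period"),
}

def classify(ocr_text):
    """Return (document_type, similarity) for the closest reference vector."""
    vec = vectorize(ocr_text)
    scored = ((name, cosine(vec, ref)) for name, ref in REFERENCES.items())
    return max(scored, key=lambda pair: pair[1])
```

In this sketch the similarity value doubles as a rough classification score; a production classifier would more likely use learned embeddings than raw word counts.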

AUTO EXTRACT

Auto Extract includes Field Detection & Text Extraction.

With predefined data fields, the Field Detection component picks up the correct data field from the image and processes the extraction securely based on the snipped image. 

After Field Detection, the field type of each piece of information determines which data extraction engine is used to extract the text from the snipped image.

With this method, full information on client documents will not be seen or shared by any third party.

Auto Extract produces a confidence score for each data field. The score can then be used to determine the quality of the extraction in the set of rules of the Auto QC component.
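The flow described above (snip each field, pick an engine by field type, return text with a confidence score) can be sketched roughly as below. All names here (`FieldCrop`, `FieldResult`, `extract`, `fake_printed_engine`) and the field types are hypothetical, and the engine is a stub standing in for a real recognition model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class FieldCrop:
    name: str        # e.g. "date_of_birth"
    field_type: str  # e.g. "printed" or "handwritten" (assumed types)
    pixels: bytes    # the snipped region only, never the full page

@dataclass
class FieldResult:
    name: str
    text: str
    confidence: float  # 0.0 to 1.0, later consumed by quality control

# An engine takes the snipped image bytes and returns (text, confidence).
Engine = Callable[[bytes], Tuple[str, float]]

def fake_printed_engine(pixels: bytes) -> Tuple[str, float]:
    """Stub engine for the sketch: decodes bytes instead of running OCR."""
    return pixels.decode("ascii", errors="replace"), 0.95

def extract(crops: List[FieldCrop], engines: Dict[str, Engine]) -> List[FieldResult]:
    """Route each snipped field to the engine registered for its field type."""
    results = []
    for crop in crops:
        engine = engines[crop.field_type]  # KeyError if no engine is registered
        text, confidence = engine(crop.pixels)
        results.append(FieldResult(crop.name, text, confidence))
    return results
```

Because each engine receives only `crop.pixels`, an engine in this sketch never sees the whole document, which mirrors the privacy point made above.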

AUTO QC

Auto QC runs the quality control based on a complex scoring combination:

  • Common rules such as the format of IBAN Number, ID Card, Postal Code, Age, Gender, Date/Time, etc.
  • Business rules based on the client’s business domain
  • Data Field Relationships such as [age, gender, disease], [title, salary, business], [hospital, treatment, age, gender], etc.
  • Image Quality Analytics: clear/unclear, blurred, skewed, flipped, distorted, low resolution

The traditional quality control approach relies on different methodologies that involve human review. With Auto QC, the process is broken down into data levels and tracked by metadata at each step. Auto QC runs through 100 percent of the processed data and points out potential errors.

Using the score, Auto QC detects potential errors and controls the Straight-Through-Rate (STR): the system decides either to let the data go through or to transfer it to the data correction step for quality enhancement.
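A minimal sketch of how format rules and confidence scores might combine into a straight-through decision is shown below. The rule set, the field names, the 0.9 threshold, and the function names are assumptions for illustration, not the product's actual scoring. The IBAN check uses the standard mod-97 checksum.

```python
import re
from datetime import datetime

def iban_ok(value: str) -> bool:
    """Validate an IBAN with the standard mod-97 checksum."""
    s = re.sub(r"\s+", "", value).upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", s):
        return False
    rearranged = s[4:] + s[:4]  # move country code and check digits to the end
    digits = "".join(str(int(ch, 36)) for ch in rearranged)  # A=10 ... Z=35
    return int(digits) % 97 == 1

def date_ok(value: str) -> bool:
    """Accept only real calendar dates in YYYY-MM-DD form (assumed format)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False

# Hypothetical mapping from field name to its common format rule.
RULES = {"iban": iban_ok, "date": date_ok}

def route(fields, threshold=0.9):
    """fields maps name -> (text, confidence). Returns (decision, issues)."""
    issues = []
    for name, (text, confidence) in fields.items():
        rule = RULES.get(name)
        if rule is not None and not rule(text):
            issues.append(f"{name}: failed format rule")
        if confidence < threshold:
            issues.append(f"{name}: low confidence {confidence:.2f}")
    return ("correction" if issues else "straight_through"), issues
```

Raising `threshold` in this sketch sends more records to correction and lowers the straight-through rate; lowering it does the opposite.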

DIGI-XTRACT FEATURES

Automated data extraction from structured, semi-structured, and unstructured forms and documents.

Image Quality Enhancement at the preprocessing step.

API gateway integration

Manual data entry elimination

Web Monitoring Services for real-time tracking and automatic reporting functions

High performance and quality

High availability of back-end processing systems

PROCESS OF THE PRODUCT

STRAIGHT-THROUGH PROCESS (STP)/AUTOMATION PROCESS

GUARANTEED PROCESS/AUTOMATION PROCESS WITH HUMAN TOUCH

DIGI-XTRACT IN DIGITIZATION JOURNEY

ACCURACY RATE

Our accuracy rate is based on a confidence score that measures how certain the system is that the extracted data matches the original image. A higher accuracy rate, which depends on the quality of the assessed document, yields better data quality and supports analytical purposes.

With DIGI-XTRACT, intelligent engines support the accuracy rate to ensure the quality meets the client’s expectations.

The accuracy rate can be measured by various units such as character, word, field, and line.
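To make the different units concrete, the sketch below computes accuracy at character, word, and field level against a known reference. It assumes accuracy is defined as one minus the edit distance divided by the reference length, which is a common convention and not necessarily the formula DIGI-XTRACT uses; all function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def accuracy(ref, hyp):
    """1 - (edit distance / reference length), clamped to [0, 1]."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))

def char_accuracy(ref: str, hyp: str) -> float:
    return accuracy(ref, hyp)

def word_accuracy(ref: str, hyp: str) -> float:
    return accuracy(ref.split(), hyp.split())

def field_accuracy(ref_fields: dict, hyp_fields: dict) -> float:
    """Share of reference fields whose extracted value matches exactly."""
    if not ref_fields:
        return 1.0
    hits = sum(hyp_fields.get(k) == v for k, v in ref_fields.items())
    return hits / len(ref_fields)
```

A single misread character (for example the letter O in place of a zero) costs little at character level but fails the whole word and the whole field, which is why the same extraction can report quite different rates depending on the unit chosen.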

CLIENT SUPPORT

DIGI-XTRACT is supported and delivered by an excellent onboarding team partnered with our service management team.

All projects are monitored 24/7 by our Network Operating Center to ensure optimal service availability.  

DIGI-TEXX provides an end-to-end client experience from the first step of analysis to the final step of implementation and enhancement. On top of that, the service management team accompanies clients throughout the whole operation phase to ensure a smooth transition and successful delivery.

WHAT MAKES US DIFFERENT?

01

AUTOMATION WITH 24/7 MONITORING

Fully automated solution with no human intervention and a transparent process with Web Monitoring Services that provide data status for each step.

02

EASY INTEGRATION AND QUICK SETUP

Customized transfer methods (Secure Transfer Protocols, API, Email) are set up on demand to fit the client’s system. Setup takes 2-4 weeks.

03

FLEXIBLE PRICING MODELS

We offer various options allowing our clients to accommodate different segments and their specific requirements. Our flexible pricing model includes subscriptions, pay-as-you-go, and bundling.

CASE STUDIES

Data Extraction Solution for Customer Onboarding Straight-Through Process

BUSINESS CHALLENGES Our Client We serve a leading international insurance and financial services company with over 1.5 million customers operating in Asia, Canada, and the …

Intelligent Document Scanning Solution

The Intelligent Document Scanning Solution is designed for Customer Service (CS) at branches to process documents and detect appropriate document types,…

Straight-Through Process for Customer Onboarding

An automated solution that requires no manual intervention and drives operational efficiency.