DOCUMENT PROCESSING SOLUTIONS
We work with organizations to analyze their technologies and processes, then design and implement solutions that process and automatically verify documents, extract data to feed other systems, and prevent and detect fraud using innovative methodologies. These solutions translate into operational and financial gains and improve the quality and efficiency of organizational processes.
In this context, we develop intelligent document processing and verification solutions tailored to the specific needs of organizations in different business areas (e.g., Banking, Insurance, Utilities, Public Administration, and the private sector).
The complexity of our customers’ needs requires an approach that differs fundamentally from the one traditionally used by the IT industry.
Current solutions for mass scanning and document management still fall short of what large organizations expect: they lack robustness and depend heavily on human intervention for business processes to run successfully.
This offer of value-added services rests on three pillars: Machine Learning, the Print & Scan Channel, and Optical Document Processing (ODP).
We use Machine Learning (ML) in our document processing solutions. ML is an area of artificial intelligence that develops techniques allowing systems to learn from data, which can then be used to detect patterns and groups (clusters), make predictions, or classify samples.
For example, Machine Learning algorithms can be trained on historical credit data to detect patterns of fraudulent credit applications, and the resulting model can assist in the risk rating of each new application. Machine Learning can also be used to correct errors left by OCR on scanned documents.
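As a rough illustration of correcting OCR errors, the sketch below snaps each recognized token to the closest word in a small vocabulary. It is a simple string-similarity stand-in, not a trained model, and the vocabulary, function names, and cutoff value are assumptions made for the example.

```python
import difflib

# Hypothetical domain vocabulary; a real system would derive it from
# corrected documents or a field-specific lexicon.
VOCABULARY = ["invoice", "total", "amount", "customer", "address", "date"]

def correct_token(token, vocabulary=VOCABULARY, cutoff=0.75):
    """Return the closest vocabulary word, or the token unchanged if none is close."""
    matches = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token

def correct_line(line):
    """Correct every whitespace-separated token of an OCR output line."""
    return " ".join(correct_token(token) for token in line.split())

# Typical OCR confusions: '0' read instead of 'o', '1' instead of 'l'.
print(correct_line("inv0ice tota1 xyz"))
```

Tokens with no sufficiently similar vocabulary entry are left untouched, which keeps the sketch from "correcting" names or numbers it does not know.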
There are several Machine Learning algorithms, such as Support Vector Machines (SVM), neural networks, and logistic regression. They are generally classified into two broad categories. Supervised algorithms are trained on a set of samples together with the class to which each sample belongs, for example the data requested in credit applications, labeled as fraud or no fraud. Unsupervised algorithms work on data samples that are not labeled as belonging to any class.
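The difference between the two categories can be sketched on a toy data set. The example below is only illustrative: the feature values are made up, and a nearest-centroid classifier and a basic k-means loop stand in for the algorithms named above.

```python
# Made-up feature vectors: (requested amount, declared income), both in thousands.
samples = [(5, 40), (6, 45), (7, 50), (40, 10), (45, 12), (50, 9)]
labels = ["no fraud", "no fraud", "no fraud", "fraud", "fraud", "fraud"]

def mean(points):
    """Component-wise mean of a list of equally sized tuples."""
    return tuple(sum(p[i] for p in points) / len(points) for i in range(len(points[0])))

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Supervised: labels are available, so learn one centroid per class.
def fit_nearest_centroid(samples, labels):
    return {c: mean([s for s, l in zip(samples, labels) if l == c])
            for c in sorted(set(labels))}

def predict(centroids, sample):
    return min(centroids, key=lambda c: dist2(centroids[c], sample))

# Unsupervised: no labels, so group the samples with k-means.
def kmeans(samples, k=2, iterations=10):
    centroids = list(samples[:k])  # naive deterministic initialisation
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k), key=lambda i: dist2(centroids[i], s))
            clusters[nearest].append(s)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return [min(range(k), key=lambda i: dist2(centroids[i], s)) for s in samples]

centroids = fit_nearest_centroid(samples, labels)
print(predict(centroids, (48, 11)))  # class predicted for a new application
print(kmeans(samples))               # cluster index assigned to each sample
```

The supervised model returns a named class because it saw labels during training, while the unsupervised one can only return anonymous cluster indices that still have to be interpreted.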
Choosing a particular algorithm requires adequate knowledge and must take into account the objective of the problem to be solved, how it varies over time, and the computational power and volume of data available.
“Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed”
Arthur Samuel, 1959
PRINT & SCAN CHANNEL
Documents that are printed and then scanned undergo severe distortions that significantly hamper the implementation of automated document processing solutions.
This combination of printing and scanning is known as the Print & Scan Channel. It has been mathematically modeled using advanced signal processing techniques from communications, where the printing device is the transmitter, the paper is the transmission medium, and the scanner, tablet, or smartphone is the receiver.
The formulas that model the Print & Scan Channel describe a non-linear channel that varies in time and space and has a high level of noise. They mathematically relate an original document to its printed and scanned version.
It is precisely this characterization of the seemingly random behavior of the Print & Scan Channel that allows our team of engineers to offer differentiated added value compared to traditional document processing services: detecting signs of tampering in printed documents and improving the interpretation of the content of scanned documents.
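To give an intuition for what such a channel does to a document, the sketch below passes one row of pixels through a heavily simplified model: a blur, a non-linear tone response, and additive noise. This is not the model described above. The kernel, gamma value, and noise level are assumptions, and the blur here is the same at every position, whereas the real channel varies in time and space.

```python
import random

def print_scan(row, kernel=(0.25, 0.5, 0.25), gamma=2.2, noise_sigma=0.02, seed=0):
    """Simulate one row of pixel intensities (0.0 to 1.0) going through print and scan."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n, half = len(row), len(kernel) // 2
    out = []
    for i in range(n):
        # Blur: weighted average of neighbouring pixels (edges are clamped).
        blurred = sum(
            w * row[min(max(i + j - half, 0), n - 1)]
            for j, w in enumerate(kernel)
        )
        # Non-linear tone response, then additive sensor noise.
        value = blurred ** (1.0 / gamma) + rng.gauss(0.0, noise_sigma)
        out.append(min(max(value, 0.0), 1.0))  # keep the result in range
    return out

original = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
scanned = print_scan(original)
print([round(v, 2) for v in scanned])
```

Even this toy version shows why a printed and scanned document no longer matches its original pixel for pixel: sharp edges spread into their neighbours and every value is perturbed.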
OPTICAL DOCUMENT PROCESSING
Optical Document Processing (ODP) encompasses the technologies used to automatically process all types of digital documents, from invoices to Citizen Cards. It supports process automation in the document management workflows of organizations that must process large volumes of documents in bulk and meet deadlines. The main processes are:
- OCR on scanned documents.
- Automatic search of data in generic documents and automatic association of data to logical labels using Machine Learning techniques.
- Detection of anomalies in the morphological structure of documents, combining traditional image processing techniques with Machine Learning techniques.
- Detection of anomalies in poorly standardized documents, combining OCR and Machine Learning techniques.
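Detecting anomalies in the structure of a document can be sketched as comparing measured layout features against reference documents known to be genuine. The example below is a minimal statistical stand-in, assuming hypothetical features (logo position, margin, font height) and a z-score threshold; a production system would combine image processing with a trained model.

```python
import statistics

def fit_reference(feature_rows):
    """Per-feature mean and standard deviation from known-good documents."""
    columns = list(zip(*feature_rows))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

def z_score(value, mean, std):
    """Distance from the mean in standard deviations."""
    if std > 0:
        return abs(value - mean) / std
    # No variation in the reference set: any deviation is maximally unusual.
    return 0.0 if value == mean else float("inf")

def anomaly_score(reference, features):
    """Largest z-score across all features of one document."""
    return max(z_score(x, m, s) for x, (m, s) in zip(features, reference))

def is_anomalous(reference, features, threshold=3.0):
    return anomaly_score(reference, features) > threshold

# Hypothetical measurements: (logo_x_mm, margin_mm, font_height_px).
genuine = [(20.0, 15.1, 12.0), (20.2, 14.9, 12.1), (19.9, 15.0, 11.9), (20.1, 15.2, 12.0)]
reference = fit_reference(genuine)

print(is_anomalous(reference, (20.1, 15.0, 12.0)))  # close to the reference set
print(is_anomalous(reference, (20.0, 15.0, 13.5)))  # font height is far off
```

Using the largest single deviation means one clearly wrong feature is enough to flag a document, which suits screening where false negatives are costly.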
In this context, we provide services tailored to different document handling requirements, and we serve areas such as Public Administration, Finance, Banking, Insurance, Utilities, and Document Expertise. These services are described below.
The relationship between citizens and the Public Administration and its services is governed by the administrative procedure, which guarantees the principle of equality of all citizens before the Administration.
Public Administration is a vast and complex reality. Traditionally, it is understood in two senses: an organic sense and a material sense.
In the organic sense, public administration is the system of bodies, services, and agents of the State and other public entities that aim at the regular and continuous satisfaction of collective needs; in the material sense, public administration is the very activity carried out by those bodies, services, and agents. Considering its organic meaning, it is possible to distinguish in Public Administration three large groups of entities:
- Direct administration of the State.
- Indirect administration of the State.
- Autonomous Administration.
A fundamental part of this relationship is the ability to streamline and improve document processes, which is what our solutions are designed to do. Today more than ever, the need for efficiency and effectiveness in Public Administration processes requires:
- Cost reduction.
- Reduced waiting times in processes.
- Time savings in file processing.
- Progress in interoperability, information reuse, and e-government.
FINANCIAL | BANKING | INSURANCE | UTILITIES
Currently, despite the electronic channels available, a contractual relationship between customer and supplier still begins with information contained in physical paper documents.
The provision of services and the supply of consumer goods on credit, through a financial entity that extends credit to the customer, depend on the quality and authenticity of the documentation received.
To avoid incurring unnecessary costs, it is essential to assess the feasibility of the operation in real time. This requires detecting suspicious documents during admission, before they enter the risk analysis or authorization channel.
To respond to these specific requirements, we have developed solutions that automate these processes, acting from the moment information is received, detecting suspicious documents in real time, and validating identity documents and proof of address.
At the same time, the data required for the applications are automatically extracted and documents are digitized, avoiding recurring costs of manual data storage, error correction, visual verification of the documentation received, and losses due to fraud.
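The admission step described above can be sketched as extracting the required fields from OCR text and flagging documents where something is missing before they move on to risk analysis. The field names, text patterns, and the `screen` function below are hypothetical. Real documents would need template-specific rules or a trained extraction model.

```python
import re

# Hypothetical field patterns for an application form rendered as OCR text.
FIELD_PATTERNS = {
    "name": re.compile(r"Name:\s*(.+)"),
    "date_of_birth": re.compile(r"Date of birth:\s*(\d{4}-\d{2}-\d{2})"),
    "address": re.compile(r"Address:\s*(.+)"),
}

def extract_fields(ocr_text):
    """Return the fields that could be located in the OCR text."""
    fields = {}
    for label, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            fields[label] = match.group(1).strip()
    return fields

def screen(ocr_text):
    """Extract data and mark the document as suspicious if a required field is missing."""
    fields = extract_fields(ocr_text)
    missing = sorted(set(FIELD_PATTERNS) - set(fields))
    return {"fields": fields, "missing": missing, "suspicious": bool(missing)}

complete = "Name: Jane Doe\nDate of birth: 1990-05-17\nAddress: 1 Example Street"
incomplete = "Name: Jane Doe\nAddress: 1 Example Street"

print(screen(complete))
print(screen(incomplete))
```

A missing field is only one possible signal. The same gate could also combine results from tampering or anomaly checks before a document is forwarded.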
These processes draw on advanced Machine Learning, mathematical models of the distortions produced by printing and scanning documents (the Print & Scan Channel), and Optical Document Processing.