Businesses bogged down by manual, data-centric tasks fail to reach their full potential. Historically, companies with ineffective document processing automation have suffered from unproductive workflows and fallen short of their goals.
The emergence of cognitive document automation (CDA) has helped organizations create a seamless workflow through “Automated Capture”. Automated Capture allows us to manage documents and emails effectively, and best leverage information from within documents.
That said, businesses still have to face several challenges in document management because of limitations in CDA solutions. In this article, we will directly address these challenges and suggest tips on how to build a strong foundation for automated capture.
The source of an image directly affects its quality. While this may seem trivial, it affects the precision of classification and the accuracy of extraction. For example, a faxed document will have lower image quality than a PDF that was created digitally.
At the same time, scanners differ in scanning quality depending on the vendor and model.
Some image file types have better inherent quality than others. 300 dpi TIFFs are most common, but companies often can’t control the file type received from external sources. Images below 300 dpi, which is considered the ideal resolution, will have lower classification and extraction accuracy.
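As a rough illustration of the resolution point, the sketch below estimates the effective dpi of a scanned page from its pixel dimensions and physical page size, then flags pages that fall under the 300 dpi target. The function names and the sample page dimensions are hypothetical and only for illustration; a real CDA solution would typically read resolution from the image metadata instead.

```python
def effective_dpi(width_px: int, height_px: int,
                  width_in: float, height_in: float) -> float:
    """Return the lower of the horizontal and vertical pixels per inch."""
    return min(width_px / width_in, height_px / height_in)


def below_target(dpi: float, target_dpi: float = 300.0) -> bool:
    """True when the image is under the target resolution (300 dpi assumed)."""
    return dpi < target_dpi


# A US Letter page (8.5 x 11 in) scanned at 2550 x 3300 pixels.
letter_scan = effective_dpi(2550, 3300, 8.5, 11.0)

# A hypothetical low-resolution page of the same physical size.
low_res_page = effective_dpi(1728, 2200, 8.5, 11.0)

print(letter_scan, below_target(letter_scan))
print(low_res_page, below_target(low_res_page))
```

Pages flagged this way could be routed to extra image cleanup or to manual review rather than straight into automated extraction.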
The saying “garbage in, garbage out” also applies to CDA. Images faxed multiple times; mobile images with skew, tilt, blur, low-contrast backgrounds, or bad lighting; monochrome scans; documents with stamps, scribbles, and stains…all of these can affect classification and extraction accuracy. Images acquired by CDA solutions should be cleaned up with image processing before automated classification and extraction are applied, to ensure the maximum possible accuracy.
The number of samples and their similarity to the real world also impact accuracy. Generally speaking, the more samples that are “machine-learned” by the CDA solution, the better. The number of samples required ranges from a few to hundreds, depending on the type of document. Samples should reflect as closely as possible what will be seen in the “real world” during production processing.
Structured forms generally have the highest classification and extraction accuracy and require the fewest training samples. Nonetheless, form design has a significant impact on accuracy: the proximity of fields to each other, the use of field boxes versus letterboxes, and field shading (if any) all matter. If your organization has control over the form design, make sure it’s laid out for maximum automation potential.
Semi-structured documents (such as invoices, purchase orders, sales orders, and bills of lading) generally show lower accuracy than structured forms. Different CDA solutions have different approaches for locating the desired data, and some are more reliable than others at finding the data and extracting it successfully. These documents also tend to have embedded tables (e.g., invoice line items), multiple tables, or tables within tables that may have lower extraction accuracy rates than regular fields.
Unstructured documents such as emails (body), letters, and contracts are the most challenging to classify and extract automatically. AI-based technologies such as natural language processing (NLP) have improved extraction accuracy rates for these types of documents in recent years.
The type of print on the document also affects extraction accuracy rates. Generally, machine-printed fields have the highest accuracy rates, followed by hand-printed fields and then by cursive fields. For machine print, font type and character spacing also matter. Document language plays a role as well: the OCR engines used by CDA solutions exhibit varying accuracy depending on the language, with languages written in the Latin alphabet typically claiming the highest accuracy rates.
Barcode and checkbox fields typically show the highest extraction accuracy on a document. It’s not uncommon for CDA solutions to boast an accuracy percentage in the high 90s for extracting barcode values and checkbox/bubble values. However, there are dozens of barcodes in use, including 1D, 2D, and now 3D barcodes (2D with color), so make sure the CDA solution supports the most frequently encountered ones.
One of the primary reasons paper is still in use by many organizations is the requirement for a signature, and the paper signature must be captured, classified, and extracted. Moving to electronic signatures can remove the need for paper scanning, and this improves the productivity and capacity of your CDA users. Consider whether you simply need signature presence detection, or signature verification and fraud detection, as well.
A CDA solution’s classification and extraction accuracy rates can significantly improve through the use of databases. When extracted values are matched against similar content in a database, minor OCR errors can be corrected automatically. The result? Less human involvement to confirm or correct low-confidence OCR results. Database content can include customer names, account numbers, ERP data such as PO number or vendor name, word dictionaries specific to industries or languages, etc.
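One way to picture database matching is approximate string lookup. The sketch below uses Python’s standard `difflib.get_close_matches` to snap an OCR result onto a list of known vendor names. The vendor list, the `match_vendor` helper, and the 0.85 similarity cutoff are assumptions made for illustration, not the method of any particular CDA product.

```python
from difflib import get_close_matches
from typing import Optional, Sequence

# Hypothetical reference data, e.g. vendor names pulled from an ERP system.
KNOWN_VENDORS = ["Acme Corporation", "Apex Holdings", "Globex Inc"]


def match_vendor(ocr_text: str,
                 known: Sequence[str] = KNOWN_VENDORS,
                 cutoff: float = 0.85) -> Optional[str]:
    """Return the closest known vendor name, or None if nothing is close enough."""
    # Compare case-insensitively, but return the canonical spelling.
    by_key = {name.lower(): name for name in known}
    hits = get_close_matches(ocr_text.strip().lower(), list(by_key),
                             n=1, cutoff=cutoff)
    return by_key[hits[0]] if hits else None


# "Corporatlon" contains a typical OCR confusion between "i" and "l".
print(match_vendor("Acme Corporatlon"))
print(match_vendor("Zenith Ltd"))
```

A value with no close match returns `None`, which is the case that would still be sent to a person for confirmation. The cutoff is a trade-off: set it too low and wrong vendors are accepted silently, too high and small OCR slips are no longer absorbed.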
Rules can also be used to increase the extraction accuracy of a field. For example, checking that subtotal plus tax equals total is a simple rule that can flag errors, even after a human corrects one of those fields’ values. Formatting rules are also a simple way to ensure high field accuracy (e.g., a social security number should always have the format xxx-xx-xxxx, where each x is a digit from 0 to 9). Checking field value checksums also increases field extraction accuracy.
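The three kinds of rules described above can be sketched in a few lines. The function names are hypothetical, and the Luhn algorithm is used here only as one well-known example of a checksum (it is the check used on payment card numbers); the checksum that applies in practice depends on the field.

```python
import re
from decimal import Decimal

# xxx-xx-xxxx, where each x is an ASCII digit.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def totals_agree(subtotal: str, tax: str, total: str) -> bool:
    """Cross-field rule: subtotal plus tax must equal total."""
    # Decimal avoids the rounding surprises of binary floats with currency.
    return Decimal(subtotal) + Decimal(tax) == Decimal(total)


def ssn_format_ok(value: str) -> bool:
    """Formatting rule: the whole value must match xxx-xx-xxxx."""
    return SSN_PATTERN.fullmatch(value) is not None


def luhn_ok(number: str) -> bool:
    """Checksum rule (Luhn, chosen as an example): True if the check digit is valid."""
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(totals_agree("100.00", "8.25", "108.25"))
print(ssn_format_ok("123-45-6789"), ssn_format_ok("123-456-789"))
print(luhn_ok("79927398713"), luhn_ok("79927398710"))
```

A field that fails any of these checks can be flagged for review regardless of how confident the OCR engine was, which is what makes rules a useful second line of defense.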
CDA solutions aren’t complete without an easy way to send the documents and data to the systems, processes, and people who need them. User productivity decreases immensely if users must manually move document images and data from one system to another. Remember that an RPA robot can automate the process of moving and aggregating data between systems if an out-of-the-box connector for the destination system isn’t available.