Production capture encompasses a complex flow of processes that includes scanning but extends much further. In general, production capture includes six operations: document preparation, scanning, recognition, indexing and data validation, QC and rescanning, and release.
Our focus is not simply on technology but on solutions that meet our clients' specific needs for increased productivity, security, and cost savings. We work with our clients to develop complete document management strategies that integrate into their current environment efficiently and easily.
OCR output can also be delivered in two general forms: text-over and text-under. In text-over, the OCR data is placed over the image, so the text appears very clean but no longer looks like the original print. In text-under, the OCR data is placed underneath the image, at the same x-y coordinates as the original text. Text-under is used when the original look of the document must be preserved.
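The difference between the two forms comes down to layer order, which the following minimal sketch illustrates. It is not taken from any particular OCR product: the names `OcrWord`, `build_layers`, and `visible_layer` are hypothetical, and the page is modeled simply as a list of layers drawn bottom to top.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class OcrWord:
    """One recognized word and its position on the scanned page (hypothetical model)."""
    text: str
    x: int  # left edge, in image pixels
    y: int  # top edge, in image pixels


def build_layers(words: List[OcrWord], mode: str) -> List[Tuple[str, object]]:
    """Return the page layers in drawing order, bottom layer first."""
    # The text layer keeps each word at the x-y coordinates reported by OCR.
    text_layer = ("text", [(w.text, w.x, w.y) for w in words])
    image_layer = ("image", "scanned page bitmap")  # placeholder for pixel data

    if mode == "text-under":
        # Text sits beneath the image: the reader sees the original print.
        return [text_layer, image_layer]
    if mode == "text-over":
        # Text sits on top of the image: the reader sees the clean OCR text.
        return [image_layer, text_layer]
    raise ValueError(f"unknown mode: {mode!r}")


def visible_layer(layers: List[Tuple[str, object]]) -> str:
    """Name of the topmost layer, i.e. what the reader actually sees."""
    return layers[-1][0]


words = [OcrWord("Invoice", 120, 80), OcrWord("Total", 120, 640)]
under = build_layers(words, "text-under")
over = build_layers(words, "text-over")
```

In both modes the text layer carries the same words and coordinates; only the stacking order changes, which is why text-under preserves the look of the original while still carrying the recognized text.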
Image cleanup is also performed in the recognition step. Techniques include:
- Deskewing, despeckling, deshading, streak removal, and other basic cleanup functions
- Line removal and character reconstruction for use on forms
- Edge enhancement, which sharpens character edges to increase OCR accuracy
The purpose of image cleanup is to remove unwanted noise that can decrease the accuracy of automated recognition.
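As one illustration of the cleanup functions listed above, the sketch below shows a very simple form of despeckling. It assumes a bitonal page represented as a list of rows where 1 is a black pixel and 0 is white, and it treats any black pixel with no black neighbor as noise. The function name `despeckle` and this isolation rule are assumptions chosen for illustration; production capture software typically uses more sophisticated filters.

```python
from typing import List


def despeckle(image: List[List[int]]) -> List[List[int]]:
    """Return a copy of a bitonal image with isolated black pixels removed.

    A black pixel (1) counts as a speck when none of its up to eight
    surrounding pixels is black. This rule is an illustrative assumption.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]  # leave the input untouched

    for r in range(height):
        for c in range(width):
            if image[r][c] != 1:
                continue
            # Look at the 3x3 window around (r, c), clipped at the page edges.
            has_black_neighbor = any(
                image[rr][cc] == 1
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            )
            if not has_black_neighbor:
                cleaned[r][c] = 0  # erase the isolated speck

    return cleaned


page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # lone pixel: treated as noise
    [0, 0, 0, 1, 1],  # connected pixels: treated as part of a character
    [0, 0, 0, 1, 0],
]
cleaned_page = despeckle(page)
```

Here the lone pixel in the second row is erased, while the three connected pixels in the lower right are kept because each touches another black pixel.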
2018. Federal Agencies Digital Guidelines Initiative. Retrieved from http://www.digitizationguidelines.gov/