OCR is an acronym for Optical Character Recognition. The core of the OCR module is a ‘recognition’ process: the OCR engine software attempts to read every letter, number, and word on an image and write it out to various file formats. The results of OCR depend on multiple aspects of image quality. Paper quality, print type, font style, and print quality of the original document can affect image quality and thus the results of OCR. One of the most widely used file formats on the market today is the Adobe Portable Document Format (PDF).

The OCR module can rapidly convert scanned images and output them as PDF files, with the OCR results stored as hidden text in each PDF. The results of the full-text OCR module can be editable or non-editable files in a file format and directory of the user’s choice. Supported OCR Output Types include: PDF (Image Only, Image w/ Hidden Text, Normal), HTML 4, Lotus 1-2-3, Lotus Ami Pro, MS Excel 97-03, MS Word 97-03, MS RTF, Text (Plain, Comma Delimited, Formatted, Tab Delimited, Line Breaks), and WordPerfect 8/9/10. The OCR result files are stored in the batch folder, and their existence has no effect on the TIFF images. These files are just another group of files created when this step is added to a capture workflow.
