OCR Workflow Configuration
OCR Settings – General Tab
The OCR Workflow configuration options specify the precise behavior of the OCR module as it executes within the workflow of the chosen document type. The OCR module can also convert image files from TIFF to PDF (Image Only) format without performing OCR, in which case it behaves like a PDF maker.
Engine Options
Enable image pre-processing (Deskew, Despeckle, etc.)
If image processing has not been performed in earlier steps, enable this option to enhance image quality and potentially improve OCR accuracy.
Enable auto rotation
Attempts to rotate images to the correct orientation, which can improve OCR accuracy, if this has not already been done in earlier steps.
User Dictionary
Enabling this option allows the user to add custom words to a user dictionary. This may be helpful when performing OCR on specialized documents, such as medical documents.
Click “Setup” to add words.
Output Options
Output Type
Adobe PDF (Image Only)
Converts TIFF images to PDF without performing OCR.
Adobe PDF (Image with Hidden Text)
Performs OCR then stores the OCR text as hidden text within the PDF file.
Text
Performs OCR but outputs only the OCR result in a text file.
OCR File Tag
Enter a tag to associate with the OCR output file.
Output OCR as single page
Selecting this option outputs each image as a single-page PDF; otherwise the output is a multipage file.
Include Folder Separators in Output
A Folder Separator sheet may carry data that is important to the user during Quality Assurance or Index but should NOT appear in the output viewed by the end user. De-selecting this option removes the Folder Separator sheet in memory before the file is output.
Include Document Separators in Output
A Document Separator sheet may carry data that is important to the user during Quality Assurance or Index but should NOT appear in the output viewed by the end user. De-selecting this option removes the Document Separator sheet in memory before the file is output.
Do not output items marked with Skip flag
Any page/document/folder tagged with a Skip Flag will not be included in the output.
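The way the separator, Skip flag, and single-page options combine can be sketched as follows. This is only an illustration of the selection logic described above, not the OCR module's actual implementation; the `Page` class, its field names, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for one scanned sheet in a batch."""
    name: str
    kind: str = "image"  # "image", "folder_separator", or "document_separator"
    skip: bool = False   # True if the sheet carries a Skip flag


def select_output_pages(pages, include_folder_separators=True,
                        include_document_separators=True,
                        omit_skipped=False):
    """Return the pages that would remain in the output (assumed logic)."""
    kept = []
    for page in pages:
        if page.kind == "folder_separator" and not include_folder_separators:
            continue  # separator sheet dropped before output
        if page.kind == "document_separator" and not include_document_separators:
            continue
        if omit_skipped and page.skip:
            continue  # "Do not output items marked with Skip flag"
        kept.append(page)
    return kept


def group_output(pages, single_page=False):
    """Group pages into output files: one file per page, or one multipage file."""
    if single_page:
        return [[page] for page in pages]
    return [pages] if pages else []


batch = [
    Page("sep-folder", kind="folder_separator"),
    Page("page-1"),
    Page("page-2", skip=True),
    Page("sep-doc", kind="document_separator"),
    Page("page-3"),
]
kept = select_output_pages(batch, include_folder_separators=False,
                           include_document_separators=False,
                           omit_skipped=True)
files = group_output(kept, single_page=True)
```

With both separator options de-selected and the Skip flag option selected, only `page-1` and `page-3` remain, and the single-page option yields one output file for each of them.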
OCR Settings – PDF Tab
PDF File Options
PDF Conformance
- PDF 1.4
- PDF 1.5+
- PDF/A-1b
Compress PDF
Create Linearized PDF (Fast Web View)
Optimize file size by reducing image resolution
PDF Document Field Options
The standard PDF Document Fields are: Title, Subject, Author and Keywords. The user can select any System, Batch, Folder or Document index field to populate the desired PDF Document Fields inside the created PDF file.
Available PDF Document Fields:
- Title
- Subject
- Author
- Keywords
- Custom Fields
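As a rough illustration of how index fields could populate the standard PDF Document Fields, the sketch below maps each PDF field to the name of an index field and looks up its value. The function name, the mapping format, and the sample index field names are assumptions made for this example and are not part of the product.

```python
# Standard PDF Document Fields named in this section.
STANDARD_PDF_FIELDS = ("Title", "Subject", "Author", "Keywords")


def build_pdf_document_fields(mapping, index_values):
    """Resolve a {PDF field: index field name} mapping to {PDF field: value}.

    Both arguments are hypothetical structures used only for illustration.
    PDF fields whose index field is missing or empty are left out.
    """
    fields = {}
    for pdf_field, index_field in mapping.items():
        if pdf_field not in STANDARD_PDF_FIELDS:
            raise ValueError(f"Not a standard PDF Document Field: {pdf_field}")
        value = index_values.get(index_field)
        if value is not None and value != "":
            fields[pdf_field] = str(value)
    return fields


# Sample index values from System, Batch, Folder, or Document level (made up).
index_values = {"Batch Name": "Invoices-0417", "Operator": "jdoe", "Notes": ""}
mapping = {"Title": "Batch Name", "Author": "Operator", "Subject": "Notes"}
pdf_fields = build_pdf_document_fields(mapping, index_values)
```

Here `Title` and `Author` receive values, while `Subject` is omitted because its index field is empty.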
Enable PDF Custom fields
Selecting Setup opens the PDF Custom Fields setup screen.
Move To
This option selects the specified row within the PDF Custom Fields setup and is useful when working with a large number of data fields.
Select the check box in the Include column to place both the Field Name of that row and its corresponding data value in the Custom tab of the PDF.
The included custom fields are visible in the Document Properties within Adobe Reader.
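The effect of the Include column can be pictured with a small sketch: only rows whose check box is selected contribute a name and value pair to the PDF's custom properties. The row layout and the function name are hypothetical and serve only to illustrate the behavior described above.

```python
def collect_custom_fields(rows):
    """Build the custom properties from setup rows.

    Each row is a hypothetical (field_name, value, include) tuple, where
    include mirrors the check box in the Include column.
    """
    return {name: str(value) for name, value, include in rows if include}


rows = [
    ("Batch Name", "Invoices-0417", True),
    ("Operator", "jdoe", False),   # not included, so absent from the PDF
    ("Page Count", 12, True),
]
custom_fields = collect_custom_fields(rows)
```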
OCR Settings – Text Tab
This tab becomes active when the selected output type is set to Text.