Smart Zone Configuration

Overview

A Smart Zone uses OCR to find information relative to a common anchor point across documents of varying format. For example, you may have a mixture of invoices that all carry the label “INVOICE NUMBER”, but the placement of the invoice number itself varies from form to form. The number could be next to or underneath the label, and the label may even sit in a different location on each document (left side vs. right side). Instead of defining multiple zones or multiple zone definition profiles, you can use one Smart Zone. It finds the common anchor point (here, the label “INVOICE NUMBER”) and then searches predefined areas around the anchor for the value you need (here, the invoice number itself).

This article covers configuration of a basic Smart Zone. It also covers Grouped Smart Zones (which can populate multiple index fields) and Multi-Record Smart Zones (which are useful for extracting oddly formatted tables).

Basic Smart Zone

In this example, we’ll create a Smart Zone to locate the invoice number and store it in an index field.

  1. Create a new document type in PSIcapture. In the Index Data Fields configuration, create an index field called Invoice Number. Then in the Zone tab, press the Define Zones button:



  2. Press the Select Template Image button to load a document into the viewer. To draw a Smart Zone on the image, press the Draw Smart Zone button.



  3. Draw a Smart Zone over a large area of the image where you’re sure to find the Invoice Number label. This will display the Smart Zone Configuration screen:



  4. Press the anchor expression Add button to create a Smart Zone Anchor Expression. Enter a regular expression to locate the words “INVOICE NUMBER” on the document page. You can also use the built-in regular expression builder, which lets you highlight one or more words and automatically builds a regular expression to match them. Press the Locate button to locate the anchor on your page:



  5. In this example, we have two layouts for the invoice number: one is directly below the anchor, while the other is to the right side of the anchor. Define Child Zones for each of these areas by pressing the child zone Add button:



  6. Press the Save icon to save your changes and return to the Zone Configuration screen:



  7. Supply a name for your new Smart Zone (InvoiceNumberZone), then press the Save icon:



  8. Select OCR as the Zone Definition Action and assign the zone to your index field using the Zone drop-box. Also, check the Don’t run Zone OCR if field is already populated box to stop processing the child zones once a value has been found.
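The anchor-and-child-zone behavior configured in the steps above can be sketched in a few lines of Python. This is an illustration only, not PSIcapture's implementation: the pattern, the function names, and the "first non-empty child zone wins" rule are assumptions used to show the idea, and Python's `re` module stands in for whatever regular expression engine the product uses.

```python
import re

# Assumed anchor expression: "INVOICE NUMBER" with flexible whitespace,
# matched case-insensitively. Your own expression may differ.
ANCHOR = re.compile(r"INVOICE\s+NUMBER", re.IGNORECASE)


def first_populated(child_zone_texts):
    """Return the first non-empty child zone text, or None.

    Mimics the effect of "Don't run Zone OCR if field is already
    populated": later child zones are ignored once a value is found.
    """
    for text in child_zone_texts:
        value = text.strip()
        if value:
            return value
    return None


def extract_invoice_number(page_text, child_zone_texts):
    """Hypothetical helper: read child zones only if the anchor is present."""
    if ANCHOR.search(page_text) is None:
        return None
    return first_populated(child_zone_texts)
```

For a page where the value sits under the label, a call such as `extract_invoice_number("Invoice  Number\n12345", ["", "12345"])` skips the empty "beside" child zone and takes the value from the "below" child zone.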

When you run a batch, the Smart Zone picks up the invoice number from either of the two child positions you defined.

Beside the label:

 

Under the label:


Grouped Smart Zones

The basic smart zone can populate only one index field. However, you will often want to capture several values, each into its own index field, that all share a common anchor point. In these cases, use a grouped smart zone. A grouped smart zone associates a group name with each child zone. You can then assign individual child zone groups directly to an index field.

In the example below, we have one anchor that looks for the text “Prepared by”. This anchor has five child zones. Two of the zones are for address information, which can appear in two places based on the document’s format. The remaining three zones are for the phone, FAX and email information.

On the Index Data Fields configuration screen, the user can now select each of the individual child zone groups for assignment to an index field:

In the example above, the address will populate from one of the two address child zones. Phone will receive its value from the PreparedByZone.Phone child zone, fax from PreparedByZone.Fax, and email from PreparedByZone.Email.
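The group-to-field mapping described above can be pictured with a small Python sketch. Everything here is illustrative: the sample values, the `PreparedByZone.Address` group name, and the rule that the first non-empty child zone in a group supplies the value are assumptions, not documented PSIcapture behavior.

```python
# Hypothetical OCR results, one entry per child zone, as (group, text).
# Two child zones share the Address group; one of them found nothing.
child_zone_results = [
    ("PreparedByZone.Address", ""),
    ("PreparedByZone.Address", "12 Main St, Springfield"),
    ("PreparedByZone.Phone", "555-0100"),
    ("PreparedByZone.Fax", "555-0101"),
    ("PreparedByZone.Email", "jane@example.com"),
]


def resolve_groups(results):
    """Collapse child zone results into one value per group.

    Assumption: the first non-empty text seen for a group wins.
    """
    fields = {}
    for group, text in results:
        value = text.strip()
        if value and group not in fields:
            fields[group] = value
    return fields


index_fields = resolve_groups(child_zone_results)
```

Here `index_fields` ends up with four entries, one per group, and the address comes from whichever of the two address child zones returned text.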


Multi-Record Smart Zones

The multi-record smart zone is used for table extraction when a table is oddly formatted. For example, if a page is made up of multiple tables, it can be difficult to configure standard multi-record zones that extract the line items correctly. Instead, create a single smart zone that covers just one column, enter a regular expression to detect that column’s values, and then set up grouped child zones to extract each line item’s details.

In the example below, the smart zone anchors on the quantity column, detecting single numeric values. Whenever one is found, the child zones gather the data from the line item:

To generate the multi-record smart zone, you must check the box labeled Create a record for each Smart Zone found on the page, just above the child zone area.
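As a rough sketch of the idea, the Python below creates one record for every quantity cell that matches an anchor expression. The pattern `^\d+$` (a cell holding a single whole number), the row layout, and the field names are assumptions chosen for illustration; they are not taken from the product.

```python
import re

# Assumed anchor expression for the quantity column: one whole number.
QUANTITY = re.compile(r"^\d+$")


def build_records(rows):
    """Return one record per row whose quantity cell matches the anchor.

    rows: list of dicts with hypothetical keys 'quantity',
    'description', and 'amount' holding OCR text.
    """
    records = []
    for row in rows:
        quantity = row["quantity"].strip()
        if QUANTITY.match(quantity):
            records.append({
                "quantity": int(quantity),
                "description": row["description"].strip(),
                "amount": row["amount"].strip(),
            })
    return records


rows = [
    {"quantity": "Qty", "description": "Item", "amount": "Total"},
    {"quantity": "2", "description": "Widget", "amount": "10.00"},
    {"quantity": "", "description": "Subtotal", "amount": "10.00"},
    {"quantity": " 10 ", "description": "Bolt", "amount": "3.50"},
]
records = build_records(rows)
```

The header row and the subtotal row produce no record because their quantity cells do not match the anchor, which is how a single-column anchor can skip the non-item lines of an irregular table.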

After saving the zone configuration, the child zones can be selected and assigned to each individual index field: