Select OCR Settings from a Drill-through Search of OCR Candidates

Drill-through of entry in OCR Candidates Report from a Data Set Report or All Imports Report > OCR

Requires Imports - Add/Edit Permissions

If you have the appropriate permissions, you can run OCR processing on OCR candidates after import. To do so, perform a drill-through search from the appropriate entry in the OCR Candidates Report for a Data Set or for all Imports. Then use the OCR option, available from the drill-through results in the Navigation Tree or the document list, to specify the OCR settings for the operation, including language, accuracy, and timeout settings.

By default, OCR processing automatically detects all languages that fall under the General category. This includes all Latin languages and Chinese, Japanese, and Korean (CJK) languages. The OCR software does not automatically detect Arabic, Cyrillic, Greek, Hebrew, or Thai. To have an OCR operation identify and process one of these, you must select the corresponding language type for that operation. You can also select English if you want to run English-only OCR.
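The selection rule described above can be summarized in a short sketch. This is only an illustration of the rule as stated in this topic; the function name `language_setting_for` and the script-family keys are hypothetical and are not part of the product or any API it exposes.

```python
# Hypothetical helper: names and keys are assumptions made for illustration.
# Script families that the General setting detects automatically.
AUTO_DETECTED = {"latin", "cjk"}

# Script families that must be selected explicitly for an OCR operation.
EXPLICIT = {
    "arabic": "Arabic",
    "cyrillic": "Cyrillic",
    "greek": "Greek",
    "hebrew": "Hebrew",
    "thai": "Thai",
}


def language_setting_for(script: str) -> str:
    """Return the Language Selection value to use for a script family."""
    key = script.strip().lower()
    if key in AUTO_DETECTED:
        return "General"
    if key in EXPLICIT:
        return EXPLICIT[key]
    raise ValueError(f"Unknown script family: {script!r}")
```

For example, `language_setting_for("CJK")` returns `"General"`, while `language_setting_for("Hebrew")` returns `"Hebrew"`, reflecting that Hebrew is not detected automatically.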

Note: OCR processing and reprocessing are not permitted if any of the documents are from a Shared (public in the Organization) Data Set. Once a Data Set is Shared, it is owned by the Organization.

OCR Settings

  • Language Selection — Enables you to use a drop-down menu to select one of the following for a given OCR operation:
    • General (the default), which accommodates all detectable languages, including all Latin languages and Chinese, Japanese, and Korean (CJK) languages. See the OCR Language Support section of How to Perform OCR Processing for a list of the languages in the General category.
    • Arabic
    • Cyrillic (which includes Bulgarian, Macedonian, non-Latin Serbian, Russian, and Ukrainian)
    • English (for English-only OCR)
    • Greek
    • Hebrew
    • Thai
  • Accuracy — Enables you to set the level of accuracy for OCR processing to either Medium (the default) or High.
  • Page Timeout — Sets the OCR page timeout. The default is 120 seconds. You can edit this value, but it must be greater than zero. Zero and negative values are not valid.
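As a rough model of the settings above, the following sketch captures the defaults and the validation rules stated in this topic. The class name `OcrSettings`, its field names, and the assumption that the timeout is a whole number of seconds are hypothetical; the product itself exposes these settings through the OCR Settings dialog, not through this code.

```python
from dataclasses import dataclass

# Allowed values as listed in this topic.
LANGUAGES = ("General", "Arabic", "Cyrillic", "English", "Greek", "Hebrew", "Thai")
ACCURACY_LEVELS = ("Medium", "High")


@dataclass(frozen=True)
class OcrSettings:
    """Hypothetical model of the OCR Settings dialog (illustration only)."""

    language: str = "General"        # default language selection
    accuracy: str = "Medium"         # default accuracy level
    page_timeout_seconds: int = 120  # default page timeout (assumed integer)

    def __post_init__(self) -> None:
        if self.language not in LANGUAGES:
            raise ValueError(f"Unsupported language: {self.language!r}")
        if self.accuracy not in ACCURACY_LEVELS:
            raise ValueError(f"Unsupported accuracy: {self.accuracy!r}")
        # Zero and negative timeouts are not valid.
        if self.page_timeout_seconds <= 0:
            raise ValueError("Page timeout must be greater than zero")
```

Calling `OcrSettings()` yields the documented defaults, and `OcrSettings(language="Cyrillic", accuracy="High", page_timeout_seconds=300)` models a Cyrillic, high-accuracy run with a longer timeout. A timeout of `0` or `-5` raises `ValueError`.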

Actions

Select one of the following actions for the OCR Settings:

  • Run OCR — Click this button to request OCR processing. You can then go to the Work Basket and view the related tasks: first a task that validates that OCR processing can be done, followed by a running OCR task that enables you to track progress. Note that canceling an in-progress OCR task from the Work Basket (or from read/write Job Management, for Administrators with System Administration permissions) does not preserve any OCR text already generated. Therefore, you should evaluate the current progress before you cancel the task.
  • Cancel — Click this button to cancel the OCR processing operation.