Extract the text from an image by using the OCR (Optical Character Recognition) Text component in RPA Desktop Design Studio.

Before you begin

Role required: none

About this task

Important:

Starting with the Yokohama release, RPA Desktop Design Studio uses the latest version of the Tesseract OCR engine. This update improves image pre-processing and includes performance optimizations. When you update older automations that include the OCR Text component, you may notice slight differences in the output. Therefore, validate your automations after the update.
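
One way to approach that validation is to compare the text an automation produced before the update against the text it produces afterward. The following is a minimal sketch in Python, not part of RPA Desktop Design Studio; the function names and the 0.95 threshold are illustrative assumptions, and it uses only the standard-library difflib module.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse whitespace and case so cosmetic differences are ignored."""
    return " ".join(text.split()).lower()


def similarity(baseline: str, current: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two OCR outputs."""
    return SequenceMatcher(None, normalize(baseline), normalize(current)).ratio()


def needs_review(baseline: str, current: str, threshold: float = 0.95) -> bool:
    """Flag an automation for manual review when the outputs diverge too much.

    The threshold is an assumed value; tune it to your own tolerance.
    """
    return similarity(baseline, current) < threshold
```

For example, `needs_review(old_text, new_text)` could be run over a set of saved baseline outputs, where `old_text` and `new_text` are strings you captured yourself before and after the update.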

The OCR Text component shares many properties with other Actions (UI) components. To configure these properties, see Properties of Actions (UI) components.

The properties unique to the OCR Text component are listed in the following table.
Table 1. OCR Text component properties
Property Description
Image Source Source from which the component takes the image.

Procedure

  1. In the Toolbox pane, navigate to Actions (UI) > OCR Text.
  2. Drag the OCR Text component to the Design surface.
  3. (Optional) To configure the settings, click the component settings icon.
    The component has default settings that you may review and use.
  4. (Optional) Configure the settings as described in the following table.
  5. To close the OCR Settings window, click OK.
  6. To configure the input, see Configure port properties.
  7. To configure the output, see Configure output port properties.
  8. (Optional) Connect the ports as described in the following table.
    Port Type Port name Data type Purpose Notes
    Data In Image/File Path Bitmap/String Takes the image or the path to the image. The input depends on the option selected for the Image Source property:
    • Port: The data type is Bitmap.
    • File Path: The data type is String.
    Data Out Text String Returns the extracted text from the image.
    Data Out Confidence Single Returns the confidence (accuracy) figure for the extracted text.
  9. To test the component, right-click the component bar and then click Run From Here.
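
The port contract described in the table above can be sketched in Python to make the data types explicit. This is an illustrative model only, not an RPA Desktop Design Studio API: the names `OcrResult`, `check_input`, and `is_trusted` are hypothetical, raw bytes stand in for the Bitmap data type, and the scale of the Confidence value is not stated in this topic, so the caller supplies the threshold.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical constants for the two Image Source options in the port table.
PORT = "Port"
FILE_PATH = "File Path"


@dataclass
class OcrResult:
    text: str          # value on the Text port (String)
    confidence: float  # value on the Confidence port (Single)


def check_input(image_source: str, value: Union[bytes, str]) -> None:
    """Raise if the value does not match the selected Image Source option."""
    if image_source == PORT:
        # Assumption: raw bytes stand in for the Bitmap data type.
        if not isinstance(value, (bytes, bytearray)):
            raise TypeError("Image Source 'Port' expects bitmap data")
    elif image_source == FILE_PATH:
        if not isinstance(value, str):
            raise TypeError("Image Source 'File Path' expects a path string")
    else:
        raise ValueError(f"unknown Image Source option: {image_source!r}")


def is_trusted(result: OcrResult, min_confidence: float) -> bool:
    """Accept the text only when confidence meets the caller's threshold."""
    return result.confidence >= min_confidence
```

A downstream step could call `is_trusted(result, min_confidence)` to decide whether to use the extracted text or route the image for manual review.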

Example: Extract text from images and display with the Show component

Extract text from an image and display it with the Show component.

The OCR Text component takes a path to an image that contains the text "servicenow". The component extracts the text from the image and passes the resulting string to the Show component. The Show component receives the text through its Message Data In port and then displays the text in a window. For more information, see Use the Show component.