
How do I edit document text after OCR has been performed?


I cannot edit the text after I have performed OCR on a document.


V8 and up

With the release of Version 8 of the PDF-XChange product line, we have included a new OCR plugin that can perform this process for you automatically. This plugin requires its own additional license coverage, which must be bundled with your existing PDF-XChange Editor, Plus, Tools, or PRO license. If you hold a V7 license that has yet to expire, you are entitled to use the new OCR plugin for free until the day your maintenance expires or you purchase a license upgrade.

V7 and prior

Please note that in version 7 and prior, the main purpose of the OCR feature in PDF-XChange Editor was to make the text of scanned/image-based input documents searchable and selectable. When OCR is performed, PDF-XChange Editor identifies text-based content in input documents, then creates an invisible duplicate of it and inserts the duplicate over the original. This has the effect of making image-based content searchable and selectable. The process detailed below can be used to edit the invisible duplicate layer of OCR output, but because editing is not one of the main purposes of the OCR feature, it may be problematic in some cases.
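The layering described above can be pictured with a small Python model. This is only an illustrative sketch: the names PageItem, run_ocr, search and rendered are hypothetical and are not part of PDF-XChange Editor or any of its interfaces. It shows why a page is searchable after OCR even though it looks unchanged.

```python
from dataclasses import dataclass


@dataclass
class PageItem:
    kind: str       # "image" or "text" (hypothetical model, not a real API)
    content: str    # file name for an image, recognized string for text
    visible: bool   # False for the invisible OCR duplicate


def run_ocr(page, recognized_text):
    """Return a new page with an invisible text item placed over the scan."""
    return page + [PageItem("text", recognized_text, visible=False)]


def search(page, term):
    """Searching considers every text item, whether it is visible or not."""
    return any(
        item.kind == "text" and term.lower() in item.content.lower()
        for item in page
    )


def rendered(page):
    """What the reader sees: only visible items are drawn."""
    return [item.content for item in page if item.visible]


scanned_page = [PageItem("image", "scan_page_1.png", visible=True)]
ocr_page = run_ocr(scanned_page, "Invoice total: 42.00")
```

In this model, searching scanned_page finds nothing, while searching ocr_page finds the recognized text, and rendered(ocr_page) still contains only the scanned image.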



In V8, if your license covers the new OCR plugin, it is enabled by default and noted by an (enhanced) indicator in the title of the OCR dialog. Simply select either the Editable Text and images option or the Fine page content option, then click OK:

If the (enhanced) indicator is not present, you do not have the OCR plugin enabled. To enable this, you will need to ensure that:

1. Your license covers the OCR plugin. If it does not, the option in step 2 will not appear.

2. The OCR options under Preferences in both the Editor and Tools are set to "Enhanced".


In V7 and prior, follow the steps below to edit the invisible duplicate layer of OCR output:

1. Open the document in PDF-XChange Editor.

2. Select the View ribbon tab, then click the Panes dropdown menu and click Content:

The Content pane will open.

3. Click the dropdown arrow of the page that contains the duplicate layer to be edited, then group-select the text-based content in the Content pane. The duplicate layer will highlight automatically in the document:

4. Select the Format ribbon tab, then click the Fill Color dropdown menu and select the desired color for the duplicate layer. This example will use red:

5. Return to the Content pane and delete the image-based content of the page. (This is the content on which the OCR operation was performed.) Right-click the images, then click Delete in the dropdown menu:

The process is then complete, and the text can be edited as normal:

Note that enabling the Highlight Text Blocks option in the Edit dropdown menu may assist in the editing of text:

Additionally, note that documents can be exported to and edited in alternative formats such as MS Word when the process detailed above has been performed.
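For readers who find it easier to follow the manual steps as logic, here is a minimal Python sketch of what steps 3 to 5 achieve. It assumes a simplified model of the Content pane: ContentItem and make_ocr_text_editable are hypothetical names used for illustration only, and the code does not call PDF-XChange Editor.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class ContentItem:
    kind: str                         # "image" or "text" (hypothetical model)
    content: str
    fill_color: Optional[str] = None  # None models the invisible OCR text


def make_ocr_text_editable(items: List[ContentItem], color: str = "red") -> List[ContentItem]:
    # Steps 3 and 4: select every text item and give it a visible fill color.
    recolored = [
        replace(item, fill_color=color) if item.kind == "text" else item
        for item in items
    ]
    # Step 5: delete the image-based content that the OCR was performed on.
    return [item for item in recolored if item.kind != "image"]


page = [
    ContentItem("image", "scan_page_1.png"),
    ContentItem("text", "Hello"),
    ContentItem("text", "World"),
]
editable_page = make_ocr_text_editable(page)
```

After the call, editable_page holds only the two text items, each with a red fill, which corresponds to the state in which the text can be edited as normal. The original page list is left unchanged.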
