TIP 139: Creating Editable Text from an Image PDF

A scanned or image PDF is only an image of a page, and you can't manipulate its content by extracting images or modifying the text. However, Acrobat can use optical character recognition (OCR) to convert the image of the document into actual text or to add a text layer to the document. When the OCR process is complete, be sure to evaluate the captured document to make sure Acrobat interpreted the content correctly. The bitmap of the letter I is easily mistaken for the number 1, for example.
Converting a bitmap of letters and numbers into actual letters and numbers may result in items that can't be definitively identified, known as suspects. First take a quick look at the job ahead. Choose Document > Recognize Text Using OCR > Find All OCR Suspects. All content on the page that needs confirmation is outlined with red boxes (Figure 139c). The sample document was captured using the Formatted Text & Graphics option.

Figure 139c. Show all the capture suspects to evaluate the conversion.
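
The idea of a suspect can be illustrated with a minimal sketch. It is not Acrobat's implementation, and it assumes a hypothetical OCR engine that reports a confidence score between 0.0 and 1.0 for each recognized word. The names `Word`, `find_suspects`, `ambiguous_glyphs`, the 0.80 threshold, and the list of confusable characters are all illustrative assumptions.

```python
from dataclasses import dataclass

# Characters that are often confused when recognized from a bitmap.
# This list is illustrative only, not taken from any OCR product.
CONFUSABLE = set("I1lO0")


@dataclass
class Word:
    text: str
    confidence: float  # 0.0 to 1.0, from a hypothetical OCR engine


def find_suspects(words, threshold=0.80):
    """Return the words whose confidence is below the threshold."""
    return [w for w in words if w.confidence < threshold]


def ambiguous_glyphs(text):
    """List the easily confused characters in a word, in order."""
    return [ch for ch in text if ch in CONFUSABLE]


if __name__ == "__main__":
    page = [Word("Invoice", 0.97), Word("1O4", 0.55), Word("total", 0.91)]
    for word in find_suspects(page):
        # A reviewer would confirm or correct each of these by hand.
        print(word.text, ambiguous_glyphs(word.text))
```

A lower threshold produces fewer suspects to review but lets more misread words through unchecked, so the value is a trade-off between review effort and accuracy.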
Figure 139d. Confirm or modify suspect entries in this dialog.

Figure 139e. Depending on the characteristics of the document and the conversion settings you choose, the results can be dreadful.