Adobe Acrobat - Applying OCR to Scanned Documents

This article explains how to use Adobe Acrobat to apply Optical Character Recognition (OCR) to scanned documents, making them searchable and editable.

Understanding OCR in Adobe Acrobat

Scanned documents are often images, making them unsearchable and difficult to edit. Optical Character Recognition (OCR) is a process that analyzes the image of text and converts it into actual text characters. This allows you to search for specific words, copy text, and even edit the content of the document directly within Adobe Acrobat.
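
One way to see why an image-only scan cannot be searched is to look at how much text each page actually exposes. The sketch below is illustrative only and does not call Acrobat. The function name `pages_needing_ocr` and the `min_chars` threshold are hypothetical, and it assumes you have already obtained the extracted text of each page as a string from a PDF library of your choice.

```python
from typing import List


def pages_needing_ocr(page_texts: List[str], min_chars: int = 1) -> List[int]:
    """Return 1-based page numbers whose extracted text is effectively empty.

    page_texts holds whatever text a PDF library extracted from each page.
    An image-only scan has no text layer, so its pages yield empty strings.
    min_chars is a hypothetical threshold, not an Acrobat setting.
    """
    flagged = []
    for number, text in enumerate(page_texts, start=1):
        # Ignore whitespace so a page of stray spaces still counts as empty.
        if len(text.strip()) < min_chars:
            flagged.append(number)
    return flagged


if __name__ == "__main__":
    # Page 1 has real text; pages 2 and 3 look like image-only scans.
    sample = ["Invoice 2025-03", "", "   \n"]
    print(pages_needing_ocr(sample))
```

Pages reported by a check like this are the ones that would benefit from Recognize Text.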

Steps to Apply OCR in Adobe Acrobat

  1. Open the scanned document in Adobe Acrobat.
  2. Initiate OCR. The menu path depends on your Acrobat version:
    • Method 1 (Acrobat Pro DC and Standard DC):
      • Navigate to All Tools > Scan & OCR > Recognize Text.
      • You may see two options: "In This File" or "In Multiple Files." Select "In This File" for a single document.
      • If Acrobat detects that the document already contains recognized text, it may ask whether you want to "Re-Recognize Text." Click "Yes" to proceed.
    • Method 2 (Older Acrobat Versions):
      • Go to Tools > Text Recognition > Recognize Text.
      • Again, choose "In This File" if prompted.
    • Method 3 (If Acrobat Prompts Automatically):
      • Sometimes, upon opening a scanned document, Acrobat will automatically detect that it needs OCR and prompt you with a message at the top to "Recognize Text." Simply click that prompt.
  3. Choose OCR Settings (optional but recommended; may not be available in every version of Acrobat):
    • In the "Recognize Text" panel, click on the "Settings" button to open the "Recognize Text Settings" dialog box.
    • Primary Language: Select the language of the text in your document. This is crucial for accurate OCR.
    • PDF Output Style: Choose the desired output style:
      • Searchable Image: Creates a searchable PDF with the original image intact and a layer of text added for searching. This option preserves the original appearance of the document.
      • Searchable Image (Exact): Similar to "Searchable Image," but aims for a higher level of precision in maintaining the original appearance. It might be slower.
      • Editable Text & Images: Creates a fully editable PDF. This option may slightly alter the original appearance to make the text editable. Use this if you plan to extensively edit the document's text.
    • Downsample Images: You can choose to downsample the images to reduce file size. This can be useful for large scanned documents.
    • Click "OK" to save the settings.
  4. Run OCR: Click the "Recognize Text" button to begin the OCR process. Acrobat will analyze the document page by page.
  5. Review and Correct (Important):
    • Even with accurate OCR, errors can occur, especially with low-quality scans or unusual fonts.
    • Carefully review the document after OCR. Use the "Find" feature (Ctrl+F on Windows or Cmd+F on macOS) to spot-check words or phrases you know appear in the document.
    • If you find errors, you can edit the text directly in Acrobat (if you chose "Editable Text & Images").
    • For more complex corrections, consider the Edit PDF tool (Tools > Edit PDF; some older Acrobat versions label this feature the "TouchUp Text" tool, and the exact menu path varies by version). This tool allows you to adjust the position and appearance of individual text elements.
  6. Save the document as a PDF. It's a good practice to save it with a new name to preserve the original scanned file.
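
If you process many scans, the "save under a new name" advice in step 6 can be sketched as a small helper that suggests an unused file name next to the original. This is a minimal sketch, not part of Acrobat: the function name `ocr_copy_path` and the `_ocr` suffix are hypothetical choices made for illustration.

```python
from pathlib import Path


def ocr_copy_path(original: Path, suffix: str = "_ocr") -> Path:
    """Suggest a file name beside the original that is not already in use.

    The "_ocr" suffix is an illustrative convention, not an Acrobat default.
    """
    candidate = original.with_name(f"{original.stem}{suffix}{original.suffix}")
    counter = 2
    # Add a counter rather than overwrite an earlier OCR'd copy.
    while candidate.exists():
        candidate = original.with_name(
            f"{original.stem}{suffix}_{counter}{original.suffix}"
        )
        counter += 1
    return candidate


if __name__ == "__main__":
    # Prints a name such as scan_ocr.pdf; the original file is never touched.
    print(ocr_copy_path(Path("scan.pdf")))
```

You would then type the suggested name into Acrobat's Save As dialog so the original scan stays unchanged.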

Tips for Better OCR Results

  • High-Quality Scans: Use a scanner with a high resolution (300 DPI or higher) for better accuracy.
  • Clean Documents: Ensure the original documents are clean and free of smudges or creases.
  • Straighten Images: If the scanned image is skewed, straighten it before running OCR. Acrobat has tools for rotating and correcting perspective.
  • Choose the Correct Language: Selecting the correct language significantly improves OCR accuracy.
  • Proofread Carefully: Always proofread the document after OCR to catch and correct any errors.
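
The 300 DPI guideline above can be checked with simple arithmetic when you know a scan's pixel dimensions and the paper size. The sketch below is a rough check under those assumptions; the function names and the example page sizes are illustrative and are not Acrobat features.

```python
def effective_dpi(pixel_width: int, pixel_height: int,
                  paper_width_in: float, paper_height_in: float) -> float:
    """Return the lower of the horizontal and vertical dots per inch."""
    if paper_width_in <= 0 or paper_height_in <= 0:
        raise ValueError("paper dimensions must be positive")
    return min(pixel_width / paper_width_in, pixel_height / paper_height_in)


def meets_ocr_guideline(dpi: float, minimum: float = 300.0) -> bool:
    """Compare a resolution against the 300 DPI guideline from this article."""
    return dpi >= minimum


if __name__ == "__main__":
    # A US Letter page (8.5 x 11 inches) scanned to 2550 x 3300 pixels.
    good = effective_dpi(2550, 3300, 8.5, 11)
    # The same page scanned to 1700 x 2200 pixels.
    low = effective_dpi(1700, 2200, 8.5, 11)
    print(good, meets_ocr_guideline(good))
    print(low, meets_ocr_guideline(low))
```

A scan that falls below the guideline is a candidate for rescanning before you run OCR.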

Troubleshooting

  • Acrobat is not recognizing text:
    • Ensure the correct language is selected.
    • Try increasing the scan resolution.
    • Make sure the document doesn't have an overly complex layout or use unusual fonts.
    • Try a different PDF Output Style.
  • OCR is slow:
    • Downsample the images.
    • Close other applications to free up system resources.
  • Text is garbled or unreadable:
    • The original scan may be too low quality for accurate OCR. Try rescanning at a higher resolution.
    • The font may be too unusual for Acrobat to recognize.
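
The troubleshooting advice pulls in two directions: downsampling speeds up OCR, while higher resolution improves accuracy. The arithmetic below shows why resolution has such a large effect on the amount of data to process. It is a rough estimate of uncompressed raster size only; actual PDF file sizes depend on compression, and the function name and defaults are hypothetical.

```python
def raster_bytes(width_in: float, height_in: float, dpi: int,
                 bits_per_pixel: int = 24) -> int:
    """Estimate the uncompressed size in bytes of one scanned page image."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    # US Letter page in 24-bit color at two resolutions.
    at_600 = raster_bytes(8.5, 11, 600)
    at_300 = raster_bytes(8.5, 11, 300)
    # Halving the DPI cuts the pixel count to one quarter.
    print(at_600, at_300, at_600 / at_300)
```

Because pixel count grows with the square of the resolution, dropping from 600 to 300 DPI leaves a quarter of the data while still meeting the 300 DPI guideline.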

Need Additional Support?

If you have any questions or need further assistance, please contact the ITS Help Desk.

This guide aims to provide useful information, but as technology changes, interfaces or steps might vary. Please use the Comment button to let us know if anything differs from your experience. Your feedback helps us keep this information accurate. Thank you!



Keywords: OCR, Optical Character Recognition, Adobe Acrobat, scanned documents, text recognition, searchable PDF, clearscan, editable PDF
Doc ID: 148773
Owned by: Jeff P. in Southern Illinois University Edwardsville
Created: 2025-03-04
Updated: 2025-03-11
Sites: Southern Illinois University Edwardsville