How to improve the accuracy of OCR results with image preprocessing techniques

Tram Ho

Source: Gekko Lab (Medium)


Some keywords:

Image preprocessing: operations applied to an image before it is passed to recognition
OCR: Optical Character Recognition


Foreword

In some real-world cases, it is hard to improve OCR accuracy simply by demanding high-quality input images. Instead, we can apply image preprocessing techniques to improve the quality of the input image and, in turn, the OCR results. This is a translation and my first post; if anything is wrong, please leave comments and suggestions in the comment section below the article. Thank you very much.


Image preprocessing techniques

1. Scaling image size

As everyone knows, image resolution is one of the important factors in the accuracy of OCR results. Therefore, the first preprocessing step is to scale the input image to a resolution of at least 300 DPI (dots per inch).

Source: https://insacmau.com/do-phan-giai-dpi-la-gi/
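
As a rough sketch of the arithmetic behind the 300 DPI rule, the snippet below computes the pixel size an image needs so that its effective resolution reaches the target. The function name `size_for_target_dpi` and the sample page sizes are assumptions made for illustration; they do not come from the original article.

```python
def size_for_target_dpi(width_px, height_px, current_dpi, target_dpi=300):
    """Return the (width, height) in pixels needed to reach target_dpi.

    Hypothetical helper: images already at or above the target are left as-is.
    """
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    if current_dpi >= target_dpi:
        return width_px, height_px  # already sharp enough, do not downscale
    # Same physical size, more pixels: scale both sides by target / current.
    new_width = round(width_px * target_dpi / current_dpi)
    new_height = round(height_px * target_dpi / current_dpi)
    return new_width, new_height


# A US Letter page (8.5 x 11 in) captured at 72 DPI is 612 x 792 px.
print(size_for_target_dpi(612, 792, 72))      # (2550, 3300)
# An A4 scan that is already at 300 DPI is returned unchanged.
print(size_for_target_dpi(2480, 3508, 300))   # (2480, 3508)
```

The resize itself can then be done with any image library (for example, Pillow's `Image.resize`). Keep in mind that upscaling only interpolates pixels; it cannot restore detail that the scanner or camera never captured.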

2. Image geometric transformation

When the original image is captured with a camera or a scanner, it is often tilted, and the corners of the object in the image are not right angles (see the example image below).
Source: https://baoxinviec.com/su-dung-bang-gia-di-lam-lieu-co-don-gian/

In such a case, the segmentation of lines and characters often goes wrong, which lowers the quality of the OCR output. Geometric transformation is one way to address this problem. The related techniques are listed below:

  • Page rotation:
    Almost all OCR engines have a built-in function to detect the orientation of the text lines in an image. Once a rotated page is detected, it is automatically rotated upright before text recognition is performed.
  • De-skewing:
    Skew is a common problem with scanned images. A general technique for detecting the skew angle of an image is to perform matrix computations. The image below illustrates the text lines after the de-skewing process.
    Source: https://medium.com/@Gekko_lab/make-your-ocr-results-more-accurate-part-ii-preprocessing-3d212ae16191
  • Perspective transformation:
    If the scanned document is not positioned parallel to the camera, the image will suffer from perspective distortion. In the author's experience, it is very difficult to make the objects in the image perfectly parallel, so it is reasonable to weigh how much correction is actually needed before carrying out the other preprocessing steps.

Source: https://www.researchgate.net/figure/Captured-high-resolution-image-of-the-desktop-including-a-document_fig4_227943304
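
The de-skewing and perspective steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the method used in the original article: `estimate_skew_angle` searches for the angle that packs the text pixels into the fewest rows (a projection-profile search, swapped in here as one concrete way to detect skew), and `homography` solves the standard four-point linear system for a 3x3 perspective matrix. All function names, the angle range, and the step size are hypothetical choices.

```python
import math


def estimate_skew_angle(text_pixels, max_angle=10.0, step=0.5):
    """Estimate the skew in degrees from (x, y) coordinates of text pixels.

    A positive result means the lines drop toward the right (y grows downward).
    """
    best_angle, best_score = 0.0, -1.0
    for i in range(int(round(2 * max_angle / step)) + 1):
        angle = -max_angle + i * step
        sin_t = math.sin(math.radians(angle))
        cos_t = math.cos(math.radians(angle))
        profile = {}  # rotated row index -> number of text pixels in that row
        for x, y in text_pixels:
            row = round(y * cos_t - x * sin_t)
            profile[row] = profile.get(row, 0) + 1
        # Level text lines concentrate pixels in few rows, so squares add up high.
        score = sum(count * count for count in profile.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def solve_linear(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def homography(src, dst):
    """3x3 matrix mapping four source (x, y) points onto four target points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map one (x, y) point through the matrix, dividing by the projective term."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

To use it, pass the coordinates of the dark pixels to `estimate_skew_angle` and rotate the image back by the returned angle, or pass the four detected page corners and the four corners of the desired rectangle to `homography`. In practice, a library such as OpenCV provides equivalents (for example `cv2.getPerspectiveTransform` and `cv2.warpPerspective`) that also resample the whole image.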

  • Line straightening:
    If the scanned document is a page of a book, the lines of text in the scanned image will be curved. Curved text lines reduce the accuracy of line segmentation and text alignment, so straightening them is one way to improve the OCR results.
  • Image binarization:
    Image binarization means converting a color image (usually RGB) to a black-and-white image that contains only two pixel values. It increases the contrast between the background and the text. Besides improving OCR accuracy, binarization reduces the size of the image, which speeds up OCR processing.
    Source: https://medium.com/@Gekko_lab/make-your-ocr-results-more-accurate-part-ii-preprocessing-3d212ae16191
  • Noise removal:
    Captured pictures often vary in quality because the device or the shooting conditions are not fixed. OCR engines can recognize text in the input image with high accuracy even without image preprocessing. However, basic image smoothing techniques that remove noisy pixels can improve OCR accuracy further.
    Source: https://medium.com/@Gekko_lab/make-your-ocr-results-more-accurate-part-ii-preprocessing-3d212ae16191
    Several kinds of noise can be reduced:

    • Speckle noise
    • Blurred text
    • Camera ISO noise
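
The binarization and noise removal steps above can be illustrated with a small pure-Python sketch. It assumes the image is already grayscale and stored as a list of rows of integers from 0 to 255. The 3x3 median filter and Otsu's threshold are common choices picked here for illustration; the original article does not say which algorithms it uses, and all function names are hypothetical.

```python
def median_filter_3x3(gray):
    """Replace each interior pixel with the median of its 3x3 neighborhood."""
    height, width = len(gray), len(gray[0])
    out = [row[:] for row in gray]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(gray[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


def otsu_threshold(gray):
    """Pick the threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0  # pixel count and intensity sum of the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no split at this level
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(gray, threshold):
    """Pixels brighter than the threshold become white (255), the rest black (0)."""
    return [[255 if value > threshold else 0 for value in row] for row in gray]
```

A typical order is to smooth first and threshold second, for example `binarize(clean, otsu_threshold(clean))` with `clean = median_filter_3x3(gray)`, so that isolated speckles do not survive as black dots. A median filter mainly helps with speckle noise; blurred text usually needs a different treatment such as sharpening. Libraries such as OpenCV offer faster equivalents (`cv2.medianBlur`, and `cv2.threshold` with the `cv2.THRESH_OTSU` flag).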

This is my first translation; if there are any mistakes, I would be glad to hear from you.


Source : Viblo