Binarization of image files

Why binarization can improve OCR results

Binarization in Optical Character Recognition (OCR) means converting a grayscale or color image into a black-and-white (binary) image. Ideally, the conversion applies a threshold that cleanly separates the foreground (text) from the background (paper and stains), which improves OCR accuracy.
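To make the idea of a threshold concrete, here is a minimal sketch of global thresholding in plain Python. It assumes an 8-bit grayscale image given as nested lists and an arbitrary cutoff of 128; the function name `binarize_global` is made up for this illustration. Real tools such as Kraken use a more elaborate approach than a single fixed cutoff, so this shows only the basic principle.

```python
def binarize_global(pixels, threshold=128):
    """Map each 8-bit gray value to 0 (black) or 255 (white)."""
    return [[255 if value >= threshold else 0 for value in row] for row in pixels]


# Tiny 2x4 "image": dark text strokes (low values) on light, slightly stained paper
gray = [
    [30, 200, 220, 45],
    [210, 60, 180, 240],
]

binary = binarize_global(gray)
for row in binary:
    print(row)
```

Every value below the cutoff becomes black and everything else becomes white. A fixed cutoff like this tends to fail on unevenly lit or stained pages, which is why dedicated binarization routines exist.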

Performing batch binarization with Kraken

There are several ways to binarize images. One option is the Kraken OCR tool, which can also binarize an entire folder of images at once. Kraken's binarization module was developed in 2014 and 2015 by Benjamin Kiessling and Thomas M. Breuel and is licensed under the Apache License, Version 2.0:

https://github.com/mittagessen/kraken/blob/main/kraken/binarization.py

To use it for batch binarization, put all the image files into a single folder. You can then use the `kraken.binarization.nlbin` function (in Python) to convert the images. Running the Python code for Kraken in a virtual environment is highly recommended. To activate your virtual environment (on Linux and Mac), run `source ~/kraken-env/bin/activate` in the terminal, replacing kraken-env with your own environment name. Then you can run the Python code with the following command:

python batch_binarize.py

The Python file itself should contain this code:

# import packages for image processing
from pathlib import Path
from PIL import Image
from kraken import binarization

# define paths for input and output files
input_dir = Path("/home/monikab/Documents/IMG_automobiles") # enter your own path to your image folder here
output_dir = input_dir / "IMG_binarized" # enter your own output path here
output_dir.mkdir(exist_ok=True) # create the output folder if it does not exist yet

# define permitted image formats
extensions = (".jpg", ".jpeg", ".png", ".tif", ".tiff")

# process all images in folder
for img_path in input_dir.glob("*"):
    if img_path.suffix.lower() in extensions: # only process files with permitted file extension
        try:
            print(f"[+] Binarizing {img_path.name}") # status info will be shown in terminal
            img = Image.open(img_path)
            bin_img = binarization.nlbin(img)
            out_path = output_dir / img_path.name
            bin_img.save(out_path)
        except Exception as e:
            print(f"[!] Failed to process {img_path.name}: {e}") # status info will be shown in terminal

print("\n Binarization complete. Files saved in:", output_dir) # status info will be shown in terminal

As you can see in the code, the binarized files are saved to the output folder you define as `output_dir`. You can then use these images in other OCR software as well.
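If the input files are JPEGs, saving the result under the same file name writes it as a JPEG again, and lossy compression can reintroduce gray pixels around the letter edges. One possible variation is to save every result as a PNG instead. The sketch below shows only the path handling; `png_output_path` is a hypothetical helper, not part of Kraken, and the example paths are placeholders.

```python
from pathlib import Path


def png_output_path(img_path, output_dir):
    """Return the path for img_path inside output_dir, with a .png suffix."""
    return Path(output_dir) / Path(img_path).with_suffix(".png").name


# Example with placeholder paths: the file name is kept, only the suffix changes
print(png_output_path("scans/page_01.jpg", "scans/IMG_binarized"))
```

In the script above, this would take the place of `out_path = output_dir / img_path.name`. Note that two inputs with the same name but different extensions (for example page_01.jpg and page_01.tif) would then map to the same output file.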

Problems with binarizing low-res images

https://stackoverflow.com/questions/73971444/binarize-low-contrast-images