rezazad68/BCDUnet_DIBCO
This repository contains my implementation for the DIBCO document image binarization challenges. The code uses the BCDU-Net model to learn the binarization process on the DIBCO series, and the evaluation results show that BCDU-Net achieves the best performance on the DIBCO challenges. If this code helps with your research, please consider citing the following paper:

R. Azad et al., "Bi-Directional ConvLSTM U-Net with Densely Connected Convolutions", ICCV, 2019, download link.

Updates

  • December 3, 2019: First release (complete implementation for the DIBCO series; datasets from 2009 to 2017 added). Other datasets can be added easily.

Prerequisites and Run

This code has been implemented in Python using the Keras library with a TensorFlow backend. It has been tested on Ubuntu, though it should be compatible with any comparable environment. The following environment and libraries are needed to run the code (a quick environment check is sketched after the list):

  • Python 3
  • Keras with a TensorFlow backend
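
A quick way to confirm the environment, assuming standalone Keras 2.x with the TensorFlow backend:

```python
# Environment check (assumes standalone Keras 2.x on top of TensorFlow).
import tensorflow as tf
import keras
from keras import backend as K

print("TensorFlow:", tf.__version__)
print("Keras:", keras.__version__)
print("Backend:", K.backend())  # should print 'tensorflow'
```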

Run Demo

To train the deep model for each DIBCO year, follow the steps below:

DIBCO Series

1- Download the DIBCO datasets from this link and extract them. We included the DIBCO datasets from 2009 to 2017. DIBCO 2018, 2019, or other datasets can be added easily; you only need to revise the utils code.
2- Run Prepare_DIBCO.py to prepare the data and divide it into train and test sets. Please note that this code treats all samples of one particular year as the test set and the samples from the remaining years as the training set, which is the data division commonly used in the DIBCO challenge.
3- Run Train_DIBCO.py to train the BCDU-Net model using the training and validation (20% of the training samples) sets. The model is trained for 100 epochs, and the weights that perform best on the validation set are saved (see the sketch after these steps).
4- To compute performance and produce the binarization results, run Evaluate.py. It reports the performance measures and saves the related figures and results in the output folder.
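
Step 3 in code form: a minimal sketch of the training setup described above (100 epochs, 20% of the training patches held out for validation, best validation weights saved). The model-builder name, input size, and file names below are assumptions for illustration; the actual code lives in Train_DIBCO.py.

```python
# Minimal training sketch (hypothetical names; see Train_DIBCO.py for the real code).
import numpy as np
from keras.callbacks import ModelCheckpoint

from models import BCDU_net_D3            # assumed model-builder name

X_train = np.load('data_train.npy')        # assumed patch files produced by Prepare_DIBCO.py
y_train = np.load('mask_train.npy')

model = BCDU_net_D3(input_size=(128, 128, 1))

# Keep only the weights that perform best on the held-out 20% validation split.
checkpoint = ModelCheckpoint('weight_DIBCO.hdf5', monitor='val_loss',
                             save_best_only=True, verbose=1)

model.fit(X_train, y_train,
          batch_size=8,
          epochs=100,
          validation_split=0.2,
          callbacks=[checkpoint])
```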

Notice

We train the model on patches extracted from the training set. For test image binarization, we apply patch-based overlapping binarization. To train and evaluate the model for any particular year, just set the test year (parameter Test_year = 2016) when running Prepare_DIBCO.py and Evaluate.py.
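
The overlapping patch-based inference can be sketched as follows. The patch size, stride, threshold, and averaging scheme here are assumptions for illustration, not the exact values used in Evaluate.py.

```python
# Sketch of patch-based overlapping binarization (assumed patch size/stride/threshold).
import numpy as np

def binarize_image(model, image, patch=128, stride=64, threshold=0.5):
    """Predict overlapping patches, average the overlaps, then threshold.

    Assumes a grayscale image whose dimensions are at least patch x patch.
    """
    H, W = image.shape
    prob = np.zeros((H, W), dtype=np.float32)
    count = np.zeros((H, W), dtype=np.float32)

    for y in range(0, max(H - patch, 0) + 1, stride):
        for x in range(0, max(W - patch, 0) + 1, stride):
            tile = image[y:y + patch, x:x + patch]
            pred = model.predict(tile[None, ..., None])[0, ..., 0]
            prob[y:y + patch, x:x + patch] += pred
            count[y:y + patch, x:x + patch] += 1.0

    prob /= np.maximum(count, 1.0)          # average overlapping predictions
    return (prob > threshold).astype(np.uint8)
```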

Quick Overview

Structure of the bi-directional ConvLSTM used in the BCDU-Net network

Diagram of the proposed method

Results

To evaluate the performance of the BCDU-Net model on the DIBCO series, we followed the setting used in SAE. In [1], the authors provided experimental results only for DIBCO 2014 and 2016. To do so, they considered all samples of one particular year (for example 2014 or 2016) as the test set and the rest of the samples from the other years as the training set. We used the same setting for reporting our results. Below, the results of the proposed approach are reported in terms of F-measure (a minimal F-measure computation is sketched after the table).

| Methods             | DIBCO 2014 | DIBCO 2016 |
|---------------------|------------|------------|
| Otsu                | 91.56      | 73.79      |
| Niblack             | 22.26      | 16.7       |
| Sauvola             | 77.08      | 82.00      |
| Wolf et al.         | 90.47      | 81.76      |
| Gatos et al.        | 91.97      | 74.97      |
| Sauvola MS          | 87.86      | 65.04      |
| Su et al.           | 95.14      | 90.27      |
| Howe                | 90.00      | 80.64      |
| Kliger and Tal      | 95.00      | 90.48      |
| CNN                 | 81.23      | 54.58      |
| SAE                 | 89.12      | 85.27      |
| R. Azad (BCDU-Net)  | 97.88      | 98.96      |
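
For reference, the F-measure reported above is the harmonic mean of precision and recall over foreground (ink) pixels. A minimal computation, assuming binary masks where 1 marks ink:

```python
import numpy as np

def f_measure(pred, gt):
    """F-measure (in percent) on foreground pixels of two binary masks (1 = ink)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)
```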

Document Image Binarization Results on DIBCO Series

Document image binarization result 1, Document image binarization result 2, Document image binarization result 3, Document image binarization result 4

Model weights

You can download the learned weights for the DIBCO series, which were trained on DIBCO 2009-2017 samples (except 2016). Please note that the trained model can be used for other years too.

| Test year  | Learned weights |
|------------|-----------------|
| DIBCO 2016 | Model Weights   |

Query

All implementation was done by Reza Azad. For any query, please contact us for more information.

rezazad68@gmail.com