dangvansam/text-detection-recognize-ctpn-tesseract

text-detection-recognize-ctpn-tesseract

Customized from the repo https://github.com/eragonruan/text-detection-ctpn, with Tesseract text recognition added for each detected box.


setup

The nms and bbox utils are written in Cython, so you have to build the library first.

cd utils/bbox
chmod +x make.sh
./make.sh

This generates nms.so and bbox.so in the current folder.
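
To confirm that the build worked, a small check like the one below can list the compiled extensions. This is only a sketch and not part of the repository: the function name `find_built_extensions` is hypothetical, and it assumes the shared objects may carry a platform tag in their file names (for example `nms.cpython-36m-x86_64-linux-gnu.so`) rather than being named exactly `nms.so`.

```python
from pathlib import Path


def find_built_extensions(bbox_dir="utils/bbox", names=("nms", "bbox")):
    """Map each expected module name to the shared objects found for it."""
    folder = Path(bbox_dir)
    found = {}
    for name in names:
        # Matches both "nms.so" and platform-tagged names (assumption).
        found[name] = sorted(p.name for p in folder.glob(name + "*.so"))
    return found


# Report which extensions are present; an empty list means the build
# did not produce that module in the given folder.
for name, files in find_built_extensions().items():
    status = ", ".join(files) if files else "missing, rerun ./make.sh"
    print(f"{name}: {status}")
```

Run it from the repository root so that the default `utils/bbox` path resolves.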


demo

  • download the ckpt file from google drive or baidu yun
  • put "checkpoints_mlt/" in "text-detection-ctpn/"
  • put your images in "data/demo"; the output images and text are written to "data/res". Run the demo from the root:
python main/demo.py
  • directory structure:

  • text recognition with Tesseract:
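
The per-box recognition step can be sketched roughly as follows. This is not the repository's code: it assumes each detected box is given as eight numbers (x1, y1, ..., x4, y4), that the image is a PIL image, and that the `pytesseract` wrapper is installed. The helper names `quad_to_rect`, `tesseract_ocr`, and `recognize_boxes` are hypothetical.

```python
def quad_to_rect(coords, width, height):
    """Return (left, top, right, bottom) enclosing a 4-point box, clipped to the image."""
    xs = coords[0::2]
    ys = coords[1::2]
    left = max(0, int(min(xs)))
    top = max(0, int(min(ys)))
    right = min(width, int(max(xs)) + 1)
    bottom = min(height, int(max(ys)) + 1)
    return left, top, right, bottom


def tesseract_ocr(crop, lang="eng"):
    # Imported lazily so the helpers above work without Tesseract installed.
    import pytesseract
    # "--psm 7" asks Tesseract to treat the crop as a single text line.
    return pytesseract.image_to_string(crop, lang=lang, config="--psm 7")


def recognize_boxes(image, boxes, ocr=tesseract_ocr):
    """Crop every box from a PIL image and run `ocr` on each crop."""
    width, height = image.size
    results = []
    for coords in boxes:
        rect = quad_to_rect(coords, width, height)
        text = ocr(image.crop(rect)).strip()
        results.append((rect, text))
    return results
```

With Pillow available, a call would look like `recognize_boxes(Image.open(path), boxes)`, where `path` and `boxes` come from your own detection run. Passing a different `ocr` callable makes it easy to swap the language or page segmentation mode.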


training

prepare data

  • First, download the pre-trained VGG net model and put it in data/vgg_16.ckpt. You can download it from tensorflow/models.
  • Second, download the dataset we prepared from google drive or baidu yun. Put the downloaded data in data/dataset/mlt, then start the training.
  • Also, you can prepare your own dataset according to the following steps.
  • Modify DATA_FOLDER and OUTPUT in utils/prepare/split_label.py to match your dataset, then run split_label.py from the root:
python ./utils/prepare/split_label.py
  • This generates the prepared data in data/dataset/
  • An example of the input file format for split_label.py can be found in gt_img_859.txt, and the corresponding output file is img_859.txt. A demo image of the prepared data is shown below.
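
CTPN detects text as a sequence of narrow, fixed-width proposals (16 pixels wide in the original method), so label preparation generally means cutting each ground-truth box into vertical strips. The sketch below illustrates that idea for an axis-aligned box with integer coordinates. It is a simplification and not the code in split_label.py: the function name, the clipping of the first and last strip to the box, and the `stride` parameter are all assumptions.

```python
def split_box_into_strips(x_min, y_min, x_max, y_max, stride=16):
    """Split an axis-aligned text box into strips aligned to a `stride`-pixel grid."""
    strips = []
    # Start at the grid column that contains x_min.
    start = (x_min // stride) * stride
    for left in range(start, x_max + 1, stride):
        right = left + stride - 1
        # Clip the first and last strip so they stay inside the box.
        strips.append((max(left, x_min), y_min, min(right, x_max), y_max))
    return strips


for strip in split_box_into_strips(10, 5, 40, 20):
    print(strip)
```

Each tuple is (x_min, y_min, x_max, y_max) for one strip; together the strips cover the original box without overlapping.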

train

Simply run

python main/train.py
  • The model provided in checkpoints_mlt was trained on a GTX 1070 for 50k iterations. Each iteration takes about 0.25 s, so 50k iterations take about 3.5 hours.
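
The training time estimate is simple arithmetic, and the same calculation can be reused for other hardware or iteration counts. A minimal sketch, where the function name `training_hours` is hypothetical and the per-iteration time is whatever you measure on your own GPU:

```python
def training_hours(iterations, seconds_per_iteration):
    """Estimate wall-clock training time in hours."""
    return iterations * seconds_per_iteration / 3600


# 50k iterations at 0.25 s each, the figures quoted for the GTX 1070.
print(f"{training_hours(50_000, 0.25):.2f} hours")  # prints "3.47 hours"
```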

some results

NOTICE: all the photos used below are collected from the internet. If it affects you, please contact me to delete them.