text-detection-recognize-ctpn-tesseract

Customized from the repo https://github.com/eragonruan/text-detection-ctpn, adding Tesseract text recognition for each detected box.

setup

The nms and bbox utils are written in Cython, so you have to build the library first.

cd utils/bbox
chmod +x make.sh
./make.sh

This generates nms.so and bbox.so in the current folder.
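
To confirm that the build step worked, the folder can be checked for the two compiled libraries. The following is only a minimal sketch: the helper name missing_extensions is hypothetical, and it assumes the compiled files may carry an ABI tag in their name (for example nms.cpython-36m-x86_64-linux-gnu.so) instead of plain nms.so, depending on how the build was run.

```python
from pathlib import Path


def missing_extensions(folder, names=("nms", "bbox")):
    """Return the module names that have no compiled .so file in `folder`."""
    folder = Path(folder)
    if not folder.is_dir():
        # Nothing can be built in a folder that does not exist.
        return list(names)
    missing = []
    for name in names:
        # Matches "nms.so" as well as ABI-tagged names such as "nms.cpython-36m-...so".
        if not any(folder.glob(name + ".*so")):
            missing.append(name)
    return missing


# Run from the repo root after ./make.sh; an empty list means both libraries exist.
print(missing_extensions("utils/bbox"))
```

If the list is not empty, rerun make.sh in utils/bbox and check its output for compiler errors.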

demo

Download the ckpt file from Google Drive or Baidu Yun.
Put "checkpoints_mlt/" in "text-detection-ctpn/".
Put your images in "data/demo"; the output images and text are written to "data/res". Then run the demo from the root:

python main/demo.py
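
The recognition step amounts to cropping each detected box out of the image and passing the crop to Tesseract. The sketch below illustrates that idea and is not the code in demo.py. It assumes that each box is stored as one line of eight comma-separated corner coordinates, optionally followed by a score (an assumption about the text files in data/res). The names parse_box, crop and recognize_boxes are hypothetical. The ocr argument is any callable that turns a cropped image into a string; with the pytesseract package installed, pytesseract.image_to_string could be passed here. A stand-in is used so that the sketch runs without Tesseract.

```python
def parse_box(line):
    """Turn 'x1,y1,x2,y2,x3,y3,x4,y4[,score]' into (x_min, y_min, x_max, y_max)."""
    values = [float(v) for v in line.strip().split(",")]
    xs, ys = values[0:8:2], values[1:8:2]
    return int(min(xs)), int(min(ys)), int(max(xs)), int(max(ys))


def crop(image, box):
    """Crop a row-major image given as a list of rows.

    With a NumPy array the equivalent would be image[y0:y1, x0:x1].
    """
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def recognize_boxes(image, box_lines, ocr):
    """Run `ocr` on every box and return a list of (box, text) pairs."""
    results = []
    for line in box_lines:
        if not line.strip():
            continue  # skip empty lines
        box = parse_box(line)
        results.append((box, ocr(crop(image, box)).strip()))
    return results


# Stand-in data: a blank 100x40 image, one box, and a fake OCR that reports the crop size.
image = [[0] * 100 for _ in range(40)]
lines = ["10,5,60,5,60,25,10,25,0.98"]
fake_ocr = lambda patch: " %dx%d " % (len(patch[0]), len(patch))
print(recognize_boxes(image, lines, fake_ocr))
```

Taking the axis-aligned bounding rectangle of the four corners keeps the crop simple; strongly rotated text would need deskewing before recognition.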

training

First, download the pre-trained VGG net model and put it at data/vgg_16.ckpt. You can download it from tensorflow/models.
Second, download the dataset we prepared from Google Drive or Baidu Yun. Put the downloaded data in data/dataset/mlt, then start the training.
You can also prepare your own dataset with the following steps.
Modify DATA_FOLDER and OUTPUT in utils/prepare/split_label.py according to your dataset, then run split_label.py from the root:

python ./utils/prepare/split_label.py

It will generate the prepared data in data/dataset/.
An example of the input file format for split_label.py can be found in gt_img_859.txt, and the corresponding output file is img_859.txt. A demo image of the prepared data is shown below.
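
CTPN works with fixed-width text proposals (16 pixels wide in the CTPN paper), so a whole-line text box has to be cut into narrow vertical strips before training. The sketch below shows that idea for an axis-aligned box. It is an illustration under stated assumptions, not the logic of split_label.py: the function name split_into_strips is hypothetical, coordinates are integer pixels with x_max inclusive, and strips are aligned to a 16-pixel grid.

```python
def split_into_strips(x_min, y_min, x_max, y_max, width=16):
    """Split an axis-aligned text box into vertical strips on a `width`-pixel grid.

    Each strip is (left, y_min, right, y_max) with inclusive pixel coordinates.
    The first and last strip may extend past the box because they snap to the grid.
    """
    strips = []
    # Snap the starting column to the grid line at or left of x_min.
    start = (x_min // width) * width
    for left in range(start, x_max + 1, width):
        strips.append((left, y_min, left + width - 1, y_max))
    return strips


for strip in split_into_strips(20, 10, 70, 30):
    print(",".join(str(v) for v in strip))
```

For the box from x=20 to x=70 this yields four strips starting at x=16, 32, 48 and 64.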

Simply run

python main/train.py

The model provided in checkpoints_mlt was trained on a GTX 1070 for 50k iterations. Each iteration takes about 0.25 s, so 50k iterations take about 3.5 hours.
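
The time estimate is plain arithmetic: iterations times seconds per iteration, converted to hours. The small helper below (training_hours is a hypothetical name) can be reused to estimate the run time on other hardware once the time per iteration has been measured.

```python
def training_hours(iterations, seconds_per_iteration):
    """Estimate total training time in hours."""
    return iterations * seconds_per_iteration / 3600


# 50,000 iterations at 0.25 s each: 12,500 s, or about 3.47 hours.
print(round(training_hours(50_000, 0.25), 2))
```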

NOTICE: all the photos used below were collected from the internet. If this affects you, please contact me and I will delete them.
