adhikary97/One-Piece-Database

One Piece Database Project

Install dependencies

  1. Install Tesseract (the OCR engine that pytesseract calls)
  2. Install the Python requirements: $ pip install -r requirements.txt

Download all One Piece Chapters

Download One Piece

Warning: this is a large zip file (~4 GB).

Text extraction

Set the path to your Tesseract executable in main.py, line 9.

For example:

pytesseract.pytesseract.tesseract_cmd = '/usr/local/Cellar/tesseract/5.1.0/bin/tesseract'
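The path above is specific to one Homebrew install, so it will differ between machines. As a rough sketch of one way to avoid hard-coding it, the snippet below looks up the executable on the PATH and only falls back to an explicit path if that fails. The helper name `resolve_tesseract_cmd` and its parameters are hypothetical and not part of this project.

```python
import shutil
from pathlib import Path


def resolve_tesseract_cmd(fallback=None, name="tesseract"):
    """Return a path to the Tesseract executable, or None if not found.

    Hypothetical helper (not part of this repository): checks the PATH
    first, then an optional explicit fallback path.
    """
    found = shutil.which(name)  # searches the directories listed in PATH
    if found:
        return found
    if fallback and Path(fallback).is_file():
        return fallback
    return None


# Possible use in main.py (assumes pytesseract is installed):
#   import pytesseract
#   cmd = resolve_tesseract_cmd(
#       fallback="/usr/local/Cellar/tesseract/5.1.0/bin/tesseract")
#   if cmd is None:
#       raise SystemExit("Tesseract not found; install it or set the path")
#   pytesseract.pytesseract.tesseract_cmd = cmd
```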

The extracted text is written to <filename>.txt.

For example, img2.png is written to img2.txt.

Example execution:

$ python main.py --file img2.png
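For readers who want to see the shape of this step, here is a minimal sketch of a script with the same command-line interface. It is not the actual contents of main.py. The function names and the injectable `ocr` parameter are assumptions made for illustration; only `--file`, the `.txt` naming rule, and the use of pytesseract come from this README.

```python
import argparse
from pathlib import Path


def output_path_for(image_path):
    """img2.png -> img2.txt (same folder, extension swapped)."""
    return Path(image_path).with_suffix(".txt")


def extract_text(image_path, ocr=None):
    """Run OCR on one image and write the text next to it.

    `ocr` is a hypothetical hook so the function can be exercised
    without Tesseract; by default it uses pytesseract.
    """
    if ocr is None:
        import pytesseract  # requires the Tesseract binary to be installed
        from PIL import Image

        def ocr(path):
            with Image.open(path) as img:
                return pytesseract.image_to_string(img)

    out = output_path_for(image_path)
    out.write_text(ocr(image_path), encoding="utf-8")
    return out


def main(argv=None):
    parser = argparse.ArgumentParser(description="Extract text from one image")
    parser.add_argument("--file", required=True, help="path to the image")
    args = parser.parse_args(argv)
    print(f"Wrote {extract_text(args.file)}")
```

Calling `main()` from an `if __name__ == "__main__":` guard would give the same invocation as the command above.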

Automate text extraction for all chapters

Make sure the One_Piece folder is in the root directory of this project. The script writes one .txt file per manga panel into each chapter folder.

$ bash automate_extraction.sh
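The contents of automate_extraction.sh are not shown here, so the following is only a Python sketch of the same idea: walk the One_Piece folder and run main.py once per image. The set of image extensions, the function names, and the `runner` hook are assumptions, not taken from the script.

```python
import subprocess
import sys
from pathlib import Path

# Assumed image extensions; adjust to match the downloaded chapters.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_images(root):
    """Return every image under `root`, sorted for a stable order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def run_main(image):
    """Invoke main.py for one image, as in the single-file example."""
    subprocess.run(
        [sys.executable, "main.py", "--file", str(image)], check=True
    )


def extract_all(root="One_Piece", runner=run_main):
    """Process every image under `root`; returns how many were handled."""
    images = find_images(root)
    for image in images:
        runner(image)
    return len(images)
```

Running `extract_all()` from the project root would process every panel in turn. With roughly 4 GB of images this may take a long time.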
