Indic languages computing resources with a focus on Telugu

ChillarAnand/likitham

Likitham

This repo contains scripts and datasets for processing Telugu language data.

Scripts

Check out the module docstring of each script for usage instructions.
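One way to read those docstrings without opening each file is sketched below. This is a minimal illustration, not part of the repository: the function names `read_module_docstring` and `print_all_docstrings` are hypothetical, and it assumes the scripts are plain `.py` files. The source is parsed rather than imported, so a script's dependencies do not need to be installed. Files that the running interpreter cannot parse (for example, Python 2 sources under Python 3) are reported as having no docstring.

```python
import ast
from pathlib import Path


def read_module_docstring(script_path):
    """Return the module-level docstring of a Python file, or None.

    None is returned when the file has no module docstring or when the
    running interpreter cannot parse it.
    """
    source = Path(script_path).read_text(encoding="utf-8")
    try:
        # Parse only; the script itself is never executed.
        tree = ast.parse(source)
    except SyntaxError:
        return None
    return ast.get_docstring(tree)


def print_all_docstrings(directory="."):
    """Print the module docstring of every .py file directly under `directory`."""
    for path in sorted(Path(directory).glob("*.py")):
        doc = read_module_docstring(path)
        print(f"== {path.name} ==")
        print(doc if doc else "(no module docstring found)")
```

Calling `print_all_docstrings()` with the path of the directory that holds the scripts lists the usage notes for all of them in one pass.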

Models

te.pyrnn.gz - Telugu language model (LSTM + CTC) trained with ocropy

Dataset

Sample training data. You can use the scripts in this repo to generate customized training data.

Useful links

Telugu fonts

Telugu POS tagger

Isolated Handwritten Telugu Character Dataset

Telugu and other South Asian language data

Corpus search engine

tessaract-te - Tesseract Open Source OCR Engine

banti_telugu_ocr - End to end OCR system for Telugu. Based on Convolutional Neural Networks.

Chamanti_ocr - Telugu OCR framework using RNN, CTC in Theano & Python3.

http://docs.cltk.org/en/latest/telugu.html

http://www.tdil-dc.in/index.php?option=com_download&task=showresourceDetails&toolid=264&lang=en

http://www.tdil-dc.in/index.php?option=com_download&task=showresourceDetails&toolid=1892&lang=en

http://ildc.in/Telugu/htm/lin_ocr_spell.htm
