Language identification toolkit for identifying what language a document is written in

nikhil-iyer-97/Language-Identifier


Language-Identifier

The report contains the link to the dataset and details of the paper on which this implementation is based. The languages used to train the model are listed in list of languages.pdf.

To train the model on additional languages, append sentences in the new languages to ./data/dataset.csv. Then run preprocess.py in the src directory to create a pickle file containing the preprocessed input. Finally, run lang_identify.py in the src directory and give it, as input, the location of a file written in the language you want to identify.
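As a rough illustration of the first step, the sketch below appends new sentences to a CSV file. The helper name append_sentences and the two-column layout (sentence text followed by a language label) are assumptions made for this example, not something the repository documents; check the actual header of ./data/dataset.csv and adjust the column order before using it.

```python
import csv
from pathlib import Path


def append_sentences(csv_path, sentences, language):
    """Append one row per sentence to an existing CSV file.

    Hypothetical helper: it ASSUMES each row is (sentence, language label).
    The real dataset.csv may use different columns or ordering.
    """
    path = Path(csv_path)
    rows = [[sentence, language] for sentence in sentences]

    # If the file does not end with a newline, start the new rows on a fresh line.
    needs_newline = False
    if path.exists() and path.stat().st_size > 0:
        with path.open("rb") as fh:
            fh.seek(-1, 2)  # jump to the last byte of the file
            needs_newline = fh.read(1) != b"\n"

    with path.open("a", newline="", encoding="utf-8") as fh:
        if needs_newline:
            fh.write("\n")
        # csv.writer quotes sentences that contain commas or quotes.
        writer = csv.writer(fh, lineterminator="\n")
        writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    # Example sentences only; replace with real data for the new language.
    added = append_sentences(
        "./data/dataset.csv",
        ["Hola, ¿cómo estás?", "Buenos días a todos."],
        "Spanish",
    )
    print(f"Appended {added} sentences")
```

After appending the data, preprocess.py must be run again so that the pickle file reflects the new languages before lang_identify.py is used.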
