Tatoeba/tatodet

Tatodet for language detection/classification

This is a standalone Python microservice that classifies text by language, using models trained on Tatoeba's data.

Requirements

The main dependencies are pymc3, numpy, and sklearn. Several other packages are also used, and more are likely to be added in the future. Install the dependencies with:

pip install -r requirements.txt

Models implemented

  • Hierarchical Bayesian model:
      • Beta with freq priors
      • Beta with Poisson priors

Web API

Make sure the model is built:

python3 build_model.py

To run the web API:

python3 api.py

Then send a query:

curl -X GET "http://localhost:8080/v1/det?sent=what+is+it&trials=20"
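
The same query can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes the host, port, path, and the sent and trials parameters shown in the curl example. The helper names build_query_url and detect are hypothetical and not part of tatodet, and because the response format is not documented here, the body is returned as raw text (assumed to be UTF-8) rather than parsed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Host, port, and path are taken from the curl example above.
BASE_URL = "http://localhost:8080/v1/det"


def build_query_url(sent, trials=20, base_url=BASE_URL):
    """Return the GET URL for a detection query (hypothetical helper).

    urlencode escapes the sentence, turning spaces into '+'.
    """
    return base_url + "?" + urlencode({"sent": sent, "trials": trials})


def detect(sent, trials=20, timeout=10):
    """Send the query and return the raw response body as text.

    The response format is not documented here, so nothing is parsed.
    UTF-8 decoding is an assumption.
    """
    with urlopen(build_query_url(sent, trials), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the web API running, print(detect("what is it")) should show whatever the service returns for the curl query above.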

Tests

To run tests:

python3 -m pytest
