This is a basic REST API service built with Python and Flask, with a simple Swagger interface provided by flask_restx. It handles several Natural Language Processing functions, such as:
- correction, using
- tokenization, splitting a sentence into a list of word tokens or a text into a list of sentences, using
  - NLTK, a powerful set of natural language tools.
- lemmatization, using
  - spaCy, in my opinion the best library at the moment: it contains a full set of tools for word tokenization, word lemmatization, deep text analysis, part-of-speech tagging and named entity recognition. The real problem with this kind of library is reliability across multiple languages, and spaCy is very reliable in that respect.
- stemming, reducing a word to its root, using
  - NLTK, with the SnowballStemmer algorithm
- deep analysis, which provides:
  - Part-Of-Speech detection
  - Language detection
  - Named Entity Recognition
  - Morphology detection
  - Sentiment analysis
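To give an idea of what the stemming function listed above does, here is a minimal sketch that calls NLTK's SnowballStemmer directly. It is not the service's own code: the helper name `stem_words` is hypothetical, and the snippet assumes the `nltk` package is installed.

```python
from typing import List

from nltk.stem.snowball import SnowballStemmer


def stem_words(words: List[str], language: str = "english") -> List[str]:
    """Reduce each word to its stem with NLTK's Snowball stemmer.

    `stem_words` is a hypothetical helper used only for illustration.
    """
    stemmer = SnowballStemmer(language)
    return [stemmer.stem(word) for word in words]


if __name__ == "__main__":
    print(stem_words(["running", "cats"]))
```

The `language` argument takes the lowercase language name that SnowballStemmer expects, so the same helper should also work for the other supported languages.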
- Python 3.8.10
- Docker (if you want to use the nlp-rest docker image)
- I recommend creating and using a Python virtual environment so that you do not modify your global environment (note that I'm using VSCode on Ubuntu 20.04). To do this, once you have cloned the git repository, open VSCode and type the following in its terminal:
python3 -m venv /path/to/new/virtual/environment
Once you've created the virtual environment, VSCode should prompt you to select it as the default Python interpreter. If it does not, open the VSCode Command Palette (in my case Ctrl-Shift-P), choose Python: Select Interpreter, and then select the interpreter in your virtual environment.
Then, in the VSCode terminal, activate the virtual environment by typing
source /path/to/new/virtual/environment/bin/activate
For more information about virtual environments, you can have a look here
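To confirm that the terminal is really using the virtual environment after activation, a small check like the following can help. It is only a sketch: `in_virtualenv` is a hypothetical helper, and it relies on the standard behaviour of `venv`, where `sys.prefix` and `sys.base_prefix` differ inside an environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints `False`, the interpreter in use is still the global one and the activation step should be repeated.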
- You need to set up your virtual environment properly by installing all the dependencies: there is a setup.py file, so open it and have a look. Once you're in the virtual environment, type
pip install .
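After the installation, it can be useful to verify that the main libraries are importable from the virtual environment. The sketch below is an assumption-laden example: `missing_modules` is a hypothetical helper, and the module list contains only the libraries named in this README, while setup.py may declare more.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Libraries named in this README; setup.py may list additional ones.
    required = ["flask", "flask_restx", "nltk", "spacy"]
    missing = missing_modules(required)
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result suggests that `pip install .` ran in the right environment; any name listed as missing is a hint to re-check the active interpreter.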
There is an image on Docker Hub; you can pull it by typing
sudo docker pull gigadr3w/nlp-rest
and run it by typing
sudo docker run -p "5000:5000" --name nlp_rest gigadr3w/nlp-rest
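The container may need a moment before it accepts connections on the published port. As a rough sketch, the following polls the port until it answers; `wait_for_port` is a hypothetical helper, and the port number 5000 comes from the `docker run` command above.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet: wait briefly, then retry.
            time.sleep(0.5)
    return False
```

For example, `wait_for_port("localhost", 5000)` should return `True` once the service inside the container is listening, and `False` if nothing answers within the timeout.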
At the moment, it handles dictionaries for these languages:
- English
- French
- German
- Italian
- Spanish
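A client can validate the language before sending a request. The sketch below is purely illustrative: the two-letter codes, the `SUPPORTED_LANGUAGES` table and the `resolve_language` helper are assumptions and not part of the service, so check the Swagger page for the values it actually accepts. The language names on the right-hand side are the ones NLTK's SnowballStemmer uses.

```python
from typing import Dict

# ISO 639-1 code -> language name as used by NLTK's SnowballStemmer.
# The codes are an assumption made for this example only.
SUPPORTED_LANGUAGES: Dict[str, str] = {
    "en": "english",
    "fr": "french",
    "de": "german",
    "it": "italian",
    "es": "spanish",
}


def resolve_language(code: str) -> str:
    """Map a language code to its name, rejecting unsupported languages early."""
    try:
        return SUPPORTED_LANGUAGES[code.strip().lower()]
    except KeyError:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(
            f"unsupported language {code!r}; expected one of: {supported}"
        ) from None
```

Failing early with a clear message avoids sending a request for a language the service has no dictionary for.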
Once the service is running locally, you can browse to the default URL (i.e. http://localhost:5000) and use the Swagger interface as documentation.
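The same documentation can also be read programmatically. The following is a minimal sketch that assumes the service keeps flask_restx's default specification path, `/swagger.json`; the helper names `fetch_spec` and `list_operations` are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.request import urlopen


def fetch_spec(base_url: str = "http://localhost:5000") -> Dict[str, Any]:
    """Download the Swagger document (assumes the default /swagger.json path)."""
    with urlopen(f"{base_url.rstrip('/')}/swagger.json", timeout=5) as response:
        return json.load(response)


def list_operations(spec: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return sorted (HTTP method, path) pairs described by a Swagger document."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Skip non-operation keys such as "parameters".
            if method.lower() in http_methods:
                operations.append((method.upper(), path))
    return sorted(operations)
```

With the service running, `list_operations(fetch_spec())` should return the available endpoints, which is handy for a quick check from a script or a test.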