This API provides:
- tagging with neural networks for INCEpTION (as an external recommender)
- automatic feedback generation for CASUS, using neural-network tagging for EDA feedback and string matching for content feedback

The neural network currently used is a BiLSTM-CRF by Nils Reimers, implemented in Keras with the Theano backend. However, any other network can be used by creating a class <yournetwork>Wrapper that inherits from your network and from the abstract class in neuralNets/network.py. Use neuralNets/BiLSTMWrapper.py as an example.
For INCEpTION, neural-web receives a JSON for tagging that looks like this:
{
"metadata": {
"layer": "LayerName_String",
"feature": "FeatureOfTheLayer_String",
"projectId": "InceptionProjectID_Integer",
"anchoringMode": "InceptionAnchoringMode_String",
"crossSentence": "IsCrossSentenceTokenAllowed_Boolean"
},
"document": {
"xmi": "EscapedPlainTextXMI_XMLString",
"documentId": "InceptionDocumentID_Integer",
"userId": "InceptionUserID_String"
},
"typeSystem": "EscapedPlainTextXMI_XMLString"
}
The JSON for training is very similar:
{
"metadata": {
"layer": "LayerName_String",
"feature": "FeatureOfTheLayer_String",
"projectId": "InceptionProjectID_Integer",
"anchoringMode": "InceptionAnchoringMode_String",
"crossSentence": "IsCrossSentenceTokenAllowed_Boolean"
},
"documents": [
{
"xmi": "EscapedPlainTextXMI_XMLString",
"documentId": "InceptionDocumentID_Integer",
"userId": "InceptionUserID_String"
}
],
"typeSystem": "EscapedPlainTextXMI_XMLString"
}
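The two request bodies above can be assembled in Python. A minimal sketch, assuming placeholder field values; the helper names are ours, not part of the API:

```python
import json

def make_metadata(layer, feature, project_id, anchoring_mode, cross_sentence):
    # Shared "metadata" block of both the tagging and the training request.
    return {
        "layer": layer,
        "feature": feature,
        "projectId": project_id,
        "anchoringMode": anchoring_mode,
        "crossSentence": cross_sentence,
    }

def make_tagging_request(metadata, document, type_system):
    # Tagging: a single "document" object.
    return {"metadata": metadata, "document": document, "typeSystem": type_system}

def make_training_request(metadata, documents, type_system):
    # Training: a list under the "documents" key.
    return {"metadata": metadata, "documents": documents, "typeSystem": type_system}

# Example with placeholder values (layer/feature names are illustrative only):
meta = make_metadata("NamedEntity", "value", 1, "tokens", False)
doc = {"xmi": "<xmi>...</xmi>", "documentId": 7, "userId": "alice"}
payload = json.dumps(make_tagging_request(meta, doc, "<typeSystem/>"))
```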
For CASUS, neural-web receives a JSON that looks like this:
{
"text": "<BASE64Encoded_UTF8String>",
"user":"<String_IDofUser>",
"case": "<String_numberOfCase>",
"api-key": "<givenToClientByUs>",
"proof": "<blake2bEncodingOfText>"
}
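A sketch of building this request body in Python. The text field is Base64-encoded UTF-8; for the proof field we assume the hex BLAKE2b digest of the raw text with default parameters, since the exact digest size and input are not specified here (note that hashlib.blake2b requires Python 3.6+):

```python
import base64
import hashlib

def make_casus_request(text, user, case, api_key):
    # Assumption: "proof" is the hex BLAKE2b digest of the raw UTF-8 text;
    # adjust the digest parameters to whatever the server actually expects.
    raw = text.encode("utf-8")
    return {
        "text": base64.b64encode(raw).decode("ascii"),
        "user": user,
        "case": case,
        "api-key": api_key,
        "proof": hashlib.blake2b(raw).hexdigest(),
    }

# Example (all values are placeholders):
req = make_casus_request("Der Patient klagt ...", "student42", "3", "<api-key>")
```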
No training facility for CASUS is implemented yet.
This needs to be done only once and can be skipped if NLTK has already been used with the system.
To install NLTK:
sudo pip3 install -U nltk
Once NLTK is installed, open a terminal, start the python3 shell, and run:
import nltk
nltk.download('punkt')
nltk.download('stopwords')
from nltk import word_tokenize, sent_tokenize
The last import verifies that everything is installed correctly.
Set up a Python virtual environment:
virtualenv env --python=python3.5
source env/bin/activate
Install the requirements:
python -m pip install -r requirements.txt
If everything works well, you can run python app.py ${config} to start the server, where ${config} is either dev or prod.
Right now, neural-web has two versions hosted on the server: one for production and one for development, using ports 12889 and 12890 respectively.
Build the docker image from the root folder of the repository. Run:
sudo docker build ./docker/ -f ./docker/Dockerfile -t neural-web-prod
This builds a Python 3.5 image with the additional dependencies from docker/requirements.txt installed and assigns it the name neural-web-prod.
To run our code, we first create a container from the previously built image, start it, and mount the current folder ${PWD} into the container. The docker run command first creates a container using docker create and then starts it using docker start:
sudo docker run --restart on-failure -d -it -v ${PWD}:/usr/src/app -p 12889:12889 -e host_config=prod neural-web-prod
The option -v ${PWD}:/usr/src/app maps the current folder ${PWD} into the Docker container at /usr/src/app. Changes made on the host system as well as in the container are synchronized: we can change, add, or delete files in the current folder and its subfolders and access those files directly in the Docker container.
The option -p maps host-port:container-port. Select these two ports according to your util/config.py.
app.py will be executed automatically when the container has started.
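The port passed to -p must match what the server binds for the active host_config. A hypothetical fragment of util/config.py to illustrate; the real key names and structure in that file may differ:

```python
# Hypothetical util/config.py fragment -- the actual key names may differ.
# Whichever port the active config uses must appear on both sides of the
# "-p host-port:container-port" mapping in the docker run command.
host_configs = {
    "prod": {"host": "0.0.0.0", "port": 12889},
    "dev":  {"host": "0.0.0.0", "port": 12890},
}
```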
All the previous constraints also apply to the dev server.
To build the image, run:
sudo docker build ./docker/ -f ./docker/Dockerfile -t neural-web-dev
To create and start the container, use:
sudo docker run --restart on-failure -d -it -v ${PWD}:/usr/src/app -p 12890:12890 -e host_config=dev neural-web-dev
To rebuild and re-host:
- This should only be necessary when something in requirements.txt and/or the Dockerfile has changed.
- Find the desired container using docker ps -a; this prints a list of all running and stopped containers.
- Use docker stop <param>; <param> can be either the CONTAINER ID or the NAME.
- Remove the container with docker rm <param>; again, <param> can be either the CONTAINER ID or the NAME.
- Remove the associated image using docker rmi <param>; here <param> is the image name, i.e. neural-web-dev or neural-web-prod.
- Change the source files (by editing, pulling, etc.).
- Build the image as stated above, using docker build ...
- Create and run the container with the changed source as above, using docker run ...
To only re-host:
- This is sufficient when only the source code has been modified.
- Find the desired container using docker ps -a; this prints a list of all running and stopped containers.
- Use docker stop <param>; <param> can be either the CONTAINER ID or the NAME.
- Change the source files (by editing, pulling, etc.).
- Start the container using docker start <param>; again, <param> can be either the CONTAINER ID or the NAME.
The CONTAINER NAME is assigned automatically by Docker when a new container is created; multiple containers can be created from the same image.
Paste your model(s) into the models folder.
By default, all models in the models folder are loaded when the server starts.
To load a model that was not in the folder, add it to the folder, send a POST request to localhost:12889/loadModel/<modelName>, and wait for the response. <modelName> is the name of the model without the .h5 extension.
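A small sketch of issuing the load request with Python's standard library. The helper names are ours; note that the model name is sent without the .h5 extension:

```python
from http.client import HTTPConnection

def model_name_for_request(filename):
    # The loadModel endpoint expects the name without the .h5 extension.
    return filename[:-3] if filename.endswith(".h5") else filename

def load_model(filename, host="localhost", port=12889):
    # POST /loadModel/<modelName> and wait for the server's response.
    conn = HTTPConnection(host, port)
    conn.request("POST", "/loadModel/" + model_name_for_request(filename))
    return conn.getresponse()

# Example: load_model("medicine.h5") posts to
# http://localhost:12889/loadModel/medicine
```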
To tag a document with the default model, send a POST request to localhost:12890/api/inception/v1/tag. Provide a JSON with the document in the body of the request as specified above. An appropriate model is chosen automatically for tagging based on the request JSON. If the determined model is not found, the server replies with an HTTP 404 response.
To train a model using previously tagged and completed documents, send a POST request to localhost:12890/api/inception/v1/tag. Provide a JSON with the documents in the body of the request as specified above. An appropriate model is chosen automatically based on the request JSON. If the determined model is not found, a new model is built and trained; otherwise, the existing model is re-trained.
Notes:
- To start training, the system needs at least 30 new documents. This can be configured in config.py via the training_config.training_document_threshold key.
- There is no need to re-send previously sent documents that did not yet take part in training.
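The threshold can be pictured as a config.py entry. Only the training_document_threshold key is named above; the surrounding dict shape is our assumption, and the real config.py may use attribute access instead:

```python
# Hypothetical shape of the training section in config.py; only the
# training_document_threshold key is named in the text above.
training_config = {
    # Minimum number of new documents before a training run is triggered.
    "training_document_threshold": 30,
}
```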
To tag a document with the default model, send a POST request to localhost:12889/api/external/v1/tag. Provide a JSON with the document in the body of the request as specified above, but without the case and user fields.
To tag a document using a specific model, send the POST request to localhost:12889/api/external/v1/tag/<modelName>. <modelName> is the name of the model without the .h5 extension.
To receive feedback regarding argument structure (using the default model, medicine) and the content of a text (the default content is currently medicine), send a POST request to localhost:12889/api/external/v1/feedback. Provide a JSON with the document in the body of the request as specified above.
To receive feedback regarding argument structure using a specific model and the content of a text (the default content is currently medicine), send a POST request to localhost:12889/api/external/v1/feedback/<modelName>. Provide a JSON with the document in the body of the request as specified above.
- client_inception.py - an example of the expected INCEpTION tagging request.
- client_inception_train.py - an example of the expected INCEpTION training request.
- client_external_partner_feedback.py - an example of the expected tagging request from external partners, used to provide them with feedback.
- client_external_partner_tagging.py - an example of the expected external partner's tagging request.
- IOB_to_CAS_file_generartor.py - an example of transforming IOB-tagged files into CAS XMI. Some of the example IOB files used are from the Neural-Web Training Experiment project.