Skip to content

UKPLab/emnlp2019-NeuralWeb

Repository files navigation

Neural Web

This API provides

  1. tagging with Neural Networks for INCEpTION (as external recommender)
  2. automatic feedback generation for CASUS using tagging with Neural Networks for EDA feedback and string-matching for content feedback

Currently the Neural Network used is a BiLSTM-CRF by Nils Reimers implemented in Keras and using the Theano backend.

However, any other network can be used by creating a class <yournetwork>Wrapper that inherits from your network and the abstract class neuralNets/network.py. Use neuralNets/BiLSTMWrapper.py as an example.


API for INCEpTION

neural-web receives a JSON for tagging that looks like this:

{
      "metadata": {
        "layer": "LayerName_String",
        "feature": "FatureOfTheLayer_String",
        "projectId": "InceptionProjectID_Integer",
        "anchoringMode": "InceptionAchoringMode_String",
        "crossSentence": "IsCrossSentenceTokenAllowed_Boolean"
      },
      "document": {
        "xmi": "EscapedPlainTextXMI_XMLString",
        "documentId": "InceptionDocumentID_Integer",
        "userId": "InceptionUserID_String"
      },
      "typeSystem": "EscapedPlainTextXMI_XMLString"
}

and for training JSON is very similar but looks like this:

{
      "metadata": {
        "layer": "LayerName_String",
        "feature": "FatureOfTheLayer_String",
        "projectId": "InceptionProjectID_Integer",
        "anchoringMode": "InceptionAchoringMode_String",
        "crossSentence": "IsCrossSentenceTokenAllowed_Boolean"
      },
      "documents": [
          {
            "xmi": "EscapedPlainTextXMI_XMLString",
            "documentId": "InceptionDocumentID_Integer",
            "userId": "InceptionUserID_String"
          }
      ],
      "typeSystem": "EscapedPlainTextXMI_XMLString"
}

API for CASUS

neural-web receives a JSON for tagging that looks like this:

{
    "text": "<BASE64Encoded_UTF8String>",
    "user":"<String_IDofUser>",
    "case": "<String_numberOfCase>",
    "api-key": "<givenToClientByUs>",
    "proof": "<blake2bEncodingOfText>"
}

No training facility for CASUS implemented yet.

Setup

Setup nltk and punkt

This needs to be done only once and may be ignored if nltk is already used previously with the system.

To install NLTK:

sudo pip3 install -U nltk

Once nltk is installed open terminal and go to the python3 shell, then use:

import nltk
nltk.download('punkt')
nltk.download('stopwords')

from nltk import word_tokenize,sent_tokenize

The last import command verifies if everything is installed correctly.


Setup with virtual environment

Setup a Python virtual environment:

virtualenv env --python=python3.5
source env/bin/activate

Install the requirements:

python -m pip install -r requirements.txt

If everything works well, you can run python app.py ${config} to start the Server, ${config} can be filled with values dev or prod


Build and run docker image

Right now neural web has two versions hosted on server. One for production and the other for development, using ports 12889 and 12890 respectively.

Building and hosting production server

Build the docker image from the root folder of the repository. Run:

sudo docker build ./docker/ -f ./docker/Dockerfile -t neural-web-prod

This builds a Python 3.5 image with additional dependencies form docker/requirements.txt installed and assigns the name neural-web-prod to it.

To run our code, we first must create a container form the previously created image and start the container and mount the current folder ${PWD} into the container:

  • docker run ... command first creates a container using docker create and then starts it using docker start
sudo docker run --restart on-failure -d -it -v ${PWD}:/usr/src/app -p 12889:12889 -e host_config=prod neural-web-prod

The command -v ${PWD}:/usr/src/app maps the current folder ${PWD} into the docker container at the position /user/src/app. Changes made on the host system as well as in the container are synchronized. We can change / add / delete files in the current folder and its subfolder and can access those files directly in the docker container.

Here -p host-port:conatiner-port. Please select these two ports according to your util/config.py

app.py will be executed automatically when the container has started.

Building and hosting dev server

All the previous constraints also apply to dev server.

To Build the image run:

sudo docker build ./docker/ -f ./docker/Dockerfile -t neural-web-dev

For creating and hosting use:

sudo docker run --restart on-failure -d -it -v ${PWD}:/usr/src/app -p 12890:12890 -e host_config=dev neural-web-dev

Notes on rebuilding and/or re-hosting

To rebuild and re-host:

  • This should be necessary only when we have changed something in requirements.txt and/or Dockerfile
  1. Find the desired container using docker ps -a; will print a list of all running and stopped container.
  2. Use docker stop <param>; <param> might be either the CONTAINER ID or NAME
  3. Then remove the container with docker rm <param>; again <param> might be either the CONTAINER ID or NAME
  4. Then remove the associated image using docker rmi <param>; now this will be image name i.e. neural-web-dev or neural-web-prod
  5. Change source files (i.e. editing, pulling etc.)
  6. Build the image using method stated above; use docker build ... .
  7. Create and run the conatiner with changed source as per above using docker run ...

To only re-host:

  • This should be sufficient only for source code modification
  1. Find the desired container using docker ps -a; will print a list of all running and stopped container.
  2. Use docker stop <param>; <param> might be either the CONTAINER ID or NAME
  3. Change source files (i.e. editing, pulling etc.)
  4. Start the container using docker start <param>; again <param> might be either the CONTAINER ID or NAME

CONTAINER NAME is given automatically by docker when a new container is created and multiple container can be created using same image


Add trained Neural Network models

Paste your model(s) in the folder models.

API Functionality

Load a model

By defeault, all models in the models folder are loaded upon starting the Server. To load a model that was not in the folder, add it to the folder and send a POST request to localhost:12889/loadModel/<modelName> and wait for the response. <modelName> is the name of the model without the .h5 extension.


Tag a Document sent as CAS-XMI (for INCEpTION)

To tag a document with the default model send a POST request to localhost:12890/api/inception/v1/tag. Provide a JSON with the document in the body of the request as specified above. An appropriate model will be automatically chosen for tagging from request json. If the determined model is not found will reply with a HTTP 404 response.


Train a model from Documents sent as CAS-XMI (for INCEpTION)

To train a model using some previously tagged and completed documents send a POST request to localhost:12890/api/inception/v1/tag. Provide a JSON with the documents in the body of the request as specified above. An appropriate model will be automatically chosen for tagging from request json. If the determined model is not found this will build and train a new model otherwise re-train the existing model.

Notes :

  • To start training this system needs at least 30 new documents. Can be configured in config.py using training_config.training_document_threshold key.
  • No no need to re-send previously sent but documents that did not took part in training.

Tag a Document sent as String (for CASUS or other applications)

To tag a document with the default model send a POST request to localhost:12889/api/external/v1/tag. Provide a JSON with the document in the body of the request as specicified above, but without the case and user fields. To tag a document using a specific model send the POST request to localhost:12889/api/external/v1/tag/<modelName>. <modelName> is the name of the model without the .h5 extension.


Get automatic feedback for a Document sent as String (for CASUS)

To receive feedback regarding argument structure (using the default model - medicine) and the content of a text (default content currently medicine), send a POST request to localhost:12889/api/external/v1/feedback. Provide a JSON with the document in the body of the request as specicified above. To receive feedback regarding argument structure using a specific model and the content of a text (default content currently medicine), send a POST request to localhost:12889/api/external/v1/feedback/<modelName>. Provide a JSON with the document in the body of the request as specicified above.

Example scripts for minimal clients

  • client_inception.py - An example of expected inception tagging request.
  • client_inception_train.py - An example of expected inception training request.
  • client_external_partner_feedback.py - An example of expected tagging request from external partners to provide them with feed back.
  • client_external_partner_tagging.py - An example of expected external partner's tagging request.
  • IOB_to_CAS_file_generartor.py - An example of transforming IOB tagged files into CAS XMI. Some example of IOB files used are form Neural-Web Training Experiment project.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages