To run this project, you will need to add the following environment variables to your .env file:
- Label Studio configs:
  - LABEL_STUDIO_URL: URL of the Label Studio instance. E.g.: http://label-studio:8080.
  - LABEL_STUDIO_LEGACY_TOKEN: Legacy token for the Label Studio API. You can generate it in the Label Studio settings page.
For example:
# .env
LABEL_STUDIO_URL=http://label-studio:8080
LABEL_STUDIO_LEGACY_TOKEN=eyJhb***
You can also check out the file .env.example to see all required environment variables.
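Before starting the services, it can help to confirm that both variables are actually set. The following is a minimal sketch, not part of the project: it parses simple KEY=VALUE lines by hand, and the helper names `parse_env` and `missing_vars` are hypothetical.

```python
from pathlib import Path

# The two variables this README lists as required.
REQUIRED_VARS = ("LABEL_STUDIO_URL", "LABEL_STUDIO_LEGACY_TOKEN")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_vars(env: dict[str, str], required=REQUIRED_VARS) -> list[str]:
    """Return the required variable names that are absent or empty."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    missing = missing_vars(env)
    print("Missing variables:", ", ".join(missing) if missing else "none")
```

This handles only plain assignments; multi-line values or `export` prefixes would need a real dotenv parser.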
- Docker and Docker Compose installed on your machine.
Clone the project:
git clone https://github.com/v-bible/nlp-label-studio.git
Go to the project directory:
cd nlp-label-studio
Start the services:
docker-compose up -d
Access Label Studio in your browser:
http://localhost:8080
Note
You may have to wait a few minutes for the services to start.
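Instead of refreshing the browser while the services start, the wait can be scripted. This is a rough sketch using only the standard library; the function names, the 60 attempts, and the 5-second delay are arbitrary assumptions, not project settings.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, reset, or timed out: not ready yet.
        return False


def wait_until_up(probe, attempts: int = 60, delay: float = 5.0, sleep=time.sleep) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Usage (blocks for up to about five minutes with the defaults):
#   ready = wait_until_up(lambda: is_up("http://localhost:8080"))
```

Passing the probe as a callable keeps the retry loop independent of the network, so the same loop can wait on other services as well.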
Note
Please refer to v-bible/nlp for more information
label | category | examples | vietnamese examples |
---|---|---|---|
PER | person | Jesus, Mary, Peter, Paul, John,... | Giêsu, Maria, Phêrô, Phaolô, Gioan,... |
LOC | location | Jerusalem, Rome, Bethlehem,... | Giêrusalem, Rôma, Bêlem,... |
ORG | organization | Vatican, Catholic Church,... | Vatican, Giáo Hội Công Giáo,... |
TITLE | title | Pope, Bishop, Cardinal,... | Giáo hoàng, Giám mục, Hồng y,... |
TME | time | Sunday, Monday, January,... | Chúa Nhật, Thứ Hai, Tháng Giêng,... |
NUM | number | 1, 2, 3, 4, 5,... | 1, 2, 3, 4, 5,... |
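To make the label set concrete, here is a hedged sketch of how a labeled span could be represented in Label Studio's JSON result format. It assumes the control is named `label` and the text object is named `text`, as in ner-config.xml; the sentence is a made-up example built from names in the table, and `make_span` is a hypothetical helper.

```python
# The six labels from the table above.
LABELS = {"PER", "LOC", "ORG", "TITLE", "TME", "NUM"}


def make_span(text: str, phrase: str, label: str) -> dict:
    """Build one labeled-span result for the first occurrence of `phrase`."""
    if label not in LABELS:
        raise ValueError(f"Unknown label: {label}")
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return {
        "from_name": "label",  # name of the <Labels> control (assumed)
        "to_name": "text",     # name of the <Text> object (assumed)
        "type": "labels",
        "value": {
            "start": start,              # character offset, inclusive
            "end": start + len(phrase),  # character offset, exclusive
            "text": phrase,
            "labels": [label],
        },
    }


# Illustrative sentence only, not taken from the project's corpus.
sentence = "Phêrô và Gioan lên Giêrusalem."
spans = [
    make_span(sentence, "Phêrô", "PER"),
    make_span(sentence, "Gioan", "PER"),
    make_span(sentence, "Giêrusalem", "LOC"),
]
```

Rejecting unknown labels early avoids importing predictions that the labeling interface cannot display.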
In docker-compose.yaml, by default the credentials are set to:
LABEL_STUDIO_USERNAME=example@gmail.com
LABEL_STUDIO_PASSWORD=admin
To get the Label Studio Legacy API token, go to http://localhost:8080/organization > API Tokens Settings > Check Legacy Tokens > Save.
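Once the token is generated, one way to check it is to request the project list. Legacy tokens are sent in an `Authorization: Token <token>` header. The sketch below only builds the request; `projects_request` is a hypothetical helper, and the fallback values are placeholders.

```python
import os
import urllib.request


def projects_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the project list, authenticated with a legacy token."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/projects",
        headers={"Authorization": f"Token {token}"},
    )


request = projects_request(
    os.environ.get("LABEL_STUDIO_URL", "http://localhost:8080"),
    os.environ.get("LABEL_STUDIO_LEGACY_TOKEN", "<your-legacy-token>"),
)

# To send it:
#   status = urllib.request.urlopen(request, timeout=10).status
# A 200 status suggests the token was accepted; a 401 error suggests it was not.
```

Note that http://label-studio:8080 resolves only inside the Docker Compose network; from the host machine, http://localhost:8080 is the address to use.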
For NER tasks, you can use the template provided by Label Studio. You can customize the interface with ner-config.xml.
<!-- ner-config.xml -->
<View style="display:flex; gap:16px;">
  <!-- Main Text Area -->
  <View style="flex:1;">
    <Text name="text" value="$text"/>
  </View>
  <!-- Metadata + Label Panel -->
  <View style="position:sticky; top:10px; align-self:flex-start; max-width:250px; width:100%; padding:8px; border:1px solid #ccc; border-radius:4px;">
    <!-- Label Panel -->
    <Labels name="label" toName="text" showInline="false">
      <Label value="PER" background="#e6194b"/>
      <Label value="LOC" background="#3cb44b"/>
      <Label value="ORG" background="#4363d8"/>
      <Label value="TITLE" background="#f58231"/>
      <Label value="TME" background="#911eb4"/>
      <Label value="NUM" background="#ffe119"/>
    </Labels>
    <!-- Metadata Section -->
    <View style="font-size:12px; margin-top:12px;">
      <Header value="Metadata" size="4"/>
      <Header value="Document ID: $documentId" size="5"/>
      <Header value="Sentence ID: $sentenceId" size="5"/>
      <Header value="Sentence Type: $sentenceType" size="5"/>
      <Header value="Language Code: $languageCode" size="5"/>
      <Header value="Title: $title" size="5"/>
      <Header value="Genre: $genreCode" size="5"/>
    </View>
  </View>
</View>
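Each `$variable` in this config is filled from the matching key in a task's `data` object, so imported tasks should provide all of them. The sketch below checks a task against a shortened excerpt of the config; the helper names and every field value are placeholders, not real project data.

```python
import json
import re


def config_variables(config_xml: str) -> set[str]:
    """Collect the $variable names referenced in a labeling config."""
    return set(re.findall(r"\$(\w+)", config_xml))


def missing_fields(task: dict, config_xml: str) -> set[str]:
    """Return the config variables that the task's data does not provide."""
    return config_variables(config_xml) - set(task["data"])


# Shortened excerpt of ner-config.xml, enough to show the idea.
config_excerpt = """
<Text name="text" value="$text"/>
<Header value="Document ID: $documentId" size="5"/>
<Header value="Sentence ID: $sentenceId" size="5"/>
"""

# Placeholder values for illustration only.
task = {
    "data": {
        "text": "Phêrô và Gioan lên Giêrusalem.",
        "documentId": "doc-001",
        "sentenceId": "sent-001",
    }
}

print(json.dumps(task, ensure_ascii=False, indent=2))
print("Missing fields:", sorted(missing_fields(task, config_excerpt)))
```

Running the same check against the full config would also require `sentenceType`, `languageCode`, `title`, and `genreCode`.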
To connect to the Machine Learning backend, you can use your local IP address:
http://<your-local-ip>:9090
or the huggingface_ner service name in the Docker Compose network:
http://huggingface_ner:9090
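Before registering either URL in Label Studio, it may be worth checking that the backend responds. This sketch assumes the backend exposes a `/health` endpoint returning JSON with `"status": "UP"`, as backends built on the Label Studio ML SDK typically do; if this backend behaves differently, adjust accordingly. The helper names are hypothetical.

```python
import json
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health-check URL for a backend base URL."""
    return base_url.rstrip("/") + "/health"


def is_healthy(body: bytes) -> bool:
    """Interpret a health response body; expects JSON with status 'UP' (assumed)."""
    try:
        return json.loads(body).get("status") == "UP"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def check_backend(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the backend answers its health check, False otherwise."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
            return is_healthy(response.read())
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses.
        return False


# Usage from the host machine:
#   check_backend("http://localhost:9090")
```

The huggingface_ner hostname resolves only from containers on the same Compose network, so a check run on the host needs the local address instead.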
Studio data is mounted to the studio-data directory.
Machine Learning backend data is mounted to the ml-backend-data directory.
Note
As this is a non-root container, the mounted files and directories must have the proper permissions for the UID 1001.
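A quick way to spot permission problems is to list mounted paths that are not owned by UID 1001. The sketch below is POSIX-only and assumes the two directories sit in the current working directory; `wrong_owner` is a hypothetical helper.

```python
import os
from pathlib import Path

# UID the container runs as, per the note above.
CONTAINER_UID = 1001


def wrong_owner(root: str, uid: int = CONTAINER_UID) -> list[Path]:
    """List `root` and everything under it that is not owned by `uid`."""
    root_path = Path(root)
    candidates = [root_path, *root_path.rglob("*")]
    # lstat() inspects symlinks themselves rather than their targets.
    return [path for path in candidates if path.lstat().st_uid != uid]


for directory in ("studio-data", "ml-backend-data"):
    if os.path.isdir(directory):
        count = len(wrong_owner(directory))
        print(f"{directory}: {count} entries not owned by UID {CONTAINER_UID}")
```

Ownership is only one part of "proper permissions"; mode bits may also matter, and changing ownership usually requires root privileges.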
Contributions are always welcome!
Please read the contribution guidelines.
Please read the Code of Conduct.
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.
See the LICENSE.md file for full details.
Duong Vinh - @duckymomo20012 - tienvinh.duong4@gmail.com
Project Link: https://github.com/v-bible/nlp-label-studio.