Merge pull request #594 from deeppavlov/dev
Release v1.12.0
dilyararimovna committed Dec 27, 2023
2 parents 3b8f2b7 + b6f6726 commit 8dab819
Showing 275 changed files with 12,427 additions and 262 deletions.
34 changes: 18 additions & 16 deletions MODELS.md


7 changes: 7 additions & 0 deletions README.md
@@ -72,6 +72,13 @@ and the provided information will be used in LLM-powered reply generation as a p

# Quick Start

### System Requirements

- Operating system: Ubuntu 18.04+, Windows 10+ (via WSL & WSL2), MacOS Big Sur;
- `docker` version 20 or higher;
- `docker-compose` version v1.29.2;
- RAM: from 2 GB (when using proxy), from 4 GB (for LLM-based prompted distributions), and from 20 GB (for the older scripted distributions).
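The version requirements above can be checked with a small shell helper that compares dotted version strings (a sketch; the literal versions below are illustrative, substitute the output of `docker --version` and `docker-compose --version`):

```shell
# Sketch: compare a dotted version string against a required minimum using
# sort -V. The literal version numbers below are illustrative only.
version_ge() {
  # succeeds when version $1 is greater than or equal to version $2
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# e.g. check the version reported by `docker --version` against the minimum of 20
if version_ge "20.10.7" "20"; then
  echo "docker version OK"
fi
```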

### Clone the repo

```
61 changes: 36 additions & 25 deletions README_ru.md
@@ -65,6 +65,15 @@ Deepy GoBot Base содержит аннотатор исправления оп

# Quick Start


### System Requirements

- Operating system: Ubuntu 18.04+, Windows 10+ (via WSL & WSL2), MacOS Big Sur;
- `docker` version 20 or higher;
- `docker-compose` version v1.29.2;
- RAM: from 2 GB (when using proxy containers), from 4 GB (for LLM-based distributions), and from 20 GB (for scripted distributions).


### Clone the repository

```
@@ -189,33 +198,35 @@ docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.ove

## Annotators

| Name | Requirements | Description |
|----------------------------|------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Badlisted Words | 50 MB RAM | detects obscene Russian words from the badlist |
| Entity Detection | 5.5 GB RAM | extracts entities and their types from utterances |
| Entity Linking | 400 MB RAM | finds Wikidata entity ids for the entities detected with Entity Detection |
| Fact Retrieval             | 6.5 GiB RAM, 1 GiB GPU | extracts Wikipedia paragraphs relevant to the dialog history                                                                                                                                  |
| Intent Catcher | 900 MB RAM | classifies user utterances into a number of predefined intents which are trained on a set of phrases and regexps |
| NER | 1.7 GB RAM, 4.9 GB GPU | extracts person names, names of locations, organizations from uncased text using ruBert-based (pyTorch) model |
| Relative Persona Extractor | 50 MB RAM | Annotator utilizing Sentence Ranker to rank persona sentences and selecting `N_SENTENCES_TO_RETURN` the most relevant sentences |
| Sentseg | 2.4 GB RAM, 4.9 GB GPU | recovers punctuation using ruBert-based (pyTorch) model and splits into sentences |
| Spacy Annotator | 250 MB RAM | token-wise annotations by Spacy |
| Spelling Preprocessing | 8 GB RAM | Russian Levenshtein correction model |
| Toxic Classification | 3.5 GB RAM, 3 GB GPU | Toxic classification model from Transformers specified as PRETRAINED_MODEL_NAME_OR_PATH |
| Wiki Parser | 100 MB RAM | extracts Wikidata triplets for the entities detected with Entity Linking |
| DialogRPT | 3.8 GB RAM, 2 GB GPU | DialogRPT model which is based on [Russian DialoGPT by DeepPavlov](https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2) and fine-tuned on Russian Pikabu Comment sequences |
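The Relative Persona Extractor row above can be illustrated with a toy ranking sketch (the word-overlap scorer below is a hypothetical stand-in; the real annotator calls the Sentence Ranker service to score relevance):

```python
# Toy sketch of the Relative Persona Extractor idea: rank persona sentences
# by relevance to the dialog context and keep the top N_SENTENCES_TO_RETURN.
# The word-overlap scorer is illustrative only.
N_SENTENCES_TO_RETURN = 2

def rank_persona(context, persona_sentences):
    def overlap(sentence):
        # count shared lowercase words between context and persona sentence
        return len(set(context.lower().split()) & set(sentence.lower().split()))
    ranked = sorted(persona_sentences, key=overlap, reverse=True)
    return ranked[:N_SENTENCES_TO_RETURN]

top = rank_persona(
    "i love hiking in the mountains",
    ["i enjoy hiking", "i have two cats", "mountains are my favorite place"],
)
# top[0] is the persona sentence sharing the most words with the context
```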

## Skills & Services
| Name | Requirements | Description |
|-----------------------|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| DialoGPT | 2.8 GB RAM, 2 GB GPU | [Russian DialoGPT by DeepPavlov](https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2) |
| Dummy Skill | | a fallback skill with multiple non-toxic candidate responses and random Russian questions |
| Personal Info Skill | 40 MB RAM | queries and stores user's name, birthplace, and location |
| DFF Generative Skill | 50 MB RAM | **[New DFF version]** generative skill which uses DialoGPT service to generate 3 different hypotheses |
| DFF Intent Responder | 50 MB RAM | provides template-based replies for some of the intents detected by Intent Catcher annotator |
| DFF Program Y Skill | 80 MB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot |
| DFF Friendship Skill | 70 MB RAM | **[New DFF version]** DFF-based skill to greet the user in the beginning of the dialog, and forward the user to some scripted skill |
| DFF Template Skill | 50 MB RAM | **[New DFF version]** DFF-based skill that provides an example of DFF usage |
| Seq2seq Persona-based | 1.5 GB RAM, 1.5 GB GPU | generative service based on Transformers seq2seq model, the model was pre-trained on the PersonaChat dataset to generate a response conditioned on a several sentences of the socialbot's persona |
| Text QA               | 3.8 GiB RAM, 5.2 GiB GPU | answers questions based on a given text                                                                                                                                                           |



14 changes: 11 additions & 3 deletions annotators/BadlistedWordsDetector/README.md
@@ -4,8 +4,16 @@
Spacy-based user utterance annotator that detects words and phrases from the badlist

## I/O
**Input:** a list of user's utterances
```
["fucking hell", "he mishit the shot", "you asshole"]
```

**Output:** words and their tags
```
[{"bad_words": True}, {"bad_words": False}, {"bad_words": True}]
```
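The matching logic can be sketched roughly as follows (a simplified stand-in: the badlist here is a hypothetical two-word set, and plain string splitting replaces the real spaCy tokenization):

```python
# Simplified sketch of badlist matching (hypothetical word list; the real
# annotator uses spaCy tokenization and a curated badlist).
BADLIST = {"fucking", "asshole"}

def detect_bad_words(sentences):
    results = []
    for sentence in sentences:
        # crude tokenization: split on whitespace, strip common punctuation
        tokens = {token.strip(".,!?").lower() for token in sentence.split()}
        results.append({"bad_words": bool(tokens & BADLIST)})
    return results

annotations = detect_bad_words(["fucking hell", "he mishit the shot", "you asshole"])
# annotations -> [{"bad_words": True}, {"bad_words": False}, {"bad_words": True}]
```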


## Dependencies
none
20 changes: 16 additions & 4 deletions annotators/BadlistedWordsDetector_ru/README.md
@@ -1,11 +1,23 @@
# BadlistedWordsDetector for Russian

## Description

Spacy-based user utterance annotator that detects words and phrases from the badlist.

This version of the annotator works for the Russian Language.

## I/O
**Input:**
Takes a list of user's utterances
```
["не пизди.", "застрахуйте уже его", "пошел нахер!"]
```

**Output:**
Returns words and their tags
```
[{"bad_words": True}, {"bad_words": False}, {"bad_words": True}]
```

## Dependencies
none
102 changes: 102 additions & 0 deletions annotators/COMeT/README.md
@@ -5,6 +5,7 @@
COMeT is a Commonsense Transformers for Automatic Knowledge Graph Construction service based on the [comet-commonsense](https://github.com/atcbosselut/comet-commonsense) framework, written in Python 3.


### Quickstart from docker for COMeT with Atomic graph

```bash
@@ -36,6 +37,107 @@ docker-compose -f docker-compose.yml -f local.yml exec comet-conceptnet bash tes
| Average starting time | 4s | 3s |
| Average request execution time | 0.4s | 0.2s |

## Input/Output

**Input**
- `input`: the event or utterance to reason about
- `category`: a list of relation types to generate inferences for

an input example:
```
{
"input": "PersonX went to a mall",
"category": [
"xReact",
"xNeed",
"xAttr",
"xWant",
"oEffect",
"xIntent",
"oReact"
]
}
```
**Output**
generated commonsense inferences (`beams`) for each of the requested categories:
- xReact
- xNeed
- xAttr
- xWant
- oEffect
- xIntent
- oReact

an output example:
```
{
"xReact": {
"beams": [
"satisfied",
"happy",
"excited"
],
"effect_type": "xReact",
"event": "PersonX went to a mall"
},
"xNeed": {
"beams": [
"to drive to the mall",
"to get in the car",
"to drive to the mall"
],
"effect_type": "xNeed",
"event": "PersonX went to a mall"
},
"xAttr": {
"beams": [
"curious",
"fashionable",
"interested"
],
"effect_type": "xAttr",
"event": "PersonX went to a mall"
},
"xWant": {
"beams": [
"to buy something",
"to go home",
"to shop"
],
"effect_type": "xWant",
"event": "PersonX went to a mall"
},
"oEffect": {
"beams": [
"they go to the store",
"they go to the mall"
],
"effect_type": "oEffect",
"event": "PersonX went to a mall"
},
"xIntent": {
"beams": [
"to buy something",
"to shop",
"to buy things"
],
"effect_type": "xIntent",
"event": "PersonX went to a mall"
},
"oReact": {
"beams": [
"happy",
"interested"
],
"effect_type": "oReact",
"event": "PersonX went to a mall"
}
}
```
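A downstream component can iterate over the returned relations, for example flattening the beams into (relation, phrase) pairs (a sketch over a trimmed copy of the example output above):

```python
# Sketch: flattening COMeT output beams into (relation, phrase) pairs.
# comet_output is a trimmed copy of the example response above.
comet_output = {
    "xReact": {"beams": ["satisfied", "happy", "excited"],
               "effect_type": "xReact", "event": "PersonX went to a mall"},
    "xWant": {"beams": ["to buy something", "to go home", "to shop"],
              "effect_type": "xWant", "event": "PersonX went to a mall"},
}

def flatten_beams(output):
    # one (relation, phrase) pair per generated beam
    return [(rel, phrase)
            for rel, data in output.items()
            for phrase in data["beams"]]

pairs = flatten_beams(comet_output)
# pairs[0] -> ("xReact", "satisfied")
```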



## Dependencies

none
22 changes: 22 additions & 0 deletions annotators/ConversationEvaluator/README.md
@@ -0,0 +1,22 @@
# Conversation Evaluator

## Description
This annotator is trained on the Alexa Prize data from the previous competitions and predicts whether the candidate response is interesting, comprehensible, on-topic, engaging, or erroneous.

## Input/Output

**Input**
- possible assistant's replies
- user's past responses

**Output**
tags
- `isResponseComprehensible`
- `isResponseErroneous`
- `isResponseInteresting`
- `isResponseOnTopic`
- `responseEngagesUser`

with their probabilities
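One way a response selector might consume these probabilities is a weighted sum over the tags (a hypothetical sketch; the weights and scores below are illustrative, not the ones used by Dream):

```python
# Hypothetical ranking of response candidates by evaluator tag probabilities.
# Tag names match the output above; weights are illustrative only.
WEIGHTS = {
    "isResponseInteresting": 1.0,
    "isResponseOnTopic": 1.0,
    "responseEngagesUser": 1.0,
    "isResponseComprehensible": 0.5,
    "isResponseErroneous": -2.0,  # erroneous responses are penalized
}

def score(evaluation):
    return sum(WEIGHTS[tag] * prob for tag, prob in evaluation.items())

candidates = [
    {"isResponseInteresting": 0.9, "isResponseOnTopic": 0.8,
     "responseEngagesUser": 0.7, "isResponseComprehensible": 0.9,
     "isResponseErroneous": 0.1},
    {"isResponseInteresting": 0.2, "isResponseOnTopic": 0.3,
     "responseEngagesUser": 0.2, "isResponseComprehensible": 0.8,
     "isResponseErroneous": 0.6},
]
best = max(range(len(candidates)), key=lambda i: score(candidates[i]))
# best -> index of the highest-scoring candidate
```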

## Dependencies
none
16 changes: 15 additions & 1 deletion annotators/DeepPavlovEmotionClassification/README.md
@@ -1 +1,15 @@
# DeepPavlov Emotion Classification Annotator

## Description

BERT Base model for emotion classification trained on a custom dataset (described in more detail in our article)

## I/O

**Input**

**Output:**


## Dependencies
none
9 changes: 9 additions & 0 deletions annotators/DeepPavlovFactoidClassification/README.md
@@ -0,0 +1,9 @@
# DeepPavlov Factoid Classification
## Description

## Input/Output

**Input**
**Output**

## Dependencies
4 changes: 3 additions & 1 deletion annotators/IntentCatcherTransformers/README.md
@@ -1,5 +1,7 @@
## Intent Catcher based on Transformers

The Intent Catcher annotator adapts the dialog system to particular tasks.
It detects user intents that are then addressed by the DFF Intent Responder Skill.

The English version was trained on the `intent_phrases.json` dataset using the `DeepPavlov` library via the command:
```
28 changes: 28 additions & 0 deletions annotators/NER/README.md
@@ -0,0 +1,28 @@
# Named Entity Recognition Annotator

## Description
Extracts person names, locations, and organization names from uncased text

## Input/Output

**Input**
A list of user utterances
```
["john peterson is my brother.", "he lives in New York."]
```


**Output**
Each detected named entity annotated with
- confidence level
- the entity's position in the sentence (`start_pos` and `end_pos`)
- the entity text itself
- the entity type

```
[{"confidence": 1, "end_pos": 5, "start_pos": 3, "text": "New York", "type": "LOC"}]
```
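Under the assumption (inferred from the example above) that `start_pos`/`end_pos` index whitespace tokens with an exclusive end, the entity span can be recovered like this:

```python
# Sketch: recovering an entity span from token positions in the annotation.
# Assumption: start_pos/end_pos index whitespace tokens, end exclusive.
utterance = "he lives in New York."
annotation = {"confidence": 1, "end_pos": 5, "start_pos": 3,
              "text": "New York", "type": "LOC"}

tokens = utterance.split()
# join the annotated token range, dropping trailing sentence punctuation
span = " ".join(tokens[annotation["start_pos"]:annotation["end_pos"]]).rstrip(".")
# span -> "New York"
```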

## Dependencies
none
9 changes: 9 additions & 0 deletions annotators/NER_deeppavlov/README.md
@@ -0,0 +1,9 @@
# NER (DeepPavlov)
## Description

## Input/Output

**Input**
**Output**

## Dependencies
9 changes: 9 additions & 0 deletions annotators/SentRewrite/README.md
@@ -0,0 +1,9 @@
# Sentence Rewriting
## Description

## Input/Output

**Input**
**Output**

## Dependencies
9 changes: 9 additions & 0 deletions annotators/SentSeg/README.md
@@ -0,0 +1,9 @@
# Sentence Segmentation
## Description

## Input/Output

**Input**
**Output**

## Dependencies
7 changes: 7 additions & 0 deletions annotators/asr/README.md
@@ -5,6 +5,13 @@
The ASR component lets users provide speech input via its `http://_service_name_:4343/asr?user_id=` endpoint. To do so, attach the recorded voice as a 16 kHz `.wav` file.

## I/O
**Input:**
the user's utterance: recorded voice as a 16 kHz `.wav` file

**Output:**
`asr_confidence`: the estimated confidence of the speech recognition

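Building the request URL can be sketched as follows (the service name and user id below are placeholders; the recorded `.wav` is then attached to a POST request to this URL):

```python
from urllib.parse import urlencode

# Sketch: constructing the ASR endpoint URL. The service name and user id
# are placeholders -- substitute the actual values for your deployment.
service_name = "asr"   # placeholder for _service_name_
user_id = "some_user"  # placeholder
url = f"http://{service_name}:4343/asr?{urlencode({'user_id': user_id})}"
# url -> "http://asr:4343/asr?user_id=some_user"
# the 16 kHz .wav recording is then attached to a POST request to this URL
```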

## Dependencies
none
