Feat dists readmes #596

Open
wants to merge 28 commits into base: dev
1f7480a
first batch of readmes to the key services, TBC
nstsj Jul 5, 2023
871fa3e
readmes for dream multilingual dist
nstsj Jul 8, 2023
cb96cbd
main distribution readmes upd, TBC
nstsj Jul 8, 2023
cd3fef8
first batch of readmes to the key services, TBC
nstsj Jul 5, 2023
c0ed857
readmes for dream multilingual dist
nstsj Jul 8, 2023
1b76ab2
main distribution readmes upd, TBC
nstsj Jul 8, 2023
bb8ca29
readmes upd
nstsj Jul 10, 2023
4f52edc
fixed readmes, those old ones with wrong format
nstsj Jul 10, 2023
7f05210
some more readmes
nstsj Jul 10, 2023
99be99d
added readme templates for the components that hadn't have them yet. …
nstsj Jul 26, 2023
3bcb9bd
readme upd for skills -- templates added
nstsj Oct 25, 2023
b771590
readme upd for skills -- templates added
nstsj Oct 25, 2023
1f3be1f
fixed readmes
nstsj Jul 27, 2023
6fc1aff
upd readmes in skill selectors and response selectors
nstsj Jul 29, 2023
e4a467a
upd in dependencies and I/O
nstsj Aug 4, 2023
7d7e41a
readmes for dream multilingual dist
nstsj Jul 8, 2023
889d8b1
readme upd - rebasing the branch after pulling the fresh dev
nstsj Oct 25, 2023
9762bdf
readme upd - rebasing the branch after pulling the fresh dev
nstsj Oct 25, 2023
f0851ab
fixes during rebase
nstsj Oct 25, 2023
fee4410
fixes during rebase
nstsj Oct 25, 2023
11318c7
fixes during rebase
nstsj Oct 25, 2023
7fb04ff
fixes during rebase
nstsj Oct 25, 2023
4ea75ac
fixes during rebase
nstsj Oct 25, 2023
2429faa
missing readme added
nstsj Aug 28, 2023
4788051
fixes during rebase
nstsj Oct 25, 2023
97956c4
fixed merge conflicts, re-updated files
nstsj Nov 28, 2023
58bf98b
ancient components readmes upd: added more content and examples
nstsj Nov 28, 2023
8fc38c9
added readmes for main Dream distributions, explaining their purpose …
nstsj Nov 28, 2023
14 changes: 11 additions & 3 deletions annotators/BadlistedWordsDetector/README.md
@@ -4,8 +4,16 @@
Spacy-based user utterance annotator that detects words and phrases from the badlist

## I/O
input: "sentences": ["fucking hell", "he mishit the shot", "you asshole"],
output: words and their tags
[{"bad_words": True}, {"bad_words": False}, {"bad_words": True}]
**Input:** a list of user utterances
```
["fucking hell", "he mishit the shot", "you asshole"]
```

**Output:** words and their tags
```
[{"bad_words": True}, {"bad_words": False}, {"bad_words": True}]
```
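For readers wiring this annotator into a pipeline, the request/response pair above can be sketched in Python. The payload and response shapes are copied from the I/O section; how the response is fetched from the service is not shown:

```python
# Shapes taken from the I/O section above; the service call itself is omitted.
payload = {"sentences": ["fucking hell", "he mishit the shot", "you asshole"]}

# The annotator returns one dict per input sentence:
response = [{"bad_words": True}, {"bad_words": False}, {"bad_words": True}]

# A typical consumer keeps only the flagged sentences:
flagged = [s for s, r in zip(payload["sentences"], response) if r["bad_words"]]
# flagged == ["fucking hell", "you asshole"]
```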


## Dependencies
none
20 changes: 16 additions & 4 deletions annotators/BadlistedWordsDetector_ru/README.md
@@ -1,11 +1,23 @@
# BadlistedWordsDetector
# BadlistedWordsDetector for Russian

## Description

Spacy-based user utterance annotator that detects words and phrases from the badlist. This version of the annotator works for the Russian Language.
Spacy-based user utterance annotator that detects words and phrases from the badlist.

This version of the annotator works for the Russian language.

## I/O
input: user input as a str, lang = ru
output: json dict
**Input:**
Takes a list of user utterances
```
["не пизди.", "застрахуйте уже его", "пошел нахер!"]
```

**Output:**
Returns words and their tags
```
[{"bad_words": True}, {"bad_words": False}, {"bad_words": True}]
```

## Dependencies
none
102 changes: 102 additions & 0 deletions annotators/COMeT/README.md
@@ -5,6 +5,7 @@
COMeT is a Commonsense Transformers for Automatic Knowledge Graph Construction service based
on [comet-commonsense](https://github.com/atcbosselut/comet-commonsense) framework written in Python 3.


### Quickstart from docker for COMeT with Atomic graph

```bash
@@ -36,6 +37,107 @@ docker-compose -f docker-compose.yml -f local.yml exec comet-conceptnet bash tes
| Average starting time | 4s | 3s |
| Average request execution time | 0.4s | 0.2s |

## Input/Output

**Input**
- `input`: a short event description involving PersonX
- `category`: a list of Atomic relation categories to generate inferences for

An input example:
```
{
"input": "PersonX went to a mall",
"category": [
"xReact",
"xNeed",
"xAttr",
"xWant",
"oEffect",
"xIntent",
"oReact"
]
}

```
**Output**
For each requested category, a list of generated commonsense inferences (beams) about the event:
- xReact
- xNeed
- xAttr
- xWant
- oEffect
- xIntent
- oReact

An output example:
```
{
"xReact": {
"beams": [
"satisfied",
"happy",
"excited"
],
"effect_type": "xReact",
"event": "PersonX went to a mall"
},
"xNeed": {
"beams": [
"to drive to the mall",
"to get in the car",
"to drive to the mall"
],
"effect_type": "xNeed",
"event": "PersonX went to a mall"
},
"xAttr": {
"beams": [
"curious",
"fashionable",
"interested"
],
"effect_type": "xAttr",
"event": "PersonX went to a mall"
},
"xWant": {
"beams": [
"to buy something",
"to go home",
"to shop"
],
"effect_type": "xWant",
"event": "PersonX went to a mall"
},
"oEffect": {
"beams": [
"they go to the store",
"they go to the mall"
],
"effect_type": "oEffect",
"event": "PersonX went to a mall"
},
"xIntent": {
"beams": [
"to buy something",
"to shop",
"to buy things"
],
"effect_type": "xIntent",
"event": "PersonX went to a mall"
},
"oReact": {
"beams": [
"happy",
"interested"
],
"effect_type": "oReact",
"event": "PersonX went to a mall"
}
}
```
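A minimal sketch of consuming a COMeT response shaped like the example above (the response dict is abbreviated from the example; the service call itself is not shown):

```python
# Abbreviated copy of the example response above (two categories shown).
comet_response = {
    "xReact": {
        "beams": ["satisfied", "happy", "excited"],
        "effect_type": "xReact",
        "event": "PersonX went to a mall",
    },
    "xWant": {
        "beams": ["to buy something", "to go home", "to shop"],
        "effect_type": "xWant",
        "event": "PersonX went to a mall",
    },
}

# Each category maps to ranked "beams" (commonsense inferences);
# the first beam is the top-ranked one.
top_inferences = {cat: body["beams"][0] for cat, body in comet_response.items()}
# top_inferences == {"xReact": "satisfied", "xWant": "to buy something"}
```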



## Dependencies

none
22 changes: 22 additions & 0 deletions annotators/ConversationEvaluator/README.md
@@ -0,0 +1,22 @@
# Conversation Evaluator

## Description
This annotator is trained on the Alexa Prize data from the previous competitions and predicts whether the candidate response is interesting, comprehensible, on-topic, engaging, or erroneous.

## Input/Output

**Input**
- possible assistant's replies
- the user's past responses

**Output**
The following tags, each with its probability:
- `isResponseComprehensible`
- `isResponseErroneous`
- `isResponseInteresting`
- `isResponseOnTopic`
- `responseEngagesUser`

## Dependencies
none
16 changes: 15 additions & 1 deletion annotators/DeepPavlovEmotionClassification/README.md
@@ -1 +1,15 @@
BERT Base model for emotion classification which learned at the custom dataset(described more precisely in our article)
# DeepPavlov Emotion Classification Annotator

## Description

BERT Base model for emotion classification, trained on a custom dataset (described in more detail in our article)

## I/O

**Input:**

**Output:**


## Dependencies
none
9 changes: 9 additions & 0 deletions annotators/DeepPavlovFactoidClassification/README.md
@@ -0,0 +1,9 @@
# Title
## Description

## Input/Output

**Input**
**Output**

## Dependencies
28 changes: 28 additions & 0 deletions annotators/NER/README.md
@@ -0,0 +1,28 @@
# Named Entity Recognition Annotator

## Description
Extracts person names, locations, and organization names from uncased text

## Input/Output

**Input**
A list of user utterances
```
["john peterson is my brother.", "he lives in New York."]
```


**Output**
A user utterance annotated with:
- confidence level
- the named entity's position in the sentence (`start_pos` and `end_pos`)
- the named entity itself
- the named entity type

```
[{"confidence": 1, "end_pos": 5, "start_pos": 3, "text": "New York", "type": "LOC"}],
```
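To make the annotation format concrete, here is a small sketch that filters entities from a response shaped like the example above (the entity dict is copied from the example; fetching it from the service is not shown, and the empty first sentence annotation is a simplification):

```python
# One list of entity dicts per input sentence, as in the example above.
ner_response = [
    [],  # "john peterson is my brother." -- entities omitted in this sketch
    [{"confidence": 1, "end_pos": 5, "start_pos": 3,
      "text": "New York", "type": "LOC"}],
]

# Collect all detected locations across sentences:
locations = [ent["text"]
             for sentence in ner_response
             for ent in sentence
             if ent["type"] == "LOC"]
# locations == ["New York"]
```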

## Dependencies
none
9 changes: 9 additions & 0 deletions annotators/NER_deeppavlov/README.md
@@ -0,0 +1,9 @@
# Title
## Description

## Input/Output

**Input**
**Output**

## Dependencies
9 changes: 9 additions & 0 deletions annotators/SentRewrite/README.md
@@ -0,0 +1,9 @@
# Title
## Description

## Input/Output

**Input**
**Output**

## Dependencies
9 changes: 9 additions & 0 deletions annotators/SentSeg/README.md
@@ -0,0 +1,9 @@
# Title
## Description

## Input/Output

**Input**
**Output**

## Dependencies
7 changes: 7 additions & 0 deletions annotators/asr/README.md
@@ -5,6 +5,13 @@
ASR component allows users to provide speech input via its `http://_service_name_:4343/asr?user_id=` endpoint. To do so, attach the recorded voice as a `.wav` file, 16KHz.

## I/O
**Input:**
user utterance: recorded voice as a `.wav` file

**Output:**
`asr_confidence`: the estimated confidence of the user's speech recognition
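The documented endpoint takes the recording as an attached `.wav` file. A sketch of building the request in Python follows; the `asr_url` helper, the service name `asr`, and the response field access in the comment are illustrative assumptions, while the port and query parameter come from the endpoint documented above:

```python
def asr_url(service_name: str, user_id: str) -> str:
    """Build the documented ASR endpoint for a given service name and user."""
    return f"http://{service_name}:4343/asr?user_id={user_id}"

# Sending the recording would then look roughly like:
#   import requests
#   with open("utterance.wav", "rb") as f:  # 16 KHz WAV, as required above
#       resp = requests.post(asr_url("asr", "some_user"), files={"file": f})

# asr_url("asr", "some_user") == "http://asr:4343/asr?user_id=some_user"
```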


## Dependencies
none

6 changes: 4 additions & 2 deletions annotators/combined_classification/README.md
@@ -21,12 +21,14 @@ The models were trained on the following datasets:

The model also contains 3 replacement models for Amazon services.

The models (multitask and comparative single task) were trained with initial learning rate 2e-5(with validation patience 2 it could be dropped 2 times), batch size 32,optimizer adamW(betas (0.9,0.99) and early stop on 3 epochs. The criteria on early stopping was average accuracy for all tasks for multitask models, or the single-task accuracy for singletask models.
The models (multitask and comparative single-task) were trained with an initial learning rate of 2e-5 (with validation patience 2, it could be dropped 2 times), batch size 32, optimizer AdamW (betas (0.9, 0.99)), and early stopping after 3 epochs. The early-stopping criterion was the average accuracy across all tasks for multitask models, or the single-task accuracy for single-task models.

This model (with a distilbert-base-uncased backbone) takes only 2439 Mb for 9 tasks, whereas a single-task model with the same backbone takes almost the same memory (~2437 Mb) for each of these 9 tasks.

## I/O
text here if i/o specified
**Input:** immediate user utterances (+ optional history of previous utterances)

**Output:** tags for each utterance (based on toxic/topic/emotion/sentiment/factoid/midas classification)

## Dependencies
none

6 changes: 4 additions & 2 deletions annotators/combined_classification_lightweight/README.md
@@ -21,13 +21,15 @@ The models were trained on the following datasets:

The model also contains 3 replacement models for Amazon services.

The models (multitask and comparative single task) were trained with initial learning rate 2e-5(with validation patience 2 it could be dropped 2 times), batch size 32,optimizer adamW(betas (0.9,0.99) and early stop on 3 epochs. The criteria on early stopping was average accuracy for all tasks for multitask models, or the single-task accuracy for singletask models.
The models (multitask and comparative single-task) were trained with an initial learning rate of 2e-5 (with validation patience 2, it could be dropped 2 times), batch size 32, optimizer AdamW (betas (0.9, 0.99)), and early stopping after 3 epochs. The early-stopping criterion was the average accuracy across all tasks for multitask models, or the single-task accuracy for single-task models.

This model (with a huawei-noah/TinyBERT_General_4L_312D backbone) takes 42% less time for CPU-only inference than combined_classification, while using only ~1.5 Gb of memory instead of the 2909 Mb for combined_classification. At the same time, its average accuracy and average F1 are only ~1.5% lower than for combined_classification, and this drop is consistent across all tasks.


## I/O

**Input:** immediate user utterances (+ optional history of previous utterances)

**Output:** tags for each utterance (based on toxic/topic/emotion/sentiment/factoid/midas classification)

## Dependencies
none

41 changes: 26 additions & 15 deletions annotators/custom_entity_linking/README.md
@@ -4,29 +4,40 @@
This component is an annotator that semantically links entities detected in user utterances. Entities are then bound via relations.

Relation examples:
- favorite animal
- like animal
- favorite book
- like read
- favorite movie
- favorite food
- like food
- favorite drink
- like drink
- favorite sport
- like sports
- `favorite animal`
- `like animal`
- `favorite book`
- `like read`
- `favorite movie`
- `favorite food`
- `like food`
- `favorite drink`
- `like drink`
- `favorite sport`
- `like sports`


## I/O

**Inpunt**
**Input**
Takes a list of `user_id`, entity substrings, and `entity_tags`

An input example:
```
```

**Output:**
the annotator returns:
**Output:**
processed information about:
- entities
- entity_id (ids for multiple entities)
- entity_confidence score
- entity_id_tags


An output example:
```
```

## Dependencies
- annotators.ner
- annotators.entity_detection
- annotators.spacy_nounphrases
9 changes: 9 additions & 0 deletions annotators/dialog_breakdown/README.md
@@ -0,0 +1,9 @@
# Title
## Description

## Input/Output

**Input**
**Output**

## Dependencies