Architecture diagram is badly over-compressed and low quality #110

Open
oserikov opened this issue Feb 23, 2022 · 0 comments
Comments

@oserikov
Contributor

https://github.com/deepmipt/dream/blob/main/DREAM.png

IgnatovFedor pushed a commit that referenced this issue Dec 6, 2023
* formatting: first commit

* unify summary descriptions

* formatting for titles completed

* fix compose_variables; fix getting parts of report; fix summary length prompts; fix formatting

* fix: verify=False for getting files

* improve some prompts

* working formatting

* codestyle

* add comments

* formatting fixes

* sent most of logic to utils

* codestyle
IgnatovFedor added a commit that referenced this issue Dec 6, 2023
* feat: Azure OpenAI

* fix: black

* refactor: changed davinci3 to 2

* fix: tests

* refactor: added missing newline

* refactor: code formatting

* fix: use .env_secret_azure for additional env vars for azure

* fix: use .env_azure for public services

* feat: azure api variables

* fix: use .env_azure for public services

* feat: created .env_secret_azure

* fix: use .env_azure for management assistants

* Feat/doc skills turnon logic to common (#94)

* move doc skills logic to common; introduce it to desc based skill selector

* turn on doc-based skills if we have doc in use for desc based skill selector; complex checks for llm based skill selector

* remove dff_meeting_analysis_skill from automatically added skills

* add comment about turning on doc based skills

* add doc-skill turn on logic to universal llm-based skill selector; also fix the issue with activating all skills from pipeline if there is an exception

* codestyle

* remove extra list(set())

* fixes acc to Dilya

* fix: skill selection logic with docs also

* fix: codestyle

* codestyle

* remove N_TURNS_TO_KEEP_DOC from skill selector

---------

Co-authored-by: dilyararimovna <dilyara.rimovna@gmail.com>

* Feat/weekly with separate files (#99)

* feat: management distribution

* fix: prompt selector for roles

* first commit for meeting analysis

* working distribution, but no meeting analysis yet

* prototype files

* prompts

* dff_meeting_analysis_skill instead of prompted; llm-based everything

* working version of meeting analysis skill

* dff_meeting_analysis_skill with 4 nodes

* doc-processor annotator

* added saving previous meeting analysis results; links to them are written to bot attrs

* update roles

* fix for meeting analysis skill, now working

* document only for now, then will be deleted

* prompt for unabridged response selection

* refactor doc_processor, remove unnecessary funcs

* better prompts

* better skill description in components

* add llm-based-skill-selector to dist

* enable finding previously generated meeting analyses; better fallback

* 512 max_tokens for chatgpt in some cases

* enhance response selector prompt

* add dff_meeting_analysis_skill_formatter

* some fixes to cards and configs

* update readmes

* correct ports for doc processor; remove extra prompt

* codestyle

* codestyle

* fixes for Dilya

* enhanced checks

* typo

* codestyle + small fix for checks

* file moved to google drive

* remove extra print

* checking each file if processed; concatenating multiple files; two containers for doc-processor

* typo fix

* unique ids for files in data/, ids to paths in config

* delete transcript files

* codestyle

* fix: UIDs for files in data now working

* fixes in working with files

* codestyle

* fix error in getting related_files

* Revert "fix error in getting related_files"

This reverts commit 705e23897e9317e1ba24702b14e7c097da093dcd.

* working fix for bot_attrs_files

* remove document file

* numerous fixes for review

* codestyle

* bring some things to common

* even better funcs in common

* codestyle

* saving all processed docs in atts; saving candidate texts in adds of utt; link or path possible for processing from atts

* fixes for accidentally broken stuff

* some more fixes

* candidate texts to hyp attributes

* codestyle

* FILE_SERVER_TIMEOUT as arg

* GENERATIVE_SERVICE_URL as arg

* fix: formatters in pipeline_conf

* component card for vectorize_documents

* openai-chatgpt-long.json for document-qa-llm-skill

* openai-chatgpt-long.json for meeting-analysis-skill

* fix: timeouts and component card paths

* add regex for http check

* doc processor names in service_config files

* update getting envvars

* codestyle

* fix: remove envvars from everywhere

* fix: remove envvars from everywhere

* fixes: details in cards and pipeline

* fixes: details in cards and pipeline

* feat: special message if failed to process file from atts

* get token limit from service endpoint

* fix: better upload_document, try except inside func & enable both text and file upload in one func

* docstrings; also fix: detecting extension for links

* codestyle

* again codestyle

* update READMEs with dialog state info

* fix: add diff endpoints to doc-retriever readme

* fix: solve inconsistencies in cards and readmes

* fix: incorrect formatters in cards

* update ports to non-allocated ones

* fixes: everything acc to comments

* codestyle

* generalize file service url in another comment

* codestyle

* refactor attributes structure

* update readmes to include info about new attributes format

* fix: clean config; comment about format

* add comments; {FILE_SERVER_URL} instead of actual url

* comments and readmes

* implement storing doc for N_TURNS_FOR_DISCUSSION turns

* codestyle

* improve N_TURNS_FOR_DISCUSSION, implement only for doc-processor-from-atts

* better logging in doc-retriever

* codestyle

* more comments

* codestyle

* delete extra logs

* some more comments

* count n_steps_discussed in any case; put that to readme

* fix: n_steps_discussed in correct place

* fix: if file was processed earlier, take processed text from processed_documents

* if we get doc from somewhere, consider it good as new -> reset n_steps_discussed to 0

* codestyle

* update comments; fix logic of n_steps_discussed

* better comments

* fix: small fixes

* N_TURNS_FOR_DISCUSSION: -> N_TURNS_TO_KEEP_DOC

* N_TURNS_TO_KEEP_DOC in distribution files

* N_TURNS_TO_KEEP_DOC: 10 ->; also updates in readmes and comments

* codestyle

* comment about N_TURNS_TO_KEEP_DOC

* comment about N_TURNS_TO_KEEP_DOC

* fix: remove sentseg from management dist

* better descriptions for skills

* fix hyp format for dff_meeting_analysis_skill

* fixes: remove logs, improve skill description

* ensure unique ids everywhere; add dialog_id to file_id

* update skill selector: turn off doc-based skills when we don't have doc

* codestyle

* codestyle again

* remove one extra log

* now we can also process files from file server

* codestyle

* fix: is_container_running to response.py

* fix to prompts; also longer context for many services

* always turn on document-based qa skill

* codestyle

* add file exists check

* start adding question_answering default node

* node for question answering in meeting analysis skill; small change in llm-based-skill-selector

* codestyle

* condition file

* Dilya's fixes for skill-selector

* codestyle

* slightly improve prompt for response selector

* fix: chunks only split by newlines

* fix: no extra info in prompts; better response selector

* small fixes

* codestyle

* added list title

* codestyle

* codestyle

* moved is_container_running up

* fix: tags: selector

* add check if skill to add is in pipeline

* shorter prompt for response selector

* copy older dist with tf-idf qa as management_assistant_extended

* remove tf-idf qa skill from management assistant

* update description for meeting analysis skill

* remove doc-retriever from main distribution

* better guidance for qa

* feat: turn on dff_meeting_analysis_skill when it was used with the same doc before

* codestyle

* codestyle

* fix: only perform doc-related checks in skill selector if we actually have a doc in use

* fix: include situation when we don't have prev_skills or prev_docs in skill selector

* use gpt4 for meeting analysis skill

* feat: add progress by areas

* improve prompts

* gpt-4 response selector

* feat: weekly reports, draft

* improved prompts for showing titles

* huge timeout

* add re.DOTALL flag

* fix: regex for conditions

* now working with separate files in use

* update attributes format (for docs_in_use)

* update test files for new attribute format

* codestyle

* update annotator readmes

* update skill for new attributes format

* improve comments

* switch to chatgpt

* fixing conflicts from merge

* fix things lost during merge

* codestyle

* add some more accidentally lost info

* return accidentally lost change

* changes for Dilya

* filetype exception - remove logging

* remove sentry from utils.py

* flake8 improve work with exception; update info about meeting skill in extended dist

* update envvars

* remove unnecessary const

---------

Co-authored-by: dilyararimovna <dilyara.rimovna@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-82-243.ec2.internal>

* fix: remove envvars to send from attributes (#102)

* Feat/check before question answering node in meeting analysis (#104)

* first commit for check before call LLM

* condition for calling gpt4: WIP

* condition for calling gpt4: WIP-2

* working check before qa node

* docker container arg SHORT_GENERATIVE_SERVICE everywhere; fix README

* codestyle

* update docs_in_use; add comment

* move prompts to common

* fix typo

* Feat/summary length options (#105)

* feat: length of summary now controllable

* codestyle

* flag re.IGNORECASE

* gpt4 for response generation and selection in management assistant dists (#106)

* replace chatgpt with gpt4 for response generation and selection in management assistant dists

* add gpt4 container to management_assistant

* also add to dev.yml

* llm-based-response-selector-gpt4

* fixes acc to Dilya

* feat: show up google api skill (#52)

* feat: show up google api skill

* fix: do not use envvars to send in google api skill

* fix: timeout for google api skill

* fix: do not wait for google api

* fix: short_generative_service in correct Dockerfile (#107)

* Feat/nice formatting (#110)

* formatting: first commit

* unify summary descriptions

* formatting for titles completed

* fix compose_variables; fix getting parts of report; fix summary length prompts; fix formatting

* fix: verify=False for getting files

* improve some prompts

* working formatting

* codestyle

* add comments

* formatting fixes

* sent most of logic to utils

* codestyle

* fix: use .env_azure

---------

Co-authored-by: dilyararimovna <dilyara.rimovna@gmail.com>
Co-authored-by: Nika Smilga <42929200+smilni@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-82-243.ec2.internal>