The architecture diagram is badly pixelated #110

oserikov · 2022-02-23T13:16:51Z

https://github.com/deepmipt/dream/blob/main/DREAM.png

