
Bring better Q&A generation capabilities #505

Open
quovadim opened this issue Apr 24, 2024 · 0 comments · Fixed by #517
@quovadim
Collaborator

No description provided.

@quovadim quovadim self-assigned this Apr 24, 2024
quovadim added a commit that referenced this issue May 20, 2024
Closes #505

Prompts are moved into a subclass; better prompts for text generation were added, and chain-of-thought capabilities were added for prompts.

Main new features:
- Added native, automatic support for JSON generation using the OpenAI response-format structure
- Added prompt-level validation of generated answers
- Added prompt classes with automatic validation of prompts and parameters
- Added support for non-strict queries: a particular prompt response may now fail without aborting the run
- Added support for chain-of-thought reasoning (see the examples in the Q&A generation prompts)
- Moved prompts into separate .txt files
- Q&A generation can now fail to generate some questions without consequences, as long as at least one question is generated
- Changed the Python version to 3.11
- Some prompts that previously returned text now return JSON
- Some prompts that previously returned JSON now return text
- The signature of generate_response now depends on the prompt's input parameters
- Refactored all prompts (except ragas) and added examples
- Removed some try/except blocks thanks to the new handling of non-strict responses: any exception raised by response_generator is now treated as critical. If the NonStrict tag is set, a failed execution returns None instead, unless the exception is critical (i.e. a problem in the code rather than in LLM generation)
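
The non-strict behaviour described above can be sketched roughly as follows. This is an illustrative sketch only, not the repository's actual API: the names `generate_response`, `CriticalError`, and the `non_strict` flag are assumptions standing in for the real classes.

```python
import json


class CriticalError(Exception):
    """A code-level failure (not an LLM hiccup) that must always propagate."""


def generate_response(prompt, llm_call, non_strict=False):
    """Run an LLM prompt and parse its JSON payload.

    Hypothetical sketch of the non-strict pattern: on a non-strict prompt,
    any generation failure yields None; critical errors always raise.
    """
    try:
        raw = llm_call(prompt)
        return json.loads(raw)  # prompts now return JSON payloads
    except CriticalError:
        raise                   # bugs in our own code always surface
    except Exception:
        if non_strict:
            return None         # a failed generation is tolerated
        raise                   # strict prompts must succeed
```

Under this scheme, Q&A generation can invoke the sketch once per question and keep every non-None result, which matches the "at least one question is generated" behaviour listed above.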

Metrics and comparison

Comparison on data generated using this branch

| Metric | New Data, New Prompts | New Data, Old Prompts |
|-------------------------------------------|-----------------------|-----------------------|
| fuzzy | 75.88 | **77.05** |
| bert_all_MiniLM_L6_v2 | **80.58** | 79.60 |
| cosine | **75.21** | 68.88 |
| bert_distilbert_base_nli_stsb_mean_tokens | **76.90** | 76.14 |
| llm_answer_relevance | **72.59** | 67.60 |
| llm_context_precision | **91.03** | 84.62 |

Comparison on data generated using development branch

| Metric | Old Data, New Prompts | Old Data, Old Prompts |
|-------------------------------------------|-----------------------|-----------------------|
| fuzzy | **81.14** | 80.56 |
| bert_all_MiniLM_L6_v2 | **69.71** | 66.58 |
| cosine | **66.78** | 57.51 |
| bert_distilbert_base_nli_stsb_mean_tokens | **70.49** | 67.78 |
| llm_answer_relevance | **62.18** | 57.63 |
| llm_context_precision | **84.62** | 78.85 |

---------

Co-authored-by: Vadim Kirilin <vadimkirilin@Vadims-MacBook-Pro.local>