
Configurable feedback level #1105

Open · arn-tru wants to merge 3 commits into main

Conversation

@arn-tru (Contributor) commented Apr 29, 2024

Instead of providing a rating for LLM-generated feedback between 0.1 and 1 in increments of 0.1, offer options to use increments of 0.2 or 0.33.

@dosubot (bot) added the size:S label (This PR changes 10-29 lines, ignoring generated files) on Apr 29, 2024
@arn-tru requested a review from piotrm0 on April 29, 2024 09:11
@joshreini1 (Contributor) commented

The prompting still needs to be changed in concert with the normalization. Currently (for LLM-based feedback) we prompt the LLM to score 0-10. If we go with fewer levels, that prompting should also change to 0-4, 0-2, etc.
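
For illustration only, a minimal sketch of the coupling described here, using hypothetical helper names rather than code from this PR or trulens_eval: the maximum score in the prompt and the normalization denominator come from the same setting, so the prompted range and the [0, 1] normalization stay in sync.

```python
# Hypothetical sketch, not the PR's actual code: the prompted score range and
# the normalization denominator are both driven by a single max_score setting.

def make_scoring_prompt(criteria: str, max_score: int) -> str:
    """Ask the LLM for an integer score on 0..max_score."""
    return (
        f"{criteria}\n"
        f"Respond with a single integer between 0 and {max_score}, "
        f"where 0 is the worst and {max_score} is the best."
    )

def normalize(raw_score: int, max_score: int) -> float:
    """Map the raw 0..max_score answer onto [0, 1] using the same max_score."""
    clamped = min(max(raw_score, 0), max_score)
    return clamped / max_score

# The resolutions mentioned in this thread: 0-10, 0-4, and 0-2.
for max_score in (10, 4, 2):
    prompt = make_scoring_prompt("Rate the relevance of the answer.", max_score)
    print(max_score, normalize(max_score, max_score))  # top score always maps to 1.0
```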

@dosubot (bot) added the lgtm label (This PR has been approved by a maintainer) on Apr 30, 2024
@piotrm0 (Contributor) commented Apr 30, 2024

> The prompting still needs to be changed in concert with the normalization. Currently (for LLM-based feedback) we prompt the LLM to score 0-10. If we go with fewer levels, that prompting should also change to 0-4, 0-2, etc.

Perhaps the current PR could instead be framed as configuring the score range, then? Normalizing by something less than 10 with the current prompts will produce scores outside of 0-1 (e.g., a raw score of 7 divided by 4 is 1.75), but I'm unsure whether that was the intention.

@piotrm0 (Contributor) left a comment

Figure out the phrasing around this feature: is it about the normalized score range being different from [0, 1], or is it about the resolution of the prompted score (still normalized to [0, 1])? The latter requires the changes Josh mentioned.

@joshreini1 (Contributor) commented

> Figure out the phrasing around this feature: is it about the normalized score range being different from [0, 1], or is it about the resolution of the prompted score (still normalized to [0, 1])? The latter requires the changes Josh mentioned.

It's the latter: offering variable resolution is the goal here. In many cases, clients have observed that LLMs cannot reliably distinguish 11 levels (0-10) of feedback and find that 3-5 levels work better. The goal of this PR is to support that workflow.
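
As a rough illustration (hypothetical helper, not this PR's implementation) of what those coarser resolutions look like once normalized to [0, 1]: 3 levels map to {0, 0.5, 1} and 5 levels to {0, 0.25, 0.5, 0.75, 1}.

```python
# Hypothetical illustration (not code from this PR): the normalized values
# produced by prompting for integers 0..n_levels-1 and dividing by the maximum.

def normalized_levels(n_levels: int) -> list[float]:
    max_score = n_levels - 1
    return [round(raw / max_score, 3) for raw in range(n_levels)]

print(normalized_levels(3))   # [0.0, 0.5, 1.0]
print(normalized_levels(5))   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(normalized_levels(11))  # the current 0-10 behavior
```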
