
[Add Model] Pairwise Preference Model #123

Merged
9 commits merged into allenai:main on May 14, 2024

Conversation

WeiXiongUST (Contributor)

Could you help add the new pairwise preference model RLHFlow/pair-preference-model-LLaMA3-8B?

The usage of the model is similar to PairRM: we input a prompt and two responses, and the model returns the probability of the first response being preferred. I have implemented a pipeline in rewardbench/models/pairpm.py and attached an example of how to use the model for your reference. I am wondering how we should merge such a customized model into RewardBench. Many thanks in advance!
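For reference, here is a minimal usage sketch. The module path and class name come from this PR, but the constructor and call signature below are illustrative assumptions rather than the exact interface of the attached example.

```python
# Hypothetical usage sketch: the module path and class name are from this PR,
# but the constructor and call signature are assumptions for illustration only.
from rewardbench.models.pairpm import PairPMPipeline

prompt = "What is the capital of France?"
response_a = "The capital of France is Paris."
response_b = "The capital of France is Berlin."

# Assumed constructor: takes the Hugging Face model path.
pipe = PairPMPipeline("RLHFlow/pair-preference-model-LLaMA3-8B")

# Assumed to return P(response_a is preferred over response_b | prompt), in [0, 1].
prob_a_preferred = pipe(prompt, response_a, response_b)
print(prob_a_preferred)
```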

The benchmark results are as follows.

| category | subset | accuracy | n |
| --- | --- | --- | --- |
| chat | alpacaeval-easy | 0.980000 | 100 |
| chat | alpacaeval-length | 0.978947 | 95 |
| chat | alpacaeval-hard | 0.989474 | 95 |
| chat | mt-bench-easy | 1.000000 | 28 |
| chat | mt-bench-med | 1.000000 | 40 |
| chat-hard | mt-bench-hard | 0.756757 | 37 |
| chat-hard | llmbar-natural | 0.900000 | 100 |
| chat-hard | llmbar-adver-neighbor | 0.522388 | 134 |
| chat-hard | llmbar-adver-GPTInst | 0.619565 | 92 |
| chat-hard | llmbar-adver-GPTOut | 0.680851 | 47 |
| chat-hard | llmbar-adver-manual | 0.500000 | 46 |
| safety | refusals-dangerous | 0.930000 | 100 |
| safety | refusals-offensive | 0.970000 | 100 |
| safety | xstest-should-refuse | 0.954545 | 154 |
| safety | xstest-should-respond | 0.968000 | 250 |
| safety | donotanswer | 0.625000 | 136 |
| reasoning | math-prm | 0.948546 | 447 |
| reasoning | hep-cpp | 0.939024 | 164 |
| reasoning | hep-go | 0.945122 | 164 |
| reasoning | hep-java | 0.975610 | 164 |
| reasoning | hep-js | 0.951220 | 164 |
| reasoning | hep-python | 0.975610 | 164 |
| reasoning | hep-rust | 0.914634 | 164 |

I created a PairPMPipeline class to use the pair preference model and included an example of how to use it.
natolambert (Collaborator) left a comment

@WeiXiongUST how much of this can be merged with the existing code for PairRM, or at least put in the same file? https://github.com/allenai/reward-bench/blob/main/rewardbench/models/pairrm.py

Otherwise LGTM (style is pending)

The training and use of the model are similar to those of the SLiC paper, SLiC-HF: Sequence Likelihood Calibration with Human Feedback.
WeiXiongUST (Contributor, Author)

While this preference model is also used for pairwise comparison, its training and usage are quite different from PairRM. I think we can refer to it as the SlicPairPM, as it is most similar to SLiC-HF: Sequence Likelihood Calibration with Human Feedback.

natolambert (Collaborator) left a comment

Minor changes to make sure the scripts work. Sorry they're not documented better; that will come soon!


class SlicPairPMPipeline:

    def __init__(self, model_path):
Collaborator

Also, this needs to be modified to match the loading in the scripts.
See

reward_pipe = pipeline_builder(

Mostly, it needs to take in the args; if they are not all used, that's also fine.

Contributor (Author)

Modified accordingly. But since the pipeline needs an additional tokenizer to prepare the pair (x, a1, a2) as input, I currently load one with:

self.tokenizer_data_format = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", use_fast=True)
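For context, here is a rough sketch of what the revised constructor could look like, assuming the run script's builder hands over task, model, and tokenizer (per the commit message below) and that unused arguments are simply stored; only the extra data-format tokenizer line is taken directly from this thread, the rest is illustrative.

```python
from transformers import AutoTokenizer


class SlicPairPMPipeline:
    # Sketch only: argument names follow the "task, model, and tokenizer" commit
    # message; the body is illustrative, not the exact merged implementation.
    def __init__(self, task, model, tokenizer):
        self.task = task            # kept even if unused, per the review comment
        self.model = model          # model object loaded by the run script
        self.tokenizer = tokenizer  # tokenizer loaded by the run script
        # Additional tokenizer used only to format the (x, a1, a2) pair with the
        # Llama-3-Instruct chat template before scoring.
        self.tokenizer_data_format = AutoTokenizer.from_pretrained(
            "meta-llama/Meta-Llama-3-8B-Instruct", use_fast=True
        )
```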

rewardbench/models/__init__.py (outdated review comment, resolved)
WeiXiongUST and others added 3 commits on May 12, 2024 at 10:10:
Co-authored-by: Nathan Lambert <nathanl@allenai.org>
We now use task, model, and tokenizer to init the pipeline.
natolambert (Collaborator)

@WeiXiongUST just need to run the following (I think)

make style
make quality

WeiXiongUST (Contributor, Author)

> @WeiXiongUST just need to run the following (I think)
>
> make style
> make quality
Have tested these two commands locally!

natolambert (Collaborator) left a comment

Great! @WeiXiongUST, send me the scores and I'll upload them, or I'll run it soon.

natolambert merged commit ad38d67 into allenai:main on May 14, 2024.
3 checks passed