
[Add Model] Pairwise Preference Model #123

Merged
9 commits merged into allenai:main on May 14, 2024

Conversation

WeiXiongUST (Contributor)

Could you help add the new pairwise preference model RLHFlow/pair-preference-model-LLaMA3-8B?

The usage of the model is similar to PairRM: we input a prompt and two responses, and the model returns the probability of the first response being preferred. I have implemented a pipeline in rewardbench/models/pairpm.py and attached an example of how to use the model for your reference. I am wondering how we should merge such a customized model into RewardBench. Many thanks in advance!
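For reference, here is a minimal usage sketch. The module path and class name come from this PR, but the constructor and call signature below are illustrative assumptions rather than the exact interface of the attached example.

```python
# Hypothetical usage sketch: the module path and class name are from this PR,
# but the constructor and call signature are assumptions for illustration only.
from rewardbench.models.pairpm import PairPMPipeline

prompt = "What is the capital of France?"
response_a = "The capital of France is Paris."
response_b = "The capital of France is Berlin."

# Assumed constructor: takes the Hugging Face model path.
pipe = PairPMPipeline("RLHFlow/pair-preference-model-LLaMA3-8B")

# Assumed to return P(response_a is preferred over response_b | prompt), in [0, 1].
prob_a_preferred = pipe(prompt, response_a, response_b)
print(prob_a_preferred)
```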

The benchmark results are as follows.

| category | subset | accuracy | n |
| --- | --- | --- | --- |
| chat | alpacaeval-easy | 0.980000 | 100 |
| chat | alpacaeval-length | 0.978947 | 95 |
| chat | alpacaeval-hard | 0.989474 | 95 |
| chat | mt-bench-easy | 1.000000 | 28 |
| chat | mt-bench-med | 1.000000 | 40 |
| chat-hard | mt-bench-hard | 0.756757 | 37 |
| chat-hard | llmbar-natural | 0.900000 | 100 |
| chat-hard | llmbar-adver-neighbor | 0.522388 | 134 |
| chat-hard | llmbar-adver-GPTInst | 0.619565 | 92 |
| chat-hard | llmbar-adver-GPTOut | 0.680851 | 47 |
| chat-hard | llmbar-adver-manual | 0.500000 | 46 |
| safety | refusals-dangerous | 0.930000 | 100 |
| safety | refusals-offensive | 0.970000 | 100 |
| safety | xstest-should-refuse | 0.954545 | 154 |
| safety | xstest-should-respond | 0.968000 | 250 |
| safety | donotanswer | 0.625000 | 136 |
| reasoning | math-prm | 0.948546 | 447 |
| reasoning | hep-cpp | 0.939024 | 164 |
| reasoning | hep-go | 0.945122 | 164 |
| reasoning | hep-java | 0.975610 | 164 |
| reasoning | hep-js | 0.951220 | 164 |
| reasoning | hep-python | 0.975610 | 164 |
| reasoning | hep-rust | 0.914634 | 164 |

I created a PairPMPipeline class to use the pair preference model and included an example of how to use it.
natolambert (Collaborator) left a comment

@WeiXiongUST how much of this can be merged with the existing code for PairRM, or at least put in the same file? https://github.com/allenai/reward-bench/blob/main/rewardbench/models/pairrm.py

Otherwise LGTM (style is pending)

The training and use of the model are similar to those of the SLiC paper, SLiC-HF: Sequence Likelihood Calibration with Human Feedback.
WeiXiongUST (Contributor, Author)

While this preference model is also used for pairwise comparison, its training and usage are quite different from PairRM. I think we can refer to it as the SlicPairPM, as it is most similar to SLiC-HF: Sequence Likelihood Calibration with Human Feedback.

natolambert (Collaborator) left a comment

Minor changes to make sure the scripts work. Sorry they're not documented better; that will come soon!


class SlicPairPMPipeline:

    def __init__(self, model_path):
Collaborator

Also, this needs to be modified to match the loading in the scripts.
See

reward_pipe = pipeline_builder(

Mostly, it needs to take in the args; if they are not all used, that's also fine.

Contributor (Author)

Modified accordingly. But since the pipeline needs an additional tokenizer to prepare the pair (x, a1, a2) as input, I currently load one with:

self.tokenizer_data_format = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", use_fast=True)
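For context, here is a rough sketch of what the revised constructor could look like, assuming the run script's builder hands over task, model, and tokenizer (per the commit message below) and that unused arguments are simply stored; only the extra data-format tokenizer line is taken directly from this thread, the rest is illustrative.

```python
from transformers import AutoTokenizer


class SlicPairPMPipeline:
    # Sketch only: argument names follow the "task, model, and tokenizer" commit
    # message; the body is illustrative, not the exact merged implementation.
    def __init__(self, task, model, tokenizer):
        self.task = task            # kept even if unused, per the review comment
        self.model = model          # model object loaded by the run script
        self.tokenizer = tokenizer  # tokenizer loaded by the run script
        # Additional tokenizer used only to format the (x, a1, a2) pair with the
        # Llama-3-Instruct chat template before scoring.
        self.tokenizer_data_format = AutoTokenizer.from_pretrained(
            "meta-llama/Meta-Llama-3-8B-Instruct", use_fast=True
        )
```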

rewardbench/models/__init__.py (outdated review comment, resolved)
WeiXiongUST and others added 3 commits on May 12, 2024 at 10:10:
Co-authored-by: Nathan Lambert <nathanl@allenai.org>
We now use task, model, and tokenizer to init the pipeline.
natolambert (Collaborator)

@WeiXiongUST just need to run the following (I think)

make style
make quality

WeiXiongUST (Contributor, Author)

> @WeiXiongUST just need to run the following (I think)
>
> make style
> make quality
Have tested these two commands locally!

natolambert (Collaborator) left a comment

Great! @WeiXiongUST, send me the scores and I'll upload them, or I'll run it soon.

natolambert merged commit ad38d67 into allenai:main on May 14, 2024.
3 checks passed