Dataset v2 discussion & feedback #88

natolambert · 2024-03-26T15:02:21Z

Hey! Post any questions or complaints about the dataset here. We'll also log our internal goals and limitations in this thread.

Rishabh Agarwal pointed out that the PRM Math subset has two structural issues: 1) we added newlines to the human reference answers (debatably a bug), and 2) the GPT-4 answer is always the rejected one, so some models may be biased on this subset.

natolambert · 2024-04-03T00:46:37Z

Idea: now that we have a bunch of RMs, we can look for datapoints where every model disagrees with our label, then double-check those labels for future releases.
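A rough sketch of what that check could look like, assuming we already have per-datapoint correctness flags for each RM (True when the model scored the chosen answer above the rejected one). The function name `unanimous_misses` and the `results` dict are hypothetical and only for illustration; they are not taken from the repo.

```python
from typing import Dict, List


def unanimous_misses(correct_by_model: Dict[str, List[bool]]) -> List[int]:
    """Return indices of datapoints that every model scored incorrectly."""
    if not correct_by_model:
        return []
    lengths = {len(flags) for flags in correct_by_model.values()}
    if len(lengths) != 1:
        raise ValueError("all models must be scored on the same datapoints")
    n = lengths.pop()
    # A datapoint is a unanimous miss when no model got it right.
    return [
        i
        for i in range(n)
        if not any(flags[i] for flags in correct_by_model.values())
    ]


# Hypothetical results: True means the RM preferred the chosen answer.
results = {
    "rm_a": [True, False, False, True],
    "rm_b": [True, False, True, False],
    "rm_c": [False, False, True, True],
}
print(unanimous_misses(results))  # [1]
```

The returned indices are only candidates for manual review. A unanimous miss may point to a bad label, but it may also just be a hard example that all current RMs fail on.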

natolambert added the "question" label Mar 26, 2024

natolambert pinned this issue Mar 26, 2024
