
Guidance Request for Reproducing OpenbookQA Dataset Results #49

Open
FairyFali opened this issue Nov 17, 2023 · 1 comment

@FairyFali

Hi,

I am having trouble reproducing the experimental results on the OpenbookQA dataset. The model's output format is unexpected: for instance, I get responses like "1 is correct. 2 is incorrect. 3 is incorrect. 4 is incorrect.", whereas the expected format is "answer1". Could you please provide the exact commands or instructions for both fine-tuning and evaluating the model, so that the OpenbookQA results can be reproduced accurately?
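For context, scorers for these multiple-choice sets typically look for a literal option tag in the generated text. The sketch below is hypothetical extraction logic (not the repository's actual code) and only illustrates why a free-form response like the one quoted above would be counted as a miss:

```python
import re
from typing import Optional

def extract_answer(response: str) -> Optional[str]:
    """Hypothetical extractor: look for a literal option tag such as 'answer1'."""
    match = re.search(r"answer[1-4]", response.lower())
    return match.group(0) if match else None

print(extract_answer("the correct answer is answer1"))
# -> "answer1" (matches the expected format)
print(extract_answer("1 is correct. 2 is incorrect. 3 is incorrect. 4 is incorrect."))
# -> None (free-form output, scored as wrong)
```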

@HZQ950419
Collaborator

Hi,

I have replied to your email in case you haven't seen it.
You can refer to issue #38; we use commonsense_evaluate.py for evaluation. Also, please use a single GPU for training: multi-GPU training may not reproduce the results, and we are still trying to figure out why.

If you have further questions, please let us know!
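Not an official set of commands, but a minimal single-GPU evaluation launch sketch consistent with the reply above. Only the script name comes from the thread; every flag, model name, and path below is an assumption to be adapted to the repository's README and the script's `--help`. Pinning CUDA_VISIBLE_DEVICES to one device is the standard way to force single-GPU execution without modifying the training or evaluation code.

```python
import os
import subprocess

# Pin the process to a single GPU, as advised in the reply above.
env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")

subprocess.run(
    [
        "python", "commonsense_evaluate.py",             # script name taken from the thread
        "--dataset", "openbookqa",                       # assumed dataset flag/value
        "--base_model", "yahma/llama-7b-hf",             # assumed base model id
        "--lora_weights", "./trained_models/llama-lora", # assumed path to the fine-tuned adapter
    ],
    env=env,
    check=True,
)
```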
