
data split #4

Open
xiaoyuxin1002 opened this issue Jul 24, 2023 · 1 comment

Comments

@xiaoyuxin1002

Hi, after reading your code, could I confirm the following about how you split your dataset?

  1. 5 samples are drawn from prompt_gen_data to induce the instruction from the open-source LLM.
  2. 20 samples are drawn from eval_data to evaluate the quality of the induced instruction during the BO iterations.
  3. 100 samples are drawn from test_data to evaluate the quality of the proposed instruction after the BO iterations.

Thanks!
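If my reading is right, the split could be sketched as below (a minimal illustration of my understanding, not your actual code; the function name `split_dataset`, the shuffle-then-slice order, and the seed are my assumptions):

```python
import random

def split_dataset(examples, seed=0):
    """Partition examples into the three disjoint subsets described above:
    5 for instruction induction, 20 for BO-time evaluation, 100 for final test."""
    rng = random.Random(seed)
    shuffled = examples[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    prompt_gen_data = shuffled[:5]       # induce the instruction from the LLM
    eval_data = shuffled[5:25]           # score candidate instructions during BO
    test_data = shuffled[25:125]         # evaluate the final instruction after BO
    return prompt_gen_data, eval_data, test_data

# usage
examples = [f"example_{i}" for i in range(200)]
gen, ev, test = split_dataset(examples)
print(len(gen), len(ev), len(test))  # 5 20 100
```

Please correct me if the subsets are instead drawn from separate files rather than one shuffled pool.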
@xiaoyuxin1002
Author

Hi, it also seems that your data/instruction_induction_raw/induce folder contains 33 datasets, while your data/instruction_induction_raw/execute folder contains only 31.
In addition, Table 2 in your paper doesn't include the tuned prompts for word_in_context, but has two rows that both show the prompts for negation.
Could you please fix these? Thanks!
