Constitutional AI: Harmlessness from AI Feedback #154
base: main
Conversation
Signed-off-by: Gal Leibovich <gleibovich@nvidia.com>
+ replace 'ultrachat' with 'sft_datablend_v1' Signed-off-by: shami nisimov <snisimov@nvidia.com>
Signed-off-by: Gal Leibovich <gleibovich@nvidia.com>
Signed-off-by: Gal Leibovich <gleibovich@nvidia.com>
Signed-off-by: Gal Leibovich <gleibovich@nvidia.com>
Signed-off-by: shami nisimov <snisimov@nvidia.com>
for more information, see https://pre-commit.ci
Signed-off-by: shami nisimov <snisimov@nvidia.com>
for more information, see https://pre-commit.ci
Signed-off-by: Gal Leibovich <gleibovich@nvidia.com>
Signed-off-by: snisimov <snisimov@nvidia.com>
I completed the copyedits to the .rst files identified in Constitutional AI: Harmlessness from AI Feedback #154 and submitted them for your review. Please let me know if you have any questions.
@gleibovich-nvidia Please review @jgerh's copyedits.
Signed-off-by: Gal Leibovich <gleibovich@nvidia.com>
Thanks a lot!
Signed-off-by: shami nisimov <snisimov@nvidia.com>
…pt from name) Signed-off-by: shami nisimov <snisimov@nvidia.com>
for more information, see https://pre-commit.ci
…_inference.py' Signed-off-by: shami nisimov <snisimov@nvidia.com>
@terrykong FYI, we found an issue with the converted tokenizer used during inference: it does not tokenize a conversation correctly. To resolve this, commits 65a9c3b and 2286522 add a lightweight inference service that overrides the tokenizer to use the HF tokenizer instead, which tokenizes a conversation correctly.
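To illustrate the kind of mismatch described above, here is a toy sketch. The two tokenizers and the special-token table are invented for illustration (they are not the actual NeMo or HF tokenizers); the point is that a converted tokenizer which loses its special-token handling will shatter markers like '<s>' into ordinary characters:

```python
# Hypothetical illustration: why losing the special-token table during
# checkpoint conversion can break conversation prompts.
import re

SPECIAL_TOKENS = {"<s>", "</s>"}

def tokenize_with_specials(text):
    """Tokenizer that treats '<s>' / '</s>' as atomic special tokens."""
    tokens = []
    for piece in re.split(r"(<s>|</s>)", text):
        if piece in SPECIAL_TOKENS:
            tokens.append(piece)          # kept as a single token
        else:
            tokens.extend(piece.split())  # toy word-level tokenization
    return tokens

def tokenize_without_specials(text):
    """Tokenizer whose special-token table was lost: '<s>' is treated
    as ordinary characters and split apart."""
    return re.findall(r"<|>|/|\w+|\S", text)

prompt = "<s>[INST] Hello [/INST]"
print(tokenize_with_specials(prompt))     # '<s>' survives as one token
print(tokenize_without_specials(prompt))  # '<s>' shatters into '<', 's', '>'
```

The downstream model then sees a different token stream for the same conversation, which is consistent with the behavior reported above.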
…the inference service Signed-off-by: shami nisimov <snisimov@nvidia.com>
Signed-off-by: Gal Leibovich <gleibovich@nvidia.com>
Is my understanding correct that the tokenizer within the megatron checkpoint does not tokenize the BOS as you expect? Why not change the prompt template to use the BOS token that the tokenizer expects? I feel like maintaining a light fork of megatron_gpt_eval.py should be a last resort.
@gleibovich-nvidia Gal, I reviewed your comments, responded to your questions, and completed my review.
Yes, the BOS for Mistral models is '<s>', and the converted tokenizer does not tokenize it as expected. We prompt a trained instruct model, so the prompt must match the format it was trained with. While I do agree it is not ideal to maintain code adapted from NeMo, I would like to stress that:
There might be another way to work around this issue that I can try on Sunday morning. Let's see how that goes.
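For reference, the point about matching the trained instruct format can be sketched as follows. This is based on the published Mistral-Instruct chat format (a single '<s>' BOS at the start, '[INST] … [/INST]' around user turns, '</s>' closing assistant turns); exact whitespace details may vary across tokenizer versions:

```python
# Sketch of the Mistral-Instruct prompt layout. If the tokenizer or template
# emits BOS differently, the token stream the model sees changes.
BOS, EOS = "<s>", "</s>"

def build_mistral_prompt(turns):
    """turns: list of (user, assistant) pairs; the last assistant entry may
    be None when prompting the model for a new completion."""
    prompt = BOS  # BOS appears exactly once, at the start of the conversation
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}{EOS}"
    return prompt

print(build_mistral_prompt([("Hi", "Hello!"), ("How are you?", None)]))
# <s>[INST] Hi [/INST] Hello!</s>[INST] How are you? [/INST]
```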
Thanks a lot @jgerh!
Signed-off-by: Gal Leibovich <gleibovich@nvidia.com>
Completed reviews and updates from Gal Leibovich.
I feel like I'm missing something, because shouldn't convert_mistral_7b_hf_to_nemo.py add the tokenizer that the Hugging Face checkpoint was trained with? If for some reason you need a different one, I feel like the functionality should just be added to that conversion script.
It's true that one script may not be comprehensive, but this example seems to just be about a tokenizer being different, so I don't feel that's a big enough reason to fork. Tokenizers are usually part of the package deal for models, since swapping tokenizers can result in garbage when the token ids get remapped, so I prefer to address this in the model packaging/conversion rather than at inference time, where swapping the tokenizer is not so much a feature as an avenue for user error.
Sure, but those are not examples to follow; they are exceptions. Also, the second one even says it should be deleted. In general, forking puts the burden on us to maintain the functionality and correctness. I would rather we focus on the algorithms here and push for NeMo/Megatron to incorporate our features when possible. If we ask and they say "no" or "not anytime soon", then I see that as a good reason to fork.
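The remapped-token-id failure mode mentioned above can be shown with a toy example. The two vocabularies here are invented for illustration, not real model tokenizers:

```python
# Toy demonstration of why swapping tokenizers at inference time is risky:
# the same token ids decode to different text once the id->token mapping
# changes between the training-time and swapped-in vocabularies.
train_vocab = {0: "<s>", 1: "Hello", 2: "world", 3: "</s>"}
swapped_vocab = {0: "</s>", 1: "world", 2: "Hello", 3: "<s>"}  # ids remapped

ids = [0, 1, 2, 3]  # ids produced under the training-time vocabulary

def decode(vocab, ids):
    return " ".join(vocab[i] for i in ids)

print(decode(train_vocab, ids))    # <s> Hello world </s>
print(decode(swapped_vocab, ids))  # </s> world Hello <s>  -> garbage
```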
Signed-off-by: snisimov <snisimov@nvidia.com>
…ate_sl_cai_dataset.py') Signed-off-by: snisimov <snisimov@nvidia.com>
…ate_rl_cai_dataset.py) Signed-off-by: snisimov <snisimov@nvidia.com>
…same index to both 'chosen' and 'rejected' responses) Signed-off-by: snisimov <snisimov@nvidia.com>
Draft for the implementation of Constitutional AI: Harmlessness from AI Feedback.
Note that there are still some minor missing pieces regarding the exact model and datasets to be used, but the main building blocks are in place.
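For context, the supervised (SL-CAI) data-generation stage that scripts like 'generate_sl_cai_dataset.py' implement can be sketched at a high level as follows. The model call is stubbed and the principle texts are placeholders, not the repository's actual constitution; this follows the critique-then-revise loop from Bai et al. (2022):

```python
# High-level sketch of SL-CAI data generation: sample a response, then
# repeatedly critique and revise it against a randomly drawn constitutional
# principle; the final revision becomes the SFT target for the prompt.
import random

def generate_sl_cai_example(prompt, generate, constitution, n_revisions=1):
    response = generate(prompt)
    for _ in range(n_revisions):
        principle = random.choice(constitution)
        critique = generate(f"{principle['critique']}\n\n{prompt}\n{response}")
        response = generate(f"{principle['revision']}\n\n{critique}")
    return {"prompt": prompt, "response": response}

# Stub model for demonstration: echoes a tag so the data flow is visible.
def stub(text):
    return f"[model output for: {text.splitlines()[0][:30]}]"

constitution = [{"critique": "Identify harmful content.",
                 "revision": "Rewrite the reply to remove it."}]
example = generate_sl_cai_example("How do I stay safe online?", stub, constitution)
print(example)
```

In the actual PR the generate call would be served by the NeMo inference service discussed above, and a separate RL-stage script builds preference pairs from model-labeled 'chosen' and 'rejected' responses.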