Use generation_config.json #646
Conversation
Thank you @haqishen
Please see my in-code comments.
Please also adapt the model cards to reflect the usage of the generation config, so that it is clear that the base settings come from the model (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model).
probably something like this:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("h2oai/h2o-danube-1.8b-chat")
model = AutoModelForCausalLM.from_pretrained("h2oai/h2o-danube-1.8b-chat")

print(model.generation_config)
# ... base settings output here

# change default parameters here:
model.generation_config.temperature = 0.8
print(model.generation_config)
# ... changed output here

inputs = tokenizer("Why is water good for you?", return_tensors="pt")
outputs = model.generate(**inputs)
# ...
```
Furthermore, this PR sets the base for a larger change that integrates the chat template into generation. #551
```python
# push generation_config to hub
if cfg.problem_type not in NON_GENERATION_PROBLEM_TYPES:
    model.backbone.generation_config.push_to_hub(
        repo_id=repo_id,
        private=True,
        commit_message="Upload generation_config.json",
    )
```
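As a quick sanity check (a sketch, assuming for illustration that the repo pushed to is `h2oai/h2o-danube-1.8b-chat`, the model used elsewhere in this thread), the uploaded file can be fetched back as a standalone `GenerationConfig`:

```python
from transformers import GenerationConfig

# Fetch only the generation config from the Hub repo and inspect it.
generation_config = GenerationConfig.from_pretrained("h2oai/h2o-danube-1.8b-chat")
print(generation_config)
```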
```python
backbone.generation_config.temperature = (
    cfg.prediction.temperature if cfg.prediction.do_sample else None
)
backbone.generation_config.top_k = (
    cfg.prediction.top_k if cfg.prediction.do_sample else None
)
backbone.generation_config.top_p = (
    cfg.prediction.top_p if cfg.prediction.do_sample else None
)
```
let's not set these at all when `do_sample` is False
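A minimal sketch of that suggestion, assuming the same `cfg` and `backbone` names as the diff above (not the exact patch):

```python
# Only touch the sampling parameters when sampling is actually enabled;
# otherwise leave the generation_config defaults untouched.
if cfg.prediction.do_sample:
    backbone.generation_config.temperature = cfg.prediction.temperature
    backbone.generation_config.top_k = cfg.prediction.top_k
    backbone.generation_config.top_p = cfg.prediction.top_p
```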
lgtm, thanks for the quick changes
edit: actually, there are more model cards that may need to be touched. But let's merge and address that with the chat template change.
close #566
close #568
Check1
- train->valid: output1
- train->valid->Wave UI chat: output1.1
- pull from HF->valid: output2
- pull from HF->pipeline generation: output2.1
- pull from HF->Wave UI chat: output2.2
I have output1.1 == output2 == output2.1 == output2.2, but they differ slightly from output1 when `do_sample=False`.
Check2
In this segment of code from the model card, the output remains the same whether these parameters are commented out or not.
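The referenced segment is not reproduced here; as a hypothetical illustration, it would be a generate call of roughly this shape, where the commented-out kwargs (and their values) are placeholders matching what generation_config.json already provides:

```python
# Hypothetical model-card snippet: since the defaults now come from
# generation_config.json, passing these values explicitly or commenting
# them out yields the same output.
outputs = model.generate(
    **inputs,
    # temperature=0.8,
    # top_k=50,
    # top_p=0.95,
)
```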
Check3
`generate_text.model.generation_config` outputs the expected settings, showing that generation_config.json is automatically loaded when creating the pipeline.
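A transformers-level equivalent of this check (a sketch, again assuming the `h2oai/h2o-danube-1.8b-chat` repo for illustration):

```python
from transformers import pipeline

# The pipeline's underlying model loads generation_config.json from the repo
# automatically, so its defaults are visible without any extra wiring.
pipe = pipeline("text-generation", model="h2oai/h2o-danube-1.8b-chat")
print(pipe.model.generation_config)
```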
Check4
Both DDP and DeepSpeed work well with generation_config.json.