
Use generation_config.json #646

Merged
merged 9 commits into main from gen_conf on Mar 28, 2024

Conversation

haqishen (Collaborator) commented Mar 26, 2024

close #566
close #568

Check1

train->valid: output1
train->valid->Wave UI chat: output1.1
pull from HF->valid: output2
pull from HF->pipeline generation: output2.1
pull from HF->Wave UI chat: output2.2

With do_sample=False, I get output1.1 == output2 == output2.1 == output2.2, but these differ slightly from output1.

Check2

from transformers import pipeline

generate_text = pipeline(
    model="haqishen/genconf-test",
    torch_dtype="auto",
    trust_remote_code=True,
    use_fast=True,
    device_map={"": "cuda:2"},
    token=True,
)

res = generate_text(
    input_text,
    # min_new_tokens=2,
    # max_new_tokens=256,
    # do_sample=False,
    # num_beams=1,
    # temperature=float(0.0),
    # repetition_penalty=float(1.0),
    # renormalize_logits=True
)
print(res[0]["generated_text"])

In this code segment from the model card, the output remains the same whether these parameters are commented out or not.

Check3

generate_text.model.generation_config outputs:

GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2,
  "max_new_tokens": 256,
  "max_time": 120.0,
  "min_new_tokens": 2,
  "pad_token_id": 0,
  "temperature": null,
  "top_k": null,
  "top_p": null
}

This shows that generation_config.json is loaded automatically when the pipeline is created.
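
For reference, the same defaults can also be inspected without building a pipeline by loading the config straight from the Hub (a minimal sketch using the test repo above):

from transformers import GenerationConfig

# loads generation_config.json from the model repo on the Hub
gen_config = GenerationConfig.from_pretrained("haqishen/genconf-test")
print(gen_config)  # should match the pipeline's generation_config shown above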

Check4

Both DDP and DeepSpeed work well with generation_config.json.

pascal-pfeiffer (Collaborator) left a comment

Thank you @haqishen
Please see my in-code comments.

Please also adapt the model cards to reflect the use of the generation config, so that it is clear that the base settings come from the model (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model).

probably something like this:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("h2oai/h2o-danube-1.8b-chat")
model = AutoModelForCausalLM.from_pretrained("h2oai/h2o-danube-1.8b-chat")

print(model.generation_config)
# ... base settings output here

# change default parameters here:
model.generation_config.temperature = 0.8

print(model.generation_config)
# ... changed output here

inputs = tokenizer("Why is water good for you?", return_tensors="pt")
outputs = model.generate(**inputs)
# ...
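
For completeness, decoding the generated tokens could look like this (just a sketch to round out the example; not part of the proposed model card snippet):

print(tokenizer.decode(outputs[0], skip_special_tokens=True))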

Furthermore, this PR lays the groundwork for a larger change that brings the chat template into generation. #551
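
Purely as an illustration of what bringing the chat template into generation could look like on the user side (not part of this PR; the actual design is tracked in #551):

# hypothetical sketch, reusing tokenizer and model from the example above
messages = [{"role": "user", "content": "Why is water good for you?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs)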

Comment on lines 171 to 178
# push generation_config to hub
if cfg.problem_type not in NON_GENERATION_PROBLEM_TYPES:
    model.backbone.generation_config.push_to_hub(
        repo_id=repo_id,
        private=True,
        commit_message="Upload generation_config.json",
    )

pascal-pfeiffer (Collaborator):

generation_config.json is pushed with the model automatically, so this is redundant and will result in a commit with 0 file changes.
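
As a minimal sketch of why the extra push is redundant (assuming a standard transformers backbone; the local path is only an example): saving or pushing the model already writes the generation config next to the weights.

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("h2oai/h2o-danube-1.8b-chat")
# save_pretrained (and push_to_hub) write generation_config.json
# alongside config.json and the model weights
model.save_pretrained("./exported_model")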

Comment on lines 846 to 853
backbone.generation_config.temperature = (
    cfg.prediction.temperature if cfg.prediction.do_sample else None
)
backbone.generation_config.top_k = (
    cfg.prediction.top_k if cfg.prediction.do_sample else None
)
backbone.generation_config.top_p = (
    cfg.prediction.top_p if cfg.prediction.do_sample else None
)

pascal-pfeiffer (Collaborator):

let's not set these at all when do_sample is False
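
One possible shape for that suggestion (just a sketch, not the actual diff in this PR):

# only touch the sampling-related settings when sampling is enabled
if cfg.prediction.do_sample:
    backbone.generation_config.temperature = cfg.prediction.temperature
    backbone.generation_config.top_k = cfg.prediction.top_k
    backbone.generation_config.top_p = cfg.prediction.top_p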

pascal-pfeiffer (Collaborator) left a comment

lgtm, thanks for the quick changes

edit: actually, there are more model cards that may need to be touched. But let's merge and address that with the chat template change.

haqishen merged commit c732bad into main on Mar 28, 2024
5 checks passed
haqishen deleted the gen_conf branch on March 28, 2024 10:45

Successfully merging this pull request may close these issues:

[FEATURE] Add max_time generate setting
[FEATURE] Push generation_config to HF