```
Traceback (most recent call last):
  File "/n/holylabs/LABS/doshi-velez_lab/Users/skrishna/w2s/self_loop_llm/src/olma.py", line 307, in <module>
    api_loop_call(args, start_prompts, prefix_prompts[args.data_name][args.prefix], self_correct_prompt, get_test_data(args.data_name, dataset), few_shot_prompt)
  File "/n/holylabs/LABS/doshi-velez_lab/Users/skrishna/w2s/self_loop_llm/src/olma.py", line 181, in api_loop_call
    response = get_llm_prediction_with_logits(prompt, temperature = args.temperature, large_model=args.llm)
  File "/n/holylabs/LABS/doshi-velez_lab/Users/skrishna/w2s/self_loop_llm/src/olma.py", line 88, in get_llm_prediction_with_logits
    transition_scores = olmo.compute_transition_scores(
  File "/n/home02/skrishna/.conda/envs/pt2.1.0_cuda12.1/lib/python3.10/site-packages/transformers/generation/utils.py", line 1235, in compute_transition_scores
    scores = scores.reshape(-1, self.config.vocab_size, scores.shape[-1])
RuntimeError: shape '[-1, 50280, 10]' is invalid for input of size 503040
```
Here is the weird part: output.scores[0] should have shape [1, vocab_size], where vocab_size = 50280 for OLMo, but output.scores[0] actually has shape [1, 50304]. Why doesn't this match the vocab_size? Also, the values in output.scores are mostly -inf.
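The shape error itself is just arithmetic: generation produced 10 score tensors of width 50304 (10 × 50304 = 503040 values), but `compute_transition_scores` reshapes them using `config.vocab_size = 50280`, which does not divide 503040. A minimal NumPy reproduction of the mismatch (sizes taken from the traceback above, arrays as stand-ins for the real score tensors):

```python
import numpy as np

TRUE_VOCAB = 50280    # config.vocab_size reported for OLMo
PADDED_VOCAB = 50304  # actual width of output.scores[0]
NUM_TOKENS = 10       # generated tokens in this run

# One row of scores per generated token, as generate() returned them.
scores = np.zeros((NUM_TOKENS, PADDED_VOCAB))
flat = scores.reshape(-1)
print(flat.size)  # 503040, the "input of size" from the error message

# compute_transition_scores attempts this reshape and fails,
# because 503040 is not a multiple of 50280 * 10:
try:
    flat.reshape(-1, TRUE_VOCAB, NUM_TOKENS)
except ValueError as e:
    print("reshape failed:", e)

# The same reshape succeeds with the padded width:
ok = flat.reshape(-1, PADDED_VOCAB, NUM_TOKENS)
print(ok.shape)  # (1, 50304, 10)
```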
Versions
Python 3.10.13
@y12uc231 I have not attempted to reproduce the issue. However, the OLMo paper mentions that they padded the word-embedding dimension from the true vocabulary size of 50280 up to 50304 so that it would be a multiple of 128 (for computational efficiency), which would explain the shape discrepancy you are seeing.
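The mostly -inf values are expected when sampling: with `output_scores=True`, `generate()` returns the processed logits, and samplers such as top-k/top-p set filtered-out tokens to -inf. If the padding is indeed the cause of the shape error, one possible workaround (an untested sketch, assuming `output.scores` is a tuple of `[batch, 50304]` tensors from `generate(..., return_dict_in_generate=True, output_scores=True)`) is to slice each per-step score tensor back down to the true vocabulary size before calling `compute_transition_scores`:

```python
import numpy as np

TRUE_VOCAB = 50280  # olmo.config.vocab_size

def trim_scores(scores, vocab_size=TRUE_VOCAB):
    """Drop the padding columns (here 50304 -> 50280) from each per-step
    score tensor. The slicing syntax is identical for torch tensors."""
    return tuple(s[:, :vocab_size] for s in scores)

# With the real objects from the issue's snippet this would look like:
#   transition_scores = olmo.compute_transition_scores(
#       output.sequences, trim_scores(output.scores), normalize_logits=True)

# Demo with dummy arrays shaped like the tensors in the traceback:
dummy = tuple(np.zeros((1, 50304)) for _ in range(10))
trimmed = trim_scores(dummy)
print(trimmed[0].shape)  # (1, 50280)
```

Note the trade-off: any probability mass the model assigned to the 24 padding columns is discarded, but those ids are never valid tokens, so the per-token log-probs of the generated sequence are unaffected.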
🐛 Describe the bug
Here is the code I am running. The goal is to get the log-probability of each token generated by the chat model.
Here is the error when I run the code above.