Mamba: use_cache is not passed through in prepare_inputs_for_generation #30849

Open · uwu-420 opened this issue May 16, 2024 · 2 comments · May be fixed by #31116
Comments


uwu-420 commented May 16, 2024

Hi :)

I think that use_cache is supposed to be passed through here as well:

```python
def prepare_inputs_for_generation(
    self, input_ids, cache_params: Optional[MambaCache] = None, inputs_embeds=None, attention_mask=None, **kwargs
):
    # only last token for inputs_ids if the state is passed along.
    if cache_params is not None:
        input_ids = input_ids[:, -1].unsqueeze(-1)

    if inputs_embeds is not None and cache_params is None:
        model_inputs = {"inputs_embeds": inputs_embeds}
    else:
        model_inputs = {"input_ids": input_ids}

    model_inputs["cache_params"] = cache_params
    return model_inputs
```
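For clarity, the change I have in mind is just forwarding the flag into `model_inputs`. A rough sketch (illustrative only, the extra parameter is my suggestion, not existing code):

```python
# Sketch only: forward use_cache so the forward pass knows to build and
# return the MambaCache. The use_cache parameter is the proposed addition.
def prepare_inputs_for_generation(
    self, input_ids, cache_params=None, inputs_embeds=None, attention_mask=None, use_cache=None, **kwargs
):
    # only last token for input_ids if the state is passed along.
    if cache_params is not None:
        input_ids = input_ids[:, -1].unsqueeze(-1)

    if inputs_embeds is not None and cache_params is None:
        model_inputs = {"inputs_embeds": inputs_embeds}
    else:
        model_inputs = {"input_ids": input_ids}

    model_inputs["cache_params"] = cache_params
    model_inputs["use_cache"] = use_cache  # <- proposed addition
    return model_inputs
```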

I noticed this when I tried to retrieve the cache after calling model.generate: it was not there, even though I had set use_cache=True.
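Here is roughly how I hit it (a minimal sketch; the checkpoint name is just an example):

```python
# Minimal reproduction sketch; "state-spaces/mamba-130m-hf" is just an
# example Mamba checkpoint.
from transformers import AutoTokenizer, MambaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-130m-hf")
model = MambaForCausalLM.from_pretrained("state-spaces/mamba-130m-hf")

inputs = tokenizer("Hey, how are you?", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=10,
    use_cache=True,
    return_dict_in_generate=True,
)

# I expected to be able to get the cache back from `out`, but there is
# nothing there: no cache_params field, and past_key_values stays None.
print(getattr(out, "past_key_values", None))  # -> None
```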

Edit: I just saw that GenerateDecoderOnlyOutput would have to be adjusted as well. It would need to carry cache_params, similarly to past_key_values. I don't know whether you're okay with bloating that class even more.

```python
return GenerateDecoderOnlyOutput(
    sequences=input_ids,
    scores=scores,
    logits=raw_logits,
    attentions=decoder_attentions,
    hidden_states=decoder_hidden_states,
    past_key_values=model_kwargs.get("past_key_values"),
)
```
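Concretely, something like this (cache_params here is a hypothetical new field, just to show what I mean):

```python
# Sketch of what I mean: GenerateDecoderOnlyOutput growing a cache_params
# field alongside past_key_values. The field name is just my suggestion.
return GenerateDecoderOnlyOutput(
    sequences=input_ids,
    scores=scores,
    logits=raw_logits,
    attentions=decoder_attentions,
    hidden_states=decoder_hidden_states,
    past_key_values=model_kwargs.get("past_key_values"),
    cache_params=model_kwargs.get("cache_params"),  # <- hypothetical new field
)
```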

Cheers and thanks for the great work!

@amyeroberts (Collaborator) commented:
cc @gante

@gante (Member) commented May 28, 2024:

@zucchini-nlp could you have a look at this issue? 🤗

zucchini-nlp linked pull request #31116 on May 29, 2024 that will close this issue.