This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

crash when running mosaicml/mpt-7b-* models: KeyError: 'attention_mask' #213

Open
tarasglek opened this issue Jun 12, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@tarasglek

from basaran.model import load_model

model = load_model('mosaicml/mpt-7b-storywriter', trust_remote_code=True, load_in_8bit=True)

for choice in model("once upon a time"):
    print(choice)
Traceback (most recent call last):
  File "/home/taras/Documents/ctranslate2/basaran/run.py", line 7, in <module>
    for choice in model("once upon a time"):
  File "/home/taras/Documents/ctranslate2/basaran/.venv/lib/python3.9/site-packages/basaran/model.py", line 73, in __call__
    for (
  File "/home/taras/Documents/ctranslate2/basaran/.venv/lib/python3.9/site-packages/basaran/model.py", line 233, in generate
    inputs = self.model.prepare_inputs_for_generation(
  File "/home/taras/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-storywriter/8667424ea9d973d3c01596fcbb86a3a8bc164299/modeling_mpt.py", line 280, in prepare_inputs_for_generation
    attention_mask = kwargs['attention_mask'].bool()
KeyError: 'attention_mask'
@tarasglek tarasglek changed the title crash when running storywriter model crash when running storywriter model: KeyError: 'attention_mask' Jun 12, 2023
@tarasglek tarasglek changed the title crash when running storywriter model: KeyError: 'attention_mask' crash when running mosaicml/mpt-7b-* models: KeyError: 'attention_mask' Jun 12, 2023
@tarasglek
Author

The same thing happens with mosaicml/mpt-7b-instruct.

@fardeon
Member

fardeon commented Jun 13, 2023

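The error appears to originate in MPT's own remote code: per the traceback, prepare_inputs_for_generation in modeling_mpt.py reads kwargs['attention_mask'] unconditionally, and the call from basaran/model.py does not pass one. We will conduct further testing.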
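
For anyone following along, here is a minimal sketch of the failure pattern and one possible caller-side workaround. It uses plain-Python stand-ins only: `prepare_inputs_strict` and `call_with_default_mask` are hypothetical names, not functions from MPT, Transformers, or Basaran, and the token ids are made up. It also assumes a single unpadded prompt, where an all-ones mask is a reasonable default. This is not a confirmed fix.

```python
# Stand-in code for illustration only; none of these names exist in MPT or Basaran.

def prepare_inputs_strict(input_ids, **kwargs):
    """Mimics the line in the traceback: the mask is indexed unconditionally."""
    attention_mask = [bool(m) for m in kwargs["attention_mask"]]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


def call_with_default_mask(prepare, input_ids, **kwargs):
    """Caller-side workaround: supply an all-ones mask when none was given.

    Assumption: the prompt is unpadded, so every position should be attended to.
    """
    kwargs.setdefault("attention_mask", [1] * len(input_ids))
    return prepare(input_ids, **kwargs)


token_ids = [101, 2026, 2003]  # made-up token ids

# Calling without a mask raises KeyError, as in the report.
missing_key = None
try:
    prepare_inputs_strict(token_ids)
except KeyError as exc:
    missing_key = exc.args[0]

# Calling through the wrapper fills in the mask and succeeds.
inputs = call_with_default_mask(prepare_inputs_strict, token_ids)
```

If the caller always forwards a mask, the strict lookup is harmless. The alternative would be for the model code to fall back to a default when the key is absent, but that would be a change on the MPT side.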

@fardeon fardeon added the bug Something isn't working label Jun 13, 2023