prepare_dataset.py issue #1582

Open
Fred-cell opened this issue May 12, 2024 · 1 comment
Labels: bug (Something isn't working), triaged (Issue has been triaged by maintainers)

Comments

Fred-cell commented May 12, 2024

When preparing a dataset for gptManagerBenchmark (v0.9.0), I encountered the following error:
]#python prepare_dataset.py --request-rate -1 --time-delay-dist constant --time-delay-dist constant --tokenizer /code/tensorrt-llm/chatglm3-6b/ token-norm-dist --num-requests 16 --input-mean 1024 --input-stdev 0 --output-mean 512 --output-stdev 0
The repository for /code/tensorrt-llm/chatglm3-6b/ contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//code/tensorrt-llm/chatglm3-6b/.
You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
Traceback (most recent call last):
  File "/code/tensorrt-llm/TensorRT-LLM/benchmarks/cpp/prepare_dataset.py", line 109, in <module>
    cli()
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1685, in invoke
    super().invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/code/tensorrt-llm/TensorRT-LLM/benchmarks/cpp/prepare_dataset.py", line 94, in cli
    ctx.obj = RootArgs(tokenizer=kwargs['tokenizer'],
  File "/usr/local/lib/python3.10/dist-packages/pydantic/main.py", line 171, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
  File "/code/tensorrt-llm/TensorRT-LLM/benchmarks/cpp/prepare_dataset.py", line 44, in get_tokenizer
    tokenizer.pad_token = tokenizer.eos_token
AttributeError: can't set attribute 'pad_token'

Collaborator

byshiue commented May 15, 2024

You could change

        tokenizer.pad_token = tokenizer.eos_token

to

        if tokenizer.pad_token is None:
            tokenizer.pad_token = tokenizer.eos_token

We will fix it in the next update.
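
For anyone wanting to see why the guard helps, here is a minimal sketch that runs without transformers. It assumes the error arises because the tokenizer's custom code exposes `pad_token` as a property with no setter, which is what the AttributeError suggests but is not confirmed in this thread. The classes `ReadOnlyPadTokenizer` and `PlainTokenizer`, the helper `ensure_pad_token`, and the token strings are all hypothetical stand-ins, not part of TensorRT-LLM or transformers.

```python
class ReadOnlyPadTokenizer:
    """Hypothetical stand-in: pad_token is a property without a setter (assumed failure mode)."""

    eos_token = "</s>"  # placeholder value, not the real ChatGLM3 token

    @property
    def pad_token(self):
        return "<unk>"  # placeholder value


class PlainTokenizer:
    """Hypothetical stand-in: an ordinary tokenizer with no pad token configured."""

    def __init__(self):
        self.eos_token = "</s>"
        self.pad_token = None


def ensure_pad_token(tokenizer):
    # Same guard as suggested above: assign only when no pad token is set,
    # so a tokenizer that already provides one is never written to.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer


def unconditional_assignment_fails(tokenizer):
    # Reproduces the original line 44 behaviour on the stand-in class.
    try:
        tokenizer.pad_token = tokenizer.eos_token
    except AttributeError:
        return True
    return False
```

With the guard, the read-only tokenizer keeps its own pad token and the plain tokenizer falls back to its EOS token. Note that the guard only avoids the write when a pad token already exists; a tokenizer with a read-only `pad_token` that returns `None` would still fail.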

byshiue self-assigned this May 15, 2024
byshiue added the bug and triaged labels May 15, 2024