Bug Description

```
  File "/home/dperi/.pyenv/versions/3.11.7/lib/python3.11/site-packages/torch/functional.py", line 126, in broadcast_shapes
    raise RuntimeError(f"Trying to create tensor with negative dimension ({shape[i]}): ({shape[i]})")
RuntimeError: Trying to create tensor with negative dimension (-1): (-1)

While executing %where : [num_users=1] = call_function[target=torch.ops.aten.where.self](args = (%slice_4, %div, %_frozen_param2), kwargs = {_itensor_to_tensor_meta: {<tensorrt.tensorrt.ITensor object at 0x7f4f65252730>: ((1, s0), torch.int64, False, (s0, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4f64a51370>: ((1, s0), torch.int64, False, (s0, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4f64aea3b0>: None, <tensorrt.tensorrt.ITensor object at 0x7f4f65389b70>: None, <tensorrt.tensorrt.ITensor object at 0x7f4f64d29330>: ((s0,), torch.int64, False, (1,)
```
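The failing frame is `torch.broadcast_shapes` rejecting a negative dimension. The same `RuntimeError` can be reproduced in isolation (a minimal sketch, independent of TensorRT; the `-1` here stands in for the unresolved dynamic dimension):

```python
import torch

# broadcast_shapes validates every dimension; a -1 (e.g. an unresolved
# dynamic-shape placeholder leaking into a concrete shape) raises the
# RuntimeError seen in the traceback above.
try:
    torch.broadcast_shapes((1, -1), (3,))
except RuntimeError as e:
    print(e)  # Trying to create tensor with negative dimension (-1): (-1)
```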
To Reproduce

Using the https://github.com/pytorch/TensorRT/tree/dyn_llama branch.

Steps to reproduce the behavior:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import torch_tensorrt

torch_device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# add the EOS token as PAD token to avoid warnings
model = AutoModelForCausalLM.from_pretrained("gpt2", pad_token_id=tokenizer.eos_token_id).to(torch_device)

# encode context the generation is conditioned on
model_inputs = tokenizer('I enjoy walking with my cute dog', return_tensors='pt').to(torch_device)
inputs = model_inputs['input_ids']
pyt_outputs = model(inputs)

# mark the sequence dimension as dynamic before compiling
torch._dynamo.mark_dynamic(model_inputs['input_ids'], 1, min=2, max=1024)
model.forward = torch.compile(
    model.forward,
    backend="tensorrt",
    dynamic=None,
    options={
        "truncate_long_and_double": True,
        "debug": True,
        "enabled_precisions": {torch.float},
        "min_block_size": 1,
    },
)
trt_outputs = model(inputs)
print("Difference: ", torch.sum(pyt_outputs[0] - trt_outputs[0]))

breakpoint()

# generate 40 new tokens
greedy_output = model.generate(**model_inputs, max_new_tokens=40)
print("Output:\n" + 100 * '-')
print(tokenizer.decode(greedy_output[0], skip_special_tokens=True))
```
Expected behavior

torch.compile with the "tensorrt" backend should compile the model without errors, and the TensorRT outputs should match the PyTorch outputs.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages.

- How you installed PyTorch (conda, pip, libtorch, source):