Size of Tensor A must match size of Tensor B #1540
Comments
Can you provide more of the stack trace? My guess is that attention sinks in transformers is not bug-free. Separately, I recommend using Mixtral through vLLM in general. It will likely be hard to make Mixtral run for long sequences either way, and it already supports 32k of total input+output.
Noted. I will try again after removing --max_seq_len=4096. Also, where are the error log files saved in the h2oGPT folder (so that I can send you the stack trace)? Can one dump the CLI output to a file by appending "> /home/user/dump" to the CLI startup script?
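(A general shell note, not something stated in the thread: a plain ">" captures only stdout, while Python tracebacks are written to stderr, so redirect both streams. The "..." below stands for your usual generate.py flags.)

# Redirect stdout to the file and send stderr (where tracebacks go) to the same place
python generate.py ... > /home/user/dump 2>&1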
I tried using Mixtral with vLLM and did the following:
Issue: Unable to connect to the inference server. After starting the inference server, if I run the curl test against it, I get a connection-refused error at the port.

The Gradio dump:
Using Model mistralai/mixtral-8x7b-instruct-v0.1
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
HF Client End: http://http: mistralai/Mixtral-8x7B-Instruct-v0.1 : None
To create a public link, set share=True in launch().
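(For reference, a minimal connectivity check against a vLLM OpenAI-compatible server; port 5000 is an assumption, use whatever port the server was actually started with:)

# Should return a JSON list containing the served model if the server is reachable
curl http://127.0.0.1:5000/v1/models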
If you look at the trace, you have an odd "Begin: http:://127.0.0.1:8080" with an extra ":". As in the docs, with vLLM one would do something like the sketch below.
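(A sketch of the documented pattern, not the maintainer's exact command; host 127.0.0.1 and port 5000 are assumptions:)

# Start a vLLM server exposing an OpenAI-compatible API
python -m vllm.entrypoints.openai.api_server --model=mistralai/Mixtral-8x7B-Instruct-v0.1 --port=5000

# Point h2oGPT at it; note the single vllm: prefix with no duplicated http:// or stray colons
python generate.py --inference_server=vllm:127.0.0.1:5000 --base_model=mistralai/Mixtral-8x7B-Instruct-v0.1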
Hi,
I am trying to do a RAG query on a large PDF file and get the error below:
Error: The size of tensor a (3351) must match the size of tensor b (4096) at non-singleton dimension 3.
The run script: python generate.py --base_model=mistralai/Mixtral-8x7B-Instruct-v0.1 --pre_load_embedding_model=True --score_model=None --enable_tts=False --enable_stt=False --enable_transcriptions=False --auth=auth.json --system_prompt="My name is H2O-GPT and I am an intelligent AI" --attention_sinks=True --max_new_tokens=100000 --max_max_new_tokens=100000 --top_k_docs=-1 --use_gpu_id=False --max_seq_len=4096 --sink_dict="{'num_sink_tokens': 4, 'window_length': 4096}"