
If I input more than the max_seq_length? #40

Open

runwean opened this issue Aug 9, 2023 · 4 comments

Comments
@runwean
runwean commented Aug 9, 2023

I see that the sgpt-bloom-7b1-msmarco model has a maximum sequence length of 300, but if I input more than that (for example, more than 400 Chinese characters), it still produces an embedding. However, going beyond roughly 500 characters no longer seems to change the resulting vector.

Does that mean I can input at most about 500 Chinese characters?

@Muennighoff
Owner

Yes, you can input more characters. The result may not change because you also need to raise max_seq_length - check this issue: #23 (comment)

If it still does not work, please provide the exact code you are using.
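The behavior described in the question can be sketched in plain Python. This is a minimal illustration, not the actual SGPT code: `MAX_SEQ_LENGTH` and `truncate()` are hypothetical stand-ins for the tokenizer's real truncation step, showing why two inputs that differ only past the limit produce identical embeddings until the limit is raised.

```python
# Minimal sketch of why over-long inputs yield identical embeddings:
# a truncating tokenizer discards everything past max_seq_length, so the
# model never sees the extra tokens. MAX_SEQ_LENGTH and truncate() are
# hypothetical stand-ins for the real tokenizer behavior.

MAX_SEQ_LENGTH = 300  # the limit discussed in this thread

def truncate(token_ids, max_len=MAX_SEQ_LENGTH):
    """Keep only the first max_len tokens, as a truncating tokenizer would."""
    return token_ids[:max_len]

# Two inputs that agree on their first 400 tokens but differ in length.
input_400 = list(range(400))
input_500 = list(range(500))

# After truncation to 300 tokens the model sees the same sequence for both,
# so the resulting embeddings would be identical.
assert truncate(input_400) == truncate(input_500)
print(len(truncate(input_500)))  # 300
```

Raising the limit before encoding (e.g. `model.max_seq_length = 500` in sentence-transformers, as discussed in issue #23) moves the truncation point, so tokens beyond 300 then start to affect the embedding.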

@runwean
Author

runwean commented Aug 9, 2023

Thanks for the reply.
My understanding is that because the model was trained with sequences of up to 300 tokens, raising the input length slightly (say, to 500) may give similar results, but a much larger increase could hurt quality, since the training samples were never that long 🤔

@Muennighoff
Owner

> Thanks for the reply. My understanding is that because the model was trained with sequences of up to 300 tokens, raising the input length slightly (say, to 500) may give similar results, but a much larger increase could hurt quality, since the training samples were never that long 🤔

Yeah, it'd be really interesting to know how performance holds up at longer sequences. If you run any experiments and gather data on how it performs, it would be amazing if you could share it 🚀

@runwean runwean closed this as completed Aug 9, 2023
@runwean runwean reopened this Aug 9, 2023
@runwean
Author

runwean commented Aug 9, 2023

> Thanks for the reply. My understanding is that because the model was trained with sequences of up to 300 tokens, raising the input length slightly (say, to 500) may give similar results, but a much larger increase could hurt quality, since the training samples were never that long 🤔

> Yeah, it'd be really interesting to know how performance holds up at longer sequences. If you run any experiments and gather data on how it performs, it would be amazing if you could share it 🚀

Thank you!
