error while executing the Google Colab inference_nt notebook #66

Closed
amitpande74 opened this issue May 4, 2024 · 1 comment

Comments

@amitpande74

I am trying to adapt the script to my gene of interest, and I repeatedly get this error:
ValueError: Input length must be divisible by the 2 to the power of number of poolign layers.

I read your article as well, but I could not find the sequence length accepted by your model anywhere in it.
Could you kindly help?

@dallatt
Contributor

dallatt commented May 24, 2024

Hello,

Because of the convolutions in the model, the input sequences are expected to satisfy this criterion. First, a given sequence needs to have a length that is a multiple of 6 in order to be correctly tokenized by the 6-mer tokenizer, in which each token corresponds to 6 nucleotides (you can find more information about this in the README.md).
Second, the tokenized sequence length must be divisible by 4 because of the convolutional layers.

All in all, the length of the sequence passed to the model needs to be divisible by 24 (6*4).
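
As a rough illustration of the rule above, here is a minimal sketch that checks a nucleotide sequence against the multiple-of-24 requirement and trims it if needed. The helper names `is_valid_length` and `trim_to_valid_length` are hypothetical and not part of the repository, and trimming the end of the sequence is only one possible way to meet the constraint; depending on your gene, choosing a different window may be more appropriate.

```python
# Hypothetical helpers, not part of the nucleotide-transformer repository.
KMER_SIZE = 6        # each token covers 6 nucleotides (6-mer tokenizer)
TOKEN_MULTIPLE = 4   # the token count must be divisible by 4
REQUIRED_MULTIPLE = KMER_SIZE * TOKEN_MULTIPLE  # 24 nucleotides


def is_valid_length(sequence: str) -> bool:
    """Return True if the sequence is non-empty and its length is a multiple of 24."""
    return len(sequence) > 0 and len(sequence) % REQUIRED_MULTIPLE == 0


def trim_to_valid_length(sequence: str) -> str:
    """Drop trailing nucleotides so the length is the largest multiple of 24."""
    usable = (len(sequence) // REQUIRED_MULTIPLE) * REQUIRED_MULTIPLE
    return sequence[:usable]


gene = "ACGT" * 25  # 100 nucleotides, not a multiple of 24
trimmed = trim_to_valid_length(gene)
print(len(gene), is_valid_length(gene))        # 100 False
print(len(trimmed), is_valid_length(trimmed))  # 96 True
```

Sequences shorter than 24 nucleotides come back empty from `trim_to_valid_length`, so it is worth checking the result with `is_valid_length` before passing it to the tokenizer.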

I hope this helps,
Hugo

@dallatt dallatt closed this as completed May 24, 2024