Problem with ResnetCRNN_varlen #39

Open
glmanhtu opened this issue Nov 30, 2020 · 0 comments

Hello @HHTseng,

I found your code really useful for understanding both the 3D and the CNN + LSTM architectures. However, I think there is a small problem in how the LSTM part handles variable-length sequences. Some videos have as few as 28 frames, and you pad them so that they all have 50 frames. When you decode the LSTM hidden units, though, you take the last frame: https://github.com/HHTseng/video-classification/blob/master/ResNetCRNN_varylength/functions.py#L276
For the padded videos, that last time step will be all zeros.

I think we have to use the second output of torch.nn.utils.rnn.pad_packed_sequence (the true length of each sequence) to decide which time step to decode for classification.
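To make the suggestion concrete, here is a minimal sketch, not the repository's actual code. It assumes a single-layer, unidirectional, batch-first LSTM; the tensor sizes, the `lengths` values, and names such as `last_valid` are made up for illustration. It uses the lengths returned by `pad_packed_sequence` to pick the last valid time step of each sequence instead of `out[:, -1, :]`.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Hypothetical sizes: 3 videos padded to 50 frames, 8-dim features per frame.
batch, max_len, feat, hidden = 3, 50, 8, 16
lengths = torch.tensor([50, 28, 35])  # true frame counts (illustrative values)

# Zero-padded per-frame features with shape (batch, time, feature).
x = torch.randn(batch, max_len, feat)
for i, n in enumerate(lengths.tolist()):
    x[i, n:] = 0.0

lstm = nn.LSTM(feat, hidden, batch_first=True)

# Pack so the LSTM stops at each sequence's true length.
packed = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
packed_out, (h_n, c_n) = lstm(packed)

# Second return value holds the true length of every sequence in the batch.
out, out_lengths = pad_packed_sequence(
    packed_out, batch_first=True, total_length=max_len
)

# out[:, -1, :] is all zeros for any sequence shorter than max_len,
# so gather the output at index (length - 1) for each sequence instead.
idx = (out_lengths - 1).view(-1, 1, 1).expand(-1, 1, hidden)
last_valid = out.gather(1, idx).squeeze(1)  # shape: (batch, hidden)
```

For this single-layer, unidirectional setup, `h_n[-1]` should contain the same values as `last_valid`, so it could be used directly as a simpler alternative.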

Please let me know what you think.
