Hello @HHTseng,

I find your code really helpful for understanding both the 3D CNN and the CNN + LSTM architectures. However, I think there is a small problem in how you handle variable lengths in the LSTM part. Some videos have as few as 28 frames, and you pad them so that they all have 50 frames. But when you decode the LSTM hidden units, you take the last timestep: https://github.com/HHTseng/video-classification/blob/master/ResNetCRNN_varylength/functions.py#L276
which will be all zeros in those cases.
I think we have to rely on the second output of torch.nn.utils.rnn.pad_packed_sequence to decide which timestep to decode for classification.
Please let me know your opinion.
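A minimal sketch of the proposed fix, assuming a `batch_first` LSTM (the tensor shapes and variable names here are illustrative, not taken from the repository): instead of reading `output[:, -1, :]`, use the lengths returned as the second output of `pad_packed_sequence` to gather each sequence's last *valid* timestep.

```python
# Sketch: decode the last valid timestep per sequence, not the padded one.
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

# Two sequences padded to length 5; true lengths are 5 and 3.
x = torch.randn(2, 5, 8)
lengths = torch.tensor([5, 3])
x[1, 3:] = 0.0  # frames beyond length 3 are zero padding

packed = pack_padded_sequence(x, lengths, batch_first=True,
                              enforce_sorted=False)
packed_out, _ = lstm(packed)
# out_lengths is the second output: the true length of each sequence.
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

# out[:, -1, :] would be zeros for the shorter sequence; instead,
# index each sequence at its own last valid timestep (length - 1).
idx = (out_lengths - 1).view(-1, 1, 1).expand(-1, 1, out.size(2))
last_valid = out.gather(1, idx).squeeze(1)  # shape: (batch, hidden_size)
```

`last_valid` can then be fed to the classification head in place of `output[:, -1, :]`. Note that `pad_packed_sequence` fills padded positions with zeros, which is exactly why taking the last index is wrong for short sequences.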