BUG: Incorrect temporal indexing? #97

Open
rosenfeldamir opened this issue May 1, 2023 · 0 comments
rosenfeldamir commented May 1, 2023

The function loadvideo_decord samples frames from the video according to clip_len and frame_sample_rate.
The start of the clip is randomized. Let's say for simplicity that the first frame is 0.
Also, assume clip_len is 4 and frame_sample_rate is 6.
I expect to get frames 0, 6, 12, 18.
However, I get frames 0, 8, 16, 24, which means the effective frame_sample_rate is 8!

def loadvideo_decord(self, sample, sample_rate_scale=1):

This also happens for the more "conventional" example of frame_sample_rate = 4 and clip_len=16, as used in the script for vit_large.

Here, np.diff(index) returns array([4, 4, 4, 5, 4, 4, 4, 5, 4, 4, 4, 5, 4, 4, 4]), because the code tries to pick 16 frames from a span of 64 frames, whereas it should pick them from a span of 60 frames (15 gaps of 4 frames each).
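A minimal sketch that reproduces the spacing described above. It assumes the indices are produced by np.linspace over a window of clip_len * frame_sample_rate frames with both endpoints included. The helper name sample_indices is hypothetical, and the repository's function may differ in detail (for example, by clamping the final index).

```python
import numpy as np

def sample_indices(start, clip_len, frame_sample_rate):
    # Assumed reconstruction of the sampling step, not the repository's code:
    # spread clip_len indices over a window of clip_len * frame_sample_rate frames.
    converted_len = int(clip_len * frame_sample_rate)
    index = np.linspace(start, start + converted_len, num=clip_len)
    return index.astype(np.int64)

small = sample_indices(0, clip_len=4, frame_sample_rate=6)
large = sample_indices(0, clip_len=16, frame_sample_rate=4)

print(small)           # [ 0  8 16 24]: spacing of 8 instead of 6
print(np.diff(large))  # a mix of 4s and 5s instead of a constant 4
```

The window has clip_len - 1 gaps between clip_len samples, so the spacing comes out as clip_len * frame_sample_rate / (clip_len - 1), which is larger than frame_sample_rate.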
I suggest fixing this by changing the line
converted_len = int(self.clip_len * self.frame_sample_rate)
to
converted_len = int((self.clip_len - 1) * self.frame_sample_rate)
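A sketch of the proposed change under the same assumption that the indices come from np.linspace with both endpoints included. The helper name sample_indices_fixed is hypothetical and stands in for the relevant lines of loadvideo_decord.

```python
import numpy as np

def sample_indices_fixed(start, clip_len, frame_sample_rate):
    # Proposed window: (clip_len - 1) gaps of frame_sample_rate frames each.
    converted_len = int((clip_len - 1) * frame_sample_rate)
    index = np.linspace(start, start + converted_len, num=clip_len)
    return index.astype(np.int64)

small_fixed = sample_indices_fixed(0, clip_len=4, frame_sample_rate=6)
large_fixed = sample_indices_fixed(0, clip_len=16, frame_sample_rate=4)

print(small_fixed)           # [ 0  6 12 18]
print(np.diff(large_fixed))  # every gap equals 4
```

If the surrounding code also clamps indices to stay below the end of the window, that clamp would likely need to be adjusted along with this change, since the last sample now falls exactly on the window end.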
This is at the very core of VideoMAE. Please correct me if I'm wrong or have misunderstood something.
