The `loadvideo_decord` function samples frames from the video using the clip length (`clip_len`) and the frame sample rate (`frame_sample_rate`).
The start of the clip is randomized. Let's say for simplicity that the first frame is 0.
Also, assume the clip length is 4 and the frame_sample_rate is 6.
I expect to get frames 0,6,12,18.
However, I get frames 0,8,16,24, which means the effective frame_sample_rate is 8!
This also happens for the more "conventional" example of `frame_sample_rate = 4` and `clip_len = 16`, as used in the script for vit_large.
Here, `np.diff(index)` returns `array([4, 4, 4, 5, 4, 4, 4, 5, 4, 4, 4, 5, 4, 4, 4])`, because the code tries to spread 16 frames over a range of 64 frames, whereas it should spread them over a range of 60 frames.
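As a rough check of these numbers, assume the indices are evenly spaced over a range of length $L$ with both endpoints included. Here $N$ and $r$ are just shorthand for `clip_len` and `frame_sample_rate`. $N$ frames are separated by $N - 1$ gaps, so the effective stride is:

$$
s_{\text{eff}} = \frac{L}{N - 1}, \qquad L = N r \;\Rightarrow\; s_{\text{eff}} = \frac{N r}{N - 1}
$$

With $N = 4$, $r = 6$ this gives $24 / 3 = 8$. With $N = 16$, $r = 4$ it gives $64 / 15 \approx 4.27$, which turns into a mix of 4s and 5s once the indices are truncated to integers. A range of $L = (N - 1) r$ would give a stride of exactly $r$.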
I suggest fixing this by changing the line `converted_len = int(self.clip_len * self.frame_sample_rate)`
to `converted_len = int((self.clip_len - 1) * self.frame_sample_rate)`
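To make the comparison easy to reproduce, here is a minimal sketch of the two variants. It is not the repository code: `sample_indices` and `use_fix` are hypothetical names, and it assumes the indices come from `np.linspace` over `[start, start + converted_len]` and are then truncated to integers, leaving out any clipping or short-video handling the real function may do.

```python
import numpy as np


def sample_indices(start, clip_len, frame_sample_rate, use_fix=False):
    """Hypothetical helper: evenly spaced frame indices for one clip.

    Assumption: indices are taken with np.linspace over
    [start, start + converted_len] and truncated to integers.
    Boundary handling is deliberately left out.
    """
    if use_fix:
        # Proposed: span covers the clip_len - 1 gaps between frames
        converted_len = int((clip_len - 1) * frame_sample_rate)
    else:
        # Current: span is one full stride too long
        converted_len = int(clip_len * frame_sample_rate)
    index = np.linspace(start, start + converted_len, num=clip_len)
    return index.astype(np.int64)


current = sample_indices(0, clip_len=4, frame_sample_rate=6)
fixed = sample_indices(0, clip_len=4, frame_sample_rate=6, use_fix=True)
print(current.tolist())  # [0, 8, 16, 24]
print(fixed.tolist())    # [0, 6, 12, 18]

# With the proposed change, the 16-frame / rate-4 case has a uniform stride
fixed_16 = sample_indices(0, clip_len=16, frame_sample_rate=4, use_fix=True)
print(np.diff(fixed_16).tolist())  # fifteen 4s
```

Under these assumptions the proposed `converted_len` yields exactly the frames I expected, while the current one stretches the stride.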
This is at the very core of VideoMAE. Please correct me if I'm wrong or have misunderstood something.
Relevant code: VideoMAE/kinetics.py, line 222 at commit 14ef8d8.