About CMD dataset #5

Open
liziming5353 opened this issue Apr 8, 2024 · 1 comment

Comments

@liziming5353

  1. The original videos are in .mkv format, but your code expects image frames. Do we need to preprocess the videos into frames first?
  2. Is caption.json the subtitle file for the CMD dataset? I find that its content does not match the videos.
@KerolosAtef
Collaborator

Hello @liziming5353, sorry for the late reply.
I have upgraded the CMD data loader to support .mp4 videos (in your case, you can change the extension to .mkv in the get_item function). I have also removed the caption.json file and adapted the code to read subtitles.vtt files, as the webvid and videochatgpt data loaders do.
Please check these updates and let me know if you have any other questions.
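
For anyone who wants to inspect a subtitles.vtt file before running the loader, here is a minimal sketch of reading WebVTT cues in plain Python. It is not the repository's loader code: the function name `parse_vtt` and the returned `(start, end, text)` tuples are assumptions made for illustration, and the parser only handles basic cues (timing line followed by text lines), ignoring cue settings and styling.

```python
import re
from typing import List, Optional, Tuple

# Matches a WebVTT timing line such as "00:00:01.000 --> 00:00:04.000".
# The hours field is optional in WebVTT, so "00:01.000 --> 00:04.000" also matches.
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _to_seconds(h: Optional[str], m: str, s: str, ms: str) -> float:
    """Convert the captured timestamp fields to seconds."""
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_vtt(text: str) -> List[Tuple[float, float, str]]:
    """Hypothetical helper: return (start_sec, end_sec, caption) for each cue."""
    cues = []
    # Cues (and the WEBVTT header, NOTE blocks, etc.) are separated by blank lines.
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                groups = match.groups()
                start = _to_seconds(*groups[:4])
                end = _to_seconds(*groups[4:])
                # Everything after the timing line is the caption text.
                caption = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append((start, end, caption))
                break  # Blocks without a timing line are skipped entirely.
    return cues
```

Calling `parse_vtt(open(path, encoding="utf-8").read())` on a subtitle file and comparing the cue times against the video is one way to check whether the subtitles actually line up with a given clip.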
