Add new option to generate subtitles by a specific number of words #548

Open. amolinasalazar wants to merge 4 commits into main.

amolinasalazar
Contributor

Added a new word option called --max_words_count that will generate subtitles setting a maximum limit of words per segment.

I've opened the same PR in Whisper: https://github.com/openai/whisper/pull/1729 with some slight changes:

  • All options are compatible: "highlight_words", "max_line_count", "max_line_width" and "max_words_count".
  • Tested with/without diarization.
  • Please pay attention to how I handle "times" in iterate_subtitles() in the util.py file. I wanted to minimize the impact on the current behavior, but I changed the times we yield so that they now match the chunk of words being emitted instead of the original segment.
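
To illustrate the last point, here is a minimal sketch of deriving a subtitle's times from its chunk of words rather than from the whole segment. It is not the code in this PR: `chunk_times` is a hypothetical helper, and the word dictionaries with `start` and `end` keys are an assumed shape. Words without timestamps fall back to the segment times.

```python
from typing import Any, Dict, List, Tuple


def chunk_times(
    chunk: List[Dict[str, Any]],
    segment_start: float,
    segment_end: float,
) -> Tuple[float, float]:
    """Return (start, end) for a chunk of words.

    Hypothetical helper: uses the earliest word start and the latest word end,
    falling back to the segment times when no word carries a timestamp.
    """
    starts = [w["start"] for w in chunk if "start" in w]
    ends = [w["end"] for w in chunk if "end" in w]
    start = min(starts) if starts else segment_start
    end = max(ends) if ends else segment_end
    return start, end


# Assumed segment shape, for illustration only.
segment = {
    "start": 0.0,
    "end": 2.5,
    "words": [
        {"word": "word1", "start": 0.0, "end": 0.4},
        {"word": "word2", "start": 0.5, "end": 0.9},
        {"word": "word3", "start": 1.0, "end": 1.4},
        {"word": "word4", "start": 1.6, "end": 2.0},
        {"word": "word5", "start": 2.1, "end": 2.5},
    ],
}

first = chunk_times(segment["words"][:3], segment["start"], segment["end"])
second = chunk_times(segment["words"][3:], segment["start"], segment["end"])
```

With these sample values, the first subtitle spans the first three words only, instead of inheriting the full segment range.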

The rest is just a copy/paste of the original PR description:

Added a new word option called --max_words_count that will generate subtitles setting a maximum limit of words per segment. This may sound similar to the --max_line_width option, but the results are more pleasant for readers IMHO. Here are a couple of comparisons using .SRT files:

[Image: max_word_count]
Notice that --max_words_count works as an upper bound on the number of words, but it still respects segment boundaries: the last chunk of a segment can have fewer words when fewer than max_words_count words remain in that segment.
e.g. Segment = [word1, word2, word3, word4, word5] and max_words_count = 3
=> Result = [word1, word2, word3] and [word4, word5]
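
The example above can be sketched in a few lines of Python. This is only an illustration of the chunking rule, not the implementation in this PR; `chunk_words` is a hypothetical name, and each segment is chunked on its own so words never cross a segment boundary.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_words(words: Sequence[T], max_words_count: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most max_words_count words.

    Hypothetical helper: the final chunk is shorter when fewer words remain.
    """
    if max_words_count < 1:
        raise ValueError("max_words_count must be at least 1")
    for i in range(0, len(words), max_words_count):
        yield list(words[i:i + max_words_count])


segment = ["word1", "word2", "word3", "word4", "word5"]
chunks = list(chunk_words(segment, 3))
print(chunks)
```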

This differs from --max_line_width, which can leave larger time gaps within a subtitle when it joins the end of one segment with the beginning of the next:
[Image: WidthvsWords]

Subtitles generated with --max_words_count look similar to what we see in Shorts, Reels and other short-duration videos.

This is my first contribution, so feel free to change, comment on, or improve anything.

Additional notes

  • Manually tested using Python and the CLI, and checked the results in .srt and .vtt files (.txt and .tsv files are not affected).

@amolinasalazar
Contributor Author

The main PR was merged in Whisper (https://github.com/openai/whisper/pull/1729). I'll apply the latest changes here.

@amolinasalazar
Contributor Author

Resolved merge conflicts and brought over the new naming from Whisper:

  • Renamed max_words_count to max_words_per_line: 6ba8d6a

Again, feel free to review/test/change anything.
