[BUG] or [QUESTION] : Hardsubx didn't extract burn-in subs exactly as expected #1411

Open
7 tasks done
brebetez opened this issue Feb 1, 2022 · 2 comments
@brebetez

brebetez commented Feb 1, 2022

Please prefix your issue with one of the following: [BUG], [QUESTION].

CCExtractor version: 0.94
CCExtractor detailed version info
Git commit: 290e2f1
Compilation date: 2021-12-27
Libraries used by CCExtractor
Tesseract Version: 4.1.1
Leptonica Version: leptonica-1.79.0
libGPAC Version: 1.0.1
zlib: 1.2.11
utf8proc Version: 2.4.0
protobuf-c Version: 1.3.1
libpng Version: 1.6.37
FreeType
libhash
nuklear
libzvbi

In raising this issue, I confirm the following:

  • I have read and understood the contributors guide.
  • I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • I have checked that the issue I'm posting isn't already reported.
  • I have checked that the issue I'm posting isn't already solved and that no duplicates exist in closed or open issues.
  • I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
  • I have used the latest available version of CCExtractor to verify this issue exists.
  • I have ticked all the boxes in this section and to prove it I'm deleting the section completely to remove boilerplate text.

Necessary information

  • Is this a regression (i.e. did it work before)? {DON'T KNOW}
  • What platform did you use? {Linux}
  • What were the used arguments? myVideo.mp4 -ocrlang fra -hardsubx -ocr_mode frame -subcolor white -min_sub_duration 0.01 -detect_italics -whiteness_thresh 97 -conf_thresh 75

Video links

  • {You can ask for an example if needed. I ran the tool on 24 videos with burned-in subtitles of varying look and feel, and saw the same issues every time.}

Additional information

{I have several issues:

  • Subtitle timing: The output subtitles do not line up in time with the burned-in subtitles. Sometimes a subtitle starts early, anywhere from a few frames to about 2 seconds before the burned-in subtitle appears. I tried changing the -ocr_mode parameter, but the offset did not change.
  • Subtitle duration: Every extracted subtitle has a fixed duration of either 1 or 2 seconds, never anything shorter, longer, or in between. As a result, the output subtitle disappears before the burned-in subtitle does.
  • Burned-in subtitles on 2 lines: In the output, usually only the second line is recognized. Otherwise it creates 2 or 3 subtitles, each containing a little more of the text, but always on a single line, not laid out as in the burned-in subtitle.
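The duration point can be checked with a small script. This is only a rough sketch: it assumes the output file is in SRT format, and the names `cue_durations_ms`, `duration_histogram` and `print_report` are made up for illustration (they are not part of CCExtractor). It counts how many cues have each duration, so a result showing only 1.000 s and 2.000 s would confirm the behavior described above.

```python
import re
from collections import Counter

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(h, m, s, ms):
    """Convert the hour/minute/second/millisecond fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def cue_durations_ms(srt_text):
    """Return the duration of every cue in the SRT text, in milliseconds."""
    durations = []
    for match in TIMING.finditer(srt_text):
        fields = match.groups()
        start = to_ms(*fields[:4])
        end = to_ms(*fields[4:])
        durations.append(end - start)
    return durations


def duration_histogram(srt_text):
    """Count how many cues share each duration (in milliseconds)."""
    return Counter(cue_durations_ms(srt_text))


def print_report(path):
    """Print one line per distinct duration found in the SRT file at path."""
    with open(path, encoding="utf-8", errors="replace") as f:
        histogram = duration_histogram(f.read())
    for duration, count in sorted(histogram.items()):
        print(f"{duration / 1000:.3f} s: {count} cue(s)")
```

For example, `print_report("myVideo.srt")` would list the distinct durations, where `myVideo.srt` is a placeholder for whatever output file the run produced.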

I don't know if this is a bug or if I am using the parameters incorrectly, but as I said, I tried many different combinations, always with the same result. That is why I'm opening an issue.

Thanks for your feedback and help.}

@PunitLodha PunitLodha added the HardsubX Issues related to extraction of HardsubX label Feb 6, 2022
@PunitLodha
Member

Please share the videos so that we can look into this issue.

@shashwat1002
Contributor

shashwat1002 commented Jun 2, 2022

On investigating, at least on the files I have, some subtitles are extracted with a duration of less than a second.
So I guess the second point is not entirely general: either the behavior has changed since the issue was filed, or it's an artefact of the files used.

@brebetez please consider sharing the file you used

cc: @PunitLodha
