QUESTION: Can't get MCC output to work with known good input #1542

Open
7 tasks done
PyCoder040 opened this issue Jun 7, 2023 · 1 comment

Comments


PyCoder040 commented Jun 7, 2023

CCExtractor detailed version info
Version: 0.94
Git commit: 5b76669
Compilation date: 2023-06-07
CEA-708 decoder: Rust
File SHA256: Could not open file

Libraries used by CCExtractor
Tesseract Version: 5.2.0
Leptonica Version: leptonica-1.82.0
libGPAC Version: 1.0.1
zlib: 1.2.11
utf8proc Version: 2.4.0
protobuf-c Version: 1.3.1
libpng Version: 1.6.37
FreeType
libhash
nuklear
libzvbi

In raising this issue, I confirm the following:

  • I have read and understood the contributors guide.
  • I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • I have checked that the issue I'm posting isn't already reported.
  • I have checked that the issue I'm posting isn't already solved and that no duplicates exist among closed or open issues.
  • I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
  • I have used the latest available version of CCExtractor to verify this issue exists.
  • I have ticked all the boxes in this section and to prove it I'm deleting the section completely to remove boilerplate text.

Necessary information

  • Is this a regression (i.e. did it work before)? {notsure}
  • What platform did you use? {Linux}
  • What were the used arguments? ccextractor -debug -in=raw -out=mcc 3.bin -o 4.mcc

Testing my input with -stdout, I get valid captions:

[user@p5810r hyperdeck]# ccextractor -debug -in=raw 3.bin -stdout

CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: Files (1): 3.bin
[Extract: 1] [Stream mode: McPoodle's raw]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: Yes] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: 3.bin
Analyzing data in McPoodle raw mode
Sending captions to stdout.
1
00:00:03,404 --> 00:00:06,171
 In this lesson, we're going to 
 be talking about finance. And  

2
00:00:06,173 --> 00:00:10,009
    one of the most important   
             aspects            
    of finance is interest.     



Total frames time:	  00:00:00:000  (0 frames at 29.97fps)

Min PTS:				00:00:00:001
Max PTS:				00:00:10:111
Length:				 00:00:10:110
Done, processing time = 0 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

But when I switch the output format to MCC, I get the error "Output format not supported":

[user@p5810r hyperdeck]# ccextractor -debug -in=raw -out=mcc 3.bin -o 4.mcc 
CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: Files (1): 3.bin
[Extract: 1] [Stream mode: McPoodle's raw]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: Yes] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .mcc] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: 3.bin
Analyzing data in McPoodle raw mode
Output format not supported
Output format not supported


Total frames time:	  00:00:00:000  (0 frames at 29.97fps)

Min PTS:				00:00:00:001
Max PTS:				00:00:10:111
Length:				 00:00:10:110
Done, processing time = 0 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

After reviewing the code (ccx_encoders_common.c), I see that this error can be generated when a 608 caption is to be written in MCC format. This leads me to believe that MCC output works only when the input is not treated as 608. I don't see where I can declare my payload to be non-608, so I suspect I'm simply doing this wrong. Do I need to provide 708 input, or can I tell ccextractor to "convert" to 708?

My input file was created by using subrip2scc.pl to convert an SRT file into SCC, then scc2raw.pl to convert the SCC into a McPoodle raw file. I suspect this produces a standard-definition (608) caption file. That is what I fed into ccextractor as shown above.

If I'm unknowingly making a 608-only file, this may be where I'm going wrong: I would need 708 captions to feed into ccextractor to get an MCC file.

Could someone post a working example of a command line that produces a valid MCC output file?
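One way to test the "608-only" suspicion is to look at the raw file directly. The sketch below is not taken from ccextractor; it assumes the McPoodle raw layout is a four-byte 0xFF header followed by two-byte CEA-608 pairs with a parity bit in the top bit of each byte. The names `read_608_pairs` and `summarize` are hypothetical, and the character mapping is only approximate ASCII.

```python
HEADER = b"\xff\xff\xff\xff"  # assumed McPoodle raw header, not verified against 3.bin


def read_608_pairs(data: bytes):
    """Yield (b1, b2) byte pairs with the assumed parity bit stripped."""
    if data.startswith(HEADER):
        data = data[len(HEADER):]
    for i in range(0, len(data) - 1, 2):
        yield data[i] & 0x7F, data[i + 1] & 0x7F


def summarize(data: bytes):
    """Count padding, control-code, and text pairs, and collect the text."""
    counts = {"padding": 0, "control": 0, "text": 0}
    text = []
    for b1, b2 in read_608_pairs(data):
        if b1 == 0 and b2 == 0:
            counts["padding"] += 1      # null pair, no caption data
        elif 0x10 <= b1 <= 0x1F:
            counts["control"] += 1      # 608 control codes start in this range
        elif b1 >= 0x20:
            counts["text"] += 1         # printable pair (approximate ASCII)
            text.append(chr(b1))
            if b2 >= 0x20:
                text.append(chr(b2))
        # anything else is ignored in this sketch
    return counts, "".join(text)
```

Calling `summarize(open("3.bin", "rb").read())` and seeing readable caption text alongside control pairs would be consistent with a 608 payload, though it would not by itself explain the MCC error.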

P.S. I have also tried this with the 0.94 release version. No difference.


As a second test, I took an entirely different approach: I muxed an SRT into an MP4 using ffmpeg. VLC displays these captions without trouble. When the file is fed to ccextractor, the captions appear with -stdout, but with MCC output the result is a zero-length MCC file.

ffmpeg -i HyperDeck_0002.mp4 -t 15 -i 0.srt -t 15 -c:v libx264 -profile:v main -b:v 53000k -pix_fmt yuv420p -c:s mov_text -metadata:s:s:0 language=eng HyperDeck_0004srt.mp4


[user@p5810r hyperdeck]# ccextractor -debug HyperDeck_0004srt.mp4 -out=mcc -o HyperDeck_0004srt.mcc
CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: Files (1): HyperDeck_0004srt.mp4
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: Yes] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .mcc] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: HyperDeck_0004srt.mp4
Detected MP4 box with name: ftyp
Detected MP4 box with name: free
Detected MP4 box with name: mdat
File seems to be a MP4
Analyzing data with GPAC (MP4 library)
Opening 'HyperDeck_0004srt.mp4': ok
Track 1, type=vide subtype=avc1
Track 2, type=soun subtype=MPEG
Track 3, type=sbtl subtype=tx3g
MP4: found 3 tracks: 1 avc and 1 cc
Processing track 1, type=vide subtype=avc1
Processing track 2, type=soun subtype=MPEG
Processing track 3, type=sbtl subtype=tx3g
100%  |  00:17
Closing media: ok
Found 1 AVC track(s). Found 1 CC track(s).


Total frames time:        00:00:00:000  (0 frames at 29.97fps)

Min PTS:                                00:00:00:000
Max PTS:                                00:00:17:720
Length:                          00:00:17:720
Done, processing time = 0 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues
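To make the "works as SRT, empty as MCC" comparison repeatable, the runs above could be scripted. This is a minimal sketch that uses only the ccextractor flags already shown in this report (`-out=`, `-o`, plus any extra arguments passed through). The helper names and the `.check.<format>` output naming are hypothetical.

```python
import os
import subprocess


def build_command(input_path, out_format, output_path, extra_args=()):
    """Build a ccextractor argument list from flags shown in this report."""
    return ["ccextractor", *extra_args, input_path,
            f"-out={out_format}", "-o", output_path]


def output_size(path):
    """Return the output file size in bytes, or None if it was never created."""
    try:
        return os.path.getsize(path)
    except OSError:
        return None


def compare_formats(input_path, formats=("srt", "mcc"), extra_args=(),
                    run=subprocess.run):
    """Run ccextractor once per format and report each output file's size."""
    results = {}
    stem = os.path.splitext(input_path)[0]
    for fmt in formats:
        output_path = f"{stem}.check.{fmt}"  # hypothetical naming scheme
        run(build_command(input_path, fmt, output_path, extra_args), check=False)
        results[fmt] = output_size(output_path)
    return results
```

For example, `compare_formats("HyperDeck_0004srt.mp4", extra_args=("-debug",))` would return a size per format, so a zero or `None` entry for `mcc` next to a non-zero `srt` entry reproduces the symptom described here.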


PyCoder040 commented Jun 7, 2023

UPDATE (much closer, but still not working)

Using libcaption (https://github.com/szatmary/libcaption), which injects 608/708 captions into FLV files, I was able to get a non-zero-length MCC file from the same source files as above. I have yet to figure out why, though. The file looks valid, and on playback the captions are present, but the timing is wrong: every caption stays up for exactly one frame before the next one replaces it, far too fast to read. By the way, VLC plays 2.flv and 3.mp4 just fine, with captions shown at the correct speed.

ffmpeg -y -i HyperDeck_0002.mp4 -codec copy -f flv 1.flv
./flv+srt 1.flv HyperDeck_0002.srt > 2.flv
ffmpeg -y -i 2.flv -codec copy 3.mp4
ccextractor 3.mp4 -out=mcc -o 4.mcc

The too-fast playback can be confirmed by having ffmpeg convert the MCC to SRT. The resulting SRT shows all times as just a counter plus 33 milliseconds (every caption is output on the very next frame of video). In all cases, simply changing to -out=srt proves that the input files are being read and the timing is understood. Only the MCC output seems to be broken.

ffmpeg -i HyperDeck_0004scc.mcc zzz.srt

1
00:00:01,104 --> 00:00:03,379
after the off on this event.

2
00:00:03,390 --> 00:00:03,845
Okay. and.

3
00:00:03,856 --> 00:00:05,934
>> But unsolicited Talk about

4
00:00:05,968 --> 00:00:07,349
passwords are hey, summer is
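The one-frame timing claim can be measured rather than eyeballed by computing each cue's duration from the converted SRT. The sketch below assumes standard SubRip timing lines (`HH:MM:SS,mmm --> HH:MM:SS,mmm`). The function names and the 50 ms cutoff (roughly one and a half frames at 29.97 fps) are illustrative choices, not part of ccextractor or ffmpeg.

```python
import re

# Matches a SubRip timing line such as "00:00:01,104 --> 00:00:03,379".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to integer milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def cue_durations(srt_text):
    """Return (start_ms, end_ms, duration_ms) for every timing line found."""
    cues = []
    for match in TIMING.finditer(srt_text):
        fields = match.groups()
        start, end = to_ms(*fields[:4]), to_ms(*fields[4:])
        cues.append((start, end, end - start))
    return cues


def flash_cues(srt_text, max_ms=50):
    """Return cues displayed for at most max_ms (hypothetical cutoff)."""
    return [cue for cue in cue_durations(srt_text) if cue[2] <= max_ms]
```

Running `flash_cues(open("zzz.srt").read())` on the file produced by the ffmpeg command above would list every cue that is on screen for about a frame. Comparing that against the same check on the `-out=srt` output would show whether the timing is lost specifically in the MCC path.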

