Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid NAL unit size #4

Open
xyleey opened this issue May 8, 2019 · 2 comments
Open

Invalid NAL unit size #4

xyleey opened this issue May 8, 2019 · 2 comments

Comments

@xyleey
Copy link

xyleey commented May 8, 2019

See 20190508袁雨桢Pocket live VOD, only available stream addr via 浪客:
http://lk-vod.lvb.eastmoney.com/4697b79evodcq1252033264/1c19f9625285890788896085733/playlist.m3u8?t=5eb58289&exper=0&us=0f10127396&sign=415428f9afe33ebb221e2b945bb1f24a

While using caterpillar to download it, either -m 0 (leave blank as default )or -m 1 bring many "Invalid NAL size unit" error as shown below:
image
File duration is exactly the same but if u play that local file u ill see frames stuck everywhere.

However if u stick the addr onto potplay and play it online, it goes very well. I've just tried ffmpeg to download this VOD using the same address via a really general command "ffmpeg -i ... -c copy xxx.ts". Then the file successfully plays as well.

Is it another kind of limitation of caterpillar and if so, any solution internally via caterpillar itself?

Thanks.
debug.txt

@zmwangx
Copy link
Owner

zmwangx commented May 9, 2019

I can reproduce everything you said (multiple FFmpeg versions, including 4.1, 3.4, 2.8).

I've just tried ffmpeg to download this VOD using the same address via a really general command "ffmpeg -i ... -c copy xxx.ts". Then the file successfully plays as well.

Yes, the key observation here is that saving to an MPEG-TS container is fine. If you next try to remux into an MP4 container in stream copy mode, you'll end up with a file that has the exact same problem.

I looked into the issue (FFmpeg reference [1]) and it's not clear what's causing it. A minimal base64-encoded sample is attached [2], and to reproduce the issue,

$ curl -L https://github.com/zmwangx/caterpillar/files/3159863/invalid-nal-unit-size-sample.ts.base64.txt | base64 -d >invalid-nal-unit-size-sample.ts
$ ffmpeg -i invalid-nal-unit-size-sample.ts -c copy invalid-nal-unit-size-sample.mp4
$ ffprobe invalid-nal-unit-size-sample.mp4
[h264 @ 0x7f9298017800] Invalid NAL unit size (0 > 1238).
[h264 @ 0x7f9298017800] Error splitting the input into NAL units.
[h264 @ 0x7f9298017800] Invalid NAL unit size (0 > 14).
[h264 @ 0x7f9298017800] Error splitting the input into NAL units.
...

I tried looking into the bitstream to no avail since H.264 is simply too complicated for untrained eyes fixated on a generic hex editor. The only comment I can make is that MPEG-TS, being a format explicitly designed for lossy transport, is more tolerant in many cases. Lacking a proper solution, probably the only thing I can do is to give more choices.

Funny thing is, I switched from TS to MP4 as the intermediate container format about a year ago, in 8751cc0, citing "This solves some weird problems." IIRC said weird problems were related to merging intermediate TS files into a final MP4; they were not about MP4 somehow more tolerant than TS (but it very well could be, in some other corner cases I've yet to encounter; multimedia encoding and delivery is messy on so many levels).

Anyway, what I'm going to do is to offer a new option, exposing the choice of intermediate container format. This way, you can at least get something playable with caterpillar out of this stream, when you use TS as the intermediate format. The final hop from TS to MP4, if necessary, is out of my hands, though.

[1] https://github.com/FFmpeg/FFmpeg/blob/48539b62fcbd994723a71cbc1750476cbee8eaac/libavcodec/h2645_parse.h#L112-L130
[2] https://github.com/zmwangx/caterpillar/files/3159863/invalid-nal-unit-size-sample.ts.base64.txt

@zmwangx zmwangx changed the title Invalid NAL size unit Invalid NAL unit size May 9, 2019
@zmwangx
Copy link
Owner

zmwangx commented May 9, 2019

Quick update: I was busy and didn't have time to work on this. In the meantime, here is a dirty patch (not final):

diff --git a/src/caterpillar/caterpillar.py b/src/caterpillar/caterpillar.py
index a010214..08455bc 100755
--- a/src/caterpillar/caterpillar.py
+++ b/src/caterpillar/caterpillar.py
@@ -287,6 +287,7 @@ def process_entry(
     wipe: bool = False,
     keep: bool = False,
     jobs: int = None,
+    intermediate_format: str = "mp4",
     concat_method: str = "concat_demuxer",
     retries: int = 0,
     progress: bool = True,
@@ -376,7 +377,10 @@ def process_entry(
             ):
                 raise RuntimeError("failed to download some segments")
             merge.incremental_merge(
-                local_m3u8_file, merge_dest, concat_method=concat_method
+                local_m3u8_file,
+                merge_dest,
+                concat_method=concat_method,
+                intermediate_format=intermediate_format,
             )
             if output != merge_dest:
                 try:
@@ -519,6 +523,16 @@ def main() -> int:
         action="store_true",
         help="overwrite the output file if it already exists",
     )
+    add(
+        "-I",
+        "--intermediate-format",
+        default="mp4",
+        help="""intermediate container format to use for coherent segments,
+        offered to overcome thorny issues with the default in certain edge
+        cases; should be an extension like 'mp4' or 'ts', and defaults to
+        'mp4'
+        """,
+    )
     add(
         "-j",
         "--jobs",
@@ -665,6 +679,7 @@ def main() -> int:
         wipe=args.wipe,
         keep=args.keep,
         jobs=args.jobs,
+        intermediate_format=args.intermediate_format,
         concat_method=args.concat_method,
         retries=args.retries,
         progress=progress,
diff --git a/src/caterpillar/merge.py b/src/caterpillar/merge.py
index df31960..2d57a24 100644
--- a/src/caterpillar/merge.py
+++ b/src/caterpillar/merge.py
@@ -165,7 +165,10 @@ def split_m3u8(
 # [1] https://ffmpeg.org/ffmpeg-all.html#concat-1
 # [2] https://ffmpeg.org/ffmpeg-all.html#concat-2
 def incremental_merge(
-    m3u8_file: pathlib.Path, output: pathlib.Path, concat_method: str = "concat_demuxer"
+    m3u8_file: pathlib.Path,
+    output: pathlib.Path,
+    concat_method: str = "concat_demuxer",
+    intermediate_format: str = "mp4",
 ):
     # Resolve output so that we don't write to a different relative path
     # later when we run FFmpeg from a different pwd.
@@ -179,7 +182,7 @@ def incremental_merge(
     intermediate_dir.mkdir(exist_ok=True)
 
     while True:
-        merge_dest = intermediate_dir / f"{playlist_index}.mp4"
+        merge_dest = intermediate_dir / f"{playlist_index}.{intermediate_format}"
         split_point = attempt_merge(playlist, merge_dest)
         if not split_point:
             break
@@ -193,7 +196,7 @@ def incremental_merge(
         if concat_method == "concat_demuxer":
             with open("concat.txt", "w", encoding="utf-8") as fp:
                 for index in range(1, playlist_index + 1):
-                    print(f"file {index}.mp4", file=fp)
+                    print(f"file {index}.{intermediate_format}", file=fp)
 
             command = [
                 "ffmpeg",
@@ -206,8 +209,8 @@ def incremental_merge(
                 "concat.txt",
                 "-c",
                 "copy",
-                "-bsf:a",
-                "aac_adtstoasc",
+                # "-bsf:a",
+                # "aac_adtstoasc",
                 "-movflags",
                 "faststart",
                 "-y",
@@ -215,7 +218,7 @@ def incremental_merge(
             ]
         elif concat_method == "concat_protocol":
             ffmpeg_input = "concat:" + "|".join(
-                f"{i}.mp4" for i in range(1, playlist_index + 1)
+                f"{i}.{intermediate_format}" for i in range(1, playlist_index + 1)
             )
             command = [
                 "ffmpeg",
@@ -226,8 +229,8 @@ def incremental_merge(
                 ffmpeg_input,
                 "-c",
                 "copy",
-                "-bsf:a",
-                "aac_adtstoasc",
+                # "-bsf:a",
+                # "aac_adtstoasc",
                 "-movflags",
                 "faststart",
                 "-y",

With this patch applied,

$ caterpillar --intermediate-format ts 'http://lk-vod.lvb.eastmoney.com/4697b79evodcq1252033264/1c19f9625285890788896085733/playlist.m3u8?t=5eb58289&exper=0&us=0f10127396&sign=415428f9afe33ebb221e2b945bb1f24a' output.ts

should give you a working MPEG-TS file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants