certain lengthy youtube videos take a long time to initialise #32692

Closed
3 of 5 tasks
wellsyw opened this issue Jan 13, 2024 · 13 comments

@wellsyw

wellsyw commented Jan 13, 2024

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2021.12.17
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', '--simulate', '8SIiGo3TVKE']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python 3.10.13 (PyPy version 7.3.14 AMD64 64bit) - Windows-8.1-6.3.9600-SP0 - OpenSSL 1.1.1t  7 Feb 2023
[debug] exe versions: none
[debug] Proxy map: {}
[youtube] 8SIiGo3TVKE: Downloading webpage
[debug] [youtube] Decrypted nsig 45tbpgX-M8RD3WRrq => JiFeLd6kfppZFg
[debug] [youtube] Decrypted nsig t9SpsqJ7CqdW7NVV0 => hVl3_KSCH26XkA
[debug] Default format spec: bestvideo+bestaudio/best

Description

Most youtube videos take about 13 seconds of runtime for me, with --simulate. Some videos take 20 or 28 seconds. Other videos take 80 seconds. This depends on the video and is reproducible.

Sample video with such behaviour: 8SIiGo3TVKE

youtube-dl pauses for a looong time after:
[debug] [youtube] Decrypted nsig 45tbpgX-M8RD3WRrq => JiFeLd6kfppZFg

CPU is pegged during the pause and the total runtime is 80 seconds (!). Memory use is high as well: it rises slowly to ~218 MB (as opposed to ~50 MB normally)

The download works, of course, but the time it spends on initialisation is excessive. As far as I can tell, it is not caused by the n-sig decryption, but by something that happens in between.

The only thing I've found in common with these videos is that they're long and have HD video (1080p or more).

Above verbose output was taken with PyPy 3.10, and it also happens with PyPy 2.7 as well as CPython 2.7.

So if somebody feels like profiling some code, there's something to profile here...
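
For anyone taking up that invitation, a minimal profiling sketch in Python 3 might look like the following. It uses only the standard library's cProfile and pstats modules; the helper name profile_call is made up for illustration and is not part of youtube-dl.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text).

    profile_call is a hypothetical helper, not a youtube-dl API.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Render the 15 most expensive entries, sorted by cumulative time.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats('cumulative').print_stats(15)
    return result, buf.getvalue()
```

Assuming youtube-dl is importable, the slow case could then be profiled with something like `profile_call(lambda: youtube_dl.YoutubeDL({'simulate': True}).extract_info('8SIiGo3TVKE', download=False))`; the cumulative-time column should point at whatever runs during the long pause.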

@dirkf
Contributor

dirkf commented Jan 14, 2024

With this low-spec 2-CPU notebook, the extraction for 8SIiGo3TVKE takes ~210s with one CPU close to maxed and VSZ up to ~250MB. With no n-sig processing yt-dl (Py2.7) is faster than yt-dlp (Py 3.11) at < 11s.

A trivial trace of the n-sig processing seems to be showing an increasing delay between applying the cached n-sig to each format.

We should unthrottle the DASH formats before build_fragments(): then 210s becomes 30s. "Bad programmer, BAD!"
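
To illustrate why the ordering matters, here is a rough Python 3 sketch, not the actual youtube-dl code. It assumes a DASH format is split into range-limited fragment URLs and that "unthrottling" means rewriting the `n` query parameter; every name apart from build_fragments is hypothetical, and the chunk size is an arbitrary placeholder. Rewriting the format URL once before chunking does one transform per format, while rewriting afterwards does one per fragment, which adds up for long HD videos.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

CHUNK_SIZE = 10 << 20  # placeholder value, chosen only for this sketch


def set_query_param(url, key, value):
    """Return url with query parameter key set to value."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query[key] = [value]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def build_fragments(url, filesize, chunk_size=CHUNK_SIZE):
    """Split one format URL into range-limited fragment URLs."""
    return [
        {'url': set_query_param(
            url, 'range',
            '%d-%d' % (start, min(start + chunk_size, filesize) - 1))}
        for start in range(0, filesize, chunk_size)
    ]


def unthrottle(url, decrypt_n):
    """Replace the n parameter, if present, with its transformed value."""
    n = parse_qs(urlsplit(url).query).get('n')
    if not n:
        return url
    return set_query_param(url, 'n', decrypt_n(n[0]))


def fragments_slow(url, filesize, decrypt_n, chunk_size=CHUNK_SIZE):
    # Chunk first, then rewrite every fragment URL: one transform per fragment.
    fragments = build_fragments(url, filesize, chunk_size)
    for fragment in fragments:
        fragment['url'] = unthrottle(fragment['url'], decrypt_n)
    return fragments


def fragments_fast(url, filesize, decrypt_n, chunk_size=CHUNK_SIZE):
    # Rewrite the format URL once, then chunk: one transform per format.
    return build_fragments(unthrottle(url, decrypt_n), filesize, chunk_size)
```

Both functions produce the same fragment URLs; only the number of URL rewrites differs.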

dirkf added a commit to dirkf/youtube-dl that referenced this issue Jan 15, 2024
* apply n-sig before chunked fragments, fixes ytdl-org#32692
@dirkf
Contributor

dirkf commented Jan 16, 2024

With PR #32695, incorporating the above commit:
yt-dl/Py2.7 ~5.5s
yt-dl/Py3.11 <4s
yt-dlp/Py3.11 ~7s

@dirkf
Contributor

dirkf commented Jan 18, 2024

Thanks for the test. Yes indeed, it's much faster when it doesn't actually calculate the n-sig. Now I'm getting something like these figures:
yt-dlp --extractor-args 'youtube:player-client=web' -> ~8s
yt-dl/Py2.7 -> ~22s (still faster than above)
yt-dl/Py3.11 -> ~13s.

I guess there is some shim (or shims) making 2.7 so much slower.

dirkf added a commit to dirkf/youtube-dl that referenced this issue Jan 20, 2024
* apply n-sig before chunked fragments, fixes ytdl-org#32692
@dirkf
Contributor

dirkf commented Jan 21, 2024

You could test the fixed PR #32695. This version seems to extract from YT at least as fast as yt-dlp with Python 3.11 but a miniconda 2.7.15 is up to 3x slower.

@dirkf
Contributor

dirkf commented Jan 21, 2024

Similarly with a distro that provides 2.7.18 (but a beefier laptop):
yt-dlp --extractor-args 'youtube:player-client=web'/Py3.9 -> ~5s
yt-dl/Py2.7 -> ~6s
yt-dl/Py3.9 -> ~4s

Now I would say that factors other than the extractor and jsinterp code (e.g., Python build, Celeron vs Core 2 vs whatever) are dominating.

@3052

3052 commented Jan 21, 2024

does YouTube-DL have an option to change the client? using the ANDROID client, I am getting this:

TotalMilliseconds : 227.3747

most of which is probably just spent on waiting for the JSON response from the server

@dirkf
Contributor

dirkf commented Jan 21, 2024

You can choose the client with yt-dlp using --extractor-args 'youtube:player-client={client}' (as shown above). Choosing android vs web on this system gives ~2s vs ~4s in the same case as above. As well as interrogating the server, the result has to be interpreted, the info-json generated, and the format(s) selected; and then (in the test, where I'm using -g to check that the n parameter was processed properly) the media URL(s) have to be displayed.
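
A comparison like this could be scripted roughly as follows. This is only a sketch: it assumes a `yt-dlp` executable on the PATH, uses the -g and --extractor-args options quoted above, and the helper names are invented for illustration. Wall-clock timings taken this way include process start-up and network latency, so they are only comparable on the same machine and connection.

```python
import subprocess
import time


def build_command(video_id, client, exe='yt-dlp'):
    """Build the command line for one client (exe name is an assumption)."""
    return [exe, '-g', '--extractor-args',
            'youtube:player-client=%s' % client, video_id]


def time_command(cmd):
    """Run cmd, discarding its output; return (elapsed_seconds, returncode)."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, proc.returncode


def compare_clients(video_id, clients=('web', 'android'), exe='yt-dlp'):
    """Time the same extraction once per client."""
    return {client: time_command(build_command(video_id, client, exe))
            for client in clients}
```

For example, `compare_clients('8SIiGo3TVKE')` would return a dict mapping each client name to its elapsed time and exit status.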

@3052

3052 commented Jan 21, 2024

if you're using the ANDROID client, you don't need to check the n parameter, at least that's my understanding.

@dirkf
Contributor

dirkf commented Jan 21, 2024

That's right, but I only want to vary one parameter at a time.

yt-dl will use a different client for age-gate bypass, but otherwise 'web' is hard-wired.

@3052

3052 commented Jan 21, 2024

not using the ANDROID client comes with an extremely significant drawback, as you well know. so by adding this arbitrary restriction on the client used by the tool, you are painting yourself into a corner. As long as you understand that fact, there is nothing more I can add, I suppose.

@dirkf
Contributor

dirkf commented Jan 21, 2024

The rationale for selecting the web client, as I understand it, is that a wider range of formats are available.

There's nothing to stop someone who cares from offering one or more PRs to back-port (or reimplement) the --extractor-args ... feature from yt-dlp and then to support selection of the YT client in the YT extractor.
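
As a possible starting point for such a reimplementation, here is a rough Python sketch of parsing a spec like 'youtube:player-client=web'. It assumes the yt-dlp convention of IE_KEY:ARG=VAL1,VAL2 with ';' between arguments, and it assumes that '-' and '_' in argument names should be treated as equivalent. The function name is hypothetical and this is not code from either project.

```python
def parse_extractor_args(spec):
    """Parse 'IE_KEY:ARG=VAL1,VAL2;ARG2=VAL' into (ie_key, {arg: [values]}).

    Hypothetical helper; the normalisation rules here are assumptions.
    """
    ie_key, sep, arg_str = spec.partition(':')
    if not sep or not ie_key.strip():
        raise ValueError('expected IE_KEY:ARGS, got %r' % (spec,))

    args = {}
    for item in arg_str.split(';'):
        if not item.strip():
            continue  # tolerate empty segments such as a trailing ';'
        key, _, vals = item.partition('=')
        # Assumption: 'player-client' and 'player_client' name the same arg.
        name = key.strip().lower().replace('-', '_')
        args[name] = [v.strip() for v in vals.split(',')]
    return ie_key.strip().lower(), args
```

The YT extractor could then look up `player_client` in the parsed dict and fall back to 'web' when it is absent, which would keep the current behaviour as the default.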

@dirkf dirkf closed this as completed in f8b0135 Jan 22, 2024
github-actions bot added a commit to hellopony/youtube-dl that referenced this issue Jan 22, 2024
* https://github.com/ytdl-org/youtube-dl:
  [YouTube] Fix `like_count` extraction using `likeButtonViewModel` * also fix various tests * TODO: check against yt-dlp tests
  [YouTube] Rework n-sig processing, realigning with yt-dlp * apply n-sig before chunked fragments, fixes ytdl-org#32692
  [InfoExtractor] Support some warning and `._downloader` shortcut methods from yt-dlp
  [compat] Rework compat for `method` parameter of `compat_urllib_request.Request` constructor * fixes ytdl-org#32573 * does not break `utils.HEADrequest` (eg)
@wellsyw
Author

wellsyw commented Jan 23, 2024

Appears to take 9-10 seconds regardless of video now, so that's a good improvement.

@3052

3052 commented Jan 23, 2024

LOL

@dirkf dirkf added the fixed label Feb 6, 2024