[PornHub] Rename the age verification cookie #31916

Closed
wants to merge 8 commits into from

arobase-che
Contributor

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

It only renames the legal age verification cookie.

@dirkf
Contributor

dirkf commented Mar 25, 2023

Does this solve an actual issue? I assume so, so there should either be a new or modified test for it, or just post verbose logs showing before and after.

@arobase-che
Contributor Author

Sorry, I didn't see your comment.
Yes, it does solve an issue: the name of the age verification cookie is now site-dependent.
I couldn't test the premium version, but I have fixed it for an alternative website supported by this extractor.

Here is the verbose output for two videos, with and without this patch:

1 - Without:

$ youtube-dl --verbose "https://www.thumbzilla.com/video/ph62ac7b29e5a39/a-girl-s-perspective-part-2-trailer"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.thumbzilla.com/video/ph62ac7b29e5a39/a-girl-s-perspective-part-2-trailer']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.10.10 (CPython) - Linux-6.2.8-arch1-1-x86_64-with-glibc2.37
[debug] exe versions: ffmpeg 6.0, ffprobe 6.0, rtmpdump 2.4
[debug] Proxy map: {}
[PornHub] ph62ac7b29e5a39: Downloading pc webpage
[PornHub] ph62ac7b29e5a39: Downloading tv webpage
ERROR: Unable to extract encoded url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/youtube_dl/YoutubeDL.py", line 819, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python3.10/site-packages/youtube_dl/YoutubeDL.py", line 840, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/common.py", line 535, in extract
    ie_result = self._real_extract(url)
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/pornhub.py", line 400, in _real_extract
    js_vars = extract_js_vars(
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/pornhub.py", line 337, in extract_js_vars
    assignments = self._search_regex(
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/common.py", line 1013, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract encoded url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

1 - With:

$ py youtube_dl/__main__.py --verbose "https://www.thumbzilla.com/video/ph62ac7b29e5a39/a-girl-s-perspective-part-2-trailer"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.thumbzilla.com/video/ph62ac7b29e5a39/a-girl-s-perspective-part-2-trailer']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: 0b786108b
[debug] Python version 3.10.10 (CPython) - Linux-6.2.8-arch1-1-x86_64-with-glibc2.37
[debug] exe versions: ffmpeg 6.0, ffprobe 6.0, rtmpdump 2.4
[debug] Proxy map: {}
[PornHub] ph62ac7b29e5a39: Downloading pc webpage
[PornHub] ph62ac7b29e5a39: Downloading m3u8 information
[PornHub] ph62ac7b29e5a39: Downloading m3u8 information
[PornHub] ph62ac7b29e5a39: Downloading m3u8 information
[PornHub] ph62ac7b29e5a39: Downloading m3u8 information
[PornHub] ph62ac7b29e5a39: Downloading JSON metadata
WARNING: unable to extract view count; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://ev-h.phncdn.com/hls/videos/202206/17/410097631/,1080P_4000K,720P_4000K,480P_2000K,240P_1000K,_410097631.mp4.urlset/index-f1-v1-a1.m3u8?validfrom=1680058002&validto=1680065202&ipa=86.202.209.61&hdl=-1&hash=7vVYn8Prm%2FHCIzP2CN1Q89mZa78%3D'
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 10
[download] Destination: A Girl's Perspective Part 2 - TRAILER-ph62ac7b29e5a39.mp4
[download] 100% of 5.90MiB in 00:01
[debug] ffmpeg command line: ffprobe -show_streams 'file:A Girl'"'"'s Perspective Part 2 - TRAILER-ph62ac7b29e5a39.mp4'
[ffmpeg] Fixing malformed AAC bitstream in "A Girl's Perspective Part 2 - TRAILER-ph62ac7b29e5a39.mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:A Girl'"'"'s Perspective Part 2 - TRAILER-ph62ac7b29e5a39.mp4' -c copy -f mp4 -bsf:a aac_adtstoasc 'file:A Girl'"'"'s Perspective Part 2 - TRAILER-ph62ac7b29e5a39.temp.mp4'

2 - Without:

$ youtube-dl --verbose "https://www.pornhub.com/view_video.php?viewkey=ph61a74c98caef3"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.pornhub.com/view_video.php?viewkey=ph61a74c98caef3']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.10.10 (CPython) - Linux-6.2.8-arch1-1-x86_64-with-glibc2.37
[debug] exe versions: ffmpeg 6.0, ffprobe 6.0, rtmpdump 2.4
[debug] Proxy map: {}
[PornHub] ph61a74c98caef3: Downloading pc webpage
[PornHub] ph61a74c98caef3: Downloading tv webpage
ERROR: Unable to extract encoded url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/youtube_dl/YoutubeDL.py", line 819, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python3.10/site-packages/youtube_dl/YoutubeDL.py", line 840, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/common.py", line 535, in extract
    ie_result = self._real_extract(url)
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/pornhub.py", line 400, in _real_extract
    js_vars = extract_js_vars(
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/pornhub.py", line 337, in extract_js_vars
    assignments = self._search_regex(
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/common.py", line 1013, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract encoded url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

2 - With:

$ py youtube_dl/__main__.py --verbose 'https://www.pornhub.com/view_video.php?viewkey=ph61a74c98caef3'
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.pornhub.com/view_video.php?viewkey=ph61a74c98caef3']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: 0b786108b
[debug] Python version 3.10.10 (CPython) - Linux-6.2.8-arch1-1-x86_64-with-glibc2.37
[debug] exe versions: ffmpeg 6.0, ffprobe 6.0, rtmpdump 2.4
[debug] Proxy map: {}
[PornHub] ph61a74c98caef3: Downloading pc webpage
[PornHub] ph61a74c98caef3: Downloading m3u8 information
[PornHub] ph61a74c98caef3: Downloading m3u8 information
[PornHub] ph61a74c98caef3: Downloading m3u8 information
[PornHub] ph61a74c98caef3: Downloading JSON metadata
WARNING: unable to extract view count; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://cv-h.phncdn.com/hls/videos/202112/01/398949561/,720P_4000K,480P_2000K,240P_1000K,_398949561.mp4.urlset/index-f1-v1-a1.m3u8?v098K-Ule78xPnJI1a2j0bOy7sWumUk5TbFxBwr-2UVDc28Y0C18q3F3WlFQmGElOSWLu_akFFaKcFkqb67w4T5fp6dhwhfG9WAac49nhA-asHT726wxzrmRh5sReuRL4Bm-jgkz8dx12qrzZBJnuRcsr3uIPubYF3qmI-2lM4wxG3zgJmvyC25WrvFBOZ6VLGt_gS8OcA'
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 126
[download] Destination: アニメ声 量産型にオモチャ責め-ph61a74c98caef3.mp4
[download] 100% of 109.46MiB in 00:18
[debug] ffmpeg command line: ffprobe -show_streams 'file:アニメ声 量産型にオモチャ責め-ph61a74c98caef3.mp4'
[ffmpeg] Fixing malformed AAC bitstream in "アニメ声 量産型にオモチャ責め-ph61a74c98caef3.mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:アニメ声 量産型にオモチャ責め-ph61a74c98caef3.mp4' -c copy -f mp4 -bsf:a aac_adtstoasc 'file:アニメ声 量産型にオモチャ責め-ph61a74c98caef3.temp.mp4'

The tests fail because of a view count problem. I didn't investigate, since it is not directly related to this PR.

@dirkf
Contributor

dirkf commented Mar 29, 2023

Are you also seeing yt-dlp/yt-dlp#4299?

@arobase-che
Contributor Author

That's curious. I didn't check yt-dlp, as I assumed they would port this fix from here.

But I suspect a geo-location issue, as it mentions a French ISP. Since I'm also in France, I will try from another country.
I will keep you updated.

@arobase-che
Contributor Author

OK. So the behavior does depend on geo-location.
My patch fixes the extractor for me, but it may break it for someone else.

Is there an existing location-aware mechanism that we could reuse to fix this?

@dirkf
Contributor

dirkf commented Mar 29, 2023

What happens if you set both the original and the ...XX cookies?

@arobase-che
Contributor Author

It just works. ^^
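
A minimal sketch of the "set both cookies" idea, written against the Python standard library rather than the extractor itself. Only `age_verified` comes from this thread; the other names in `AGE_COOKIE_NAMES` are hypothetical placeholders for the region-specific variants (the real names are in the linked yt-dlp discussion), and `make_cookie` and `set_age_cookies` are illustrative helpers, not youtube-dl functions.

```python
from http.cookiejar import Cookie, CookieJar

# 'age_verified' is the original cookie name from this thread.
# The other two names are HYPOTHETICAL placeholders for region-specific variants.
AGE_COOKIE_NAMES = ('age_verified', 'example_age_cookie_AA', 'example_age_cookie_BB')


def make_cookie(domain, name, value):
    """Build a session cookie valid for every path on `domain`."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith('.'),
        path='/', path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def set_age_cookies(jar, host, names=AGE_COOKIE_NAMES):
    """Set every known age-verification cookie, so the request carries
    whichever one the site checks for the visitor's location."""
    for name in names:
        jar.set_cookie(make_cookie(host, name, '1'))


jar = CookieJar()
set_age_cookies(jar, 'www.example.com')
print(sorted(cookie.name for cookie in jar))
```

Setting all candidates at once avoids having to detect the visitor's location; the assumption is that the site ignores cookies it does not recognise.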

self._set_cookie(host, 'age_verified', '1')

if 'thumbzilla.com' in host:
Contributor


Let's try the version from yt-dlp/yt-dlp#4299 (comment).

Contributor Author


I left my comment explaining why we set multiple cookies. ^^
Feel free to edit it.

@lenuxfrance

Hello, how can I modify my file?
What is the file path on Linux?
Thanks.

@archusXIV

archusXIV commented Apr 3, 2023

@lenuxfrance
On Arch Linux, edit the file as root: /usr/lib/python3.10/site-packages/yt_dlp/extractor/pornhub.py
lines 272 and 273
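
The path above is specific to one Arch Linux install. As a hedged sketch, the file location on other systems can be looked up with the standard library's `importlib.util.find_spec`; `module_path` is an illustrative helper, and the line numbers quoted above may differ between versions.

```python
import importlib.util


def module_path(module_name):
    """Return the file a module would be loaded from, or None if it is not installed."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. 'yt_dlp') is missing.
        return None
    return spec.origin if spec is not None else None


# Check both projects; whichever is not installed prints None.
for name in ('yt_dlp.extractor.pornhub', 'youtube_dl.extractor.pornhub'):
    print(name, '->', module_path(name))
```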

@arobase-che
Contributor Author

Can it be merged, or does it need some modification?

@dirkf dirkf changed the title Rename the age verification cookie [PornHub] Rename the age verification cookie Apr 7, 2023
@dirkf
Contributor

dirkf commented Apr 7, 2023

From the discussion in the yt-dlp thread, the 3 new code lines should be moved to the beginning of the _login() method, after the test for already being logged in. This should fix the equivalent problem for playlists too.
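
The ordering described here can be sketched with a toy class. This is not the real extractor: `AgeGateSketch`, its `_logged_in` flag, and the `login_attempts` counter are assumptions made for illustration, and the actual login request is omitted.

```python
class AgeGateSketch:
    """Toy stand-in for an extractor; not the real youtube-dl class."""

    def __init__(self):
        self._logged_in = False
        self.cookies = {}        # (host, name) -> value
        self.login_attempts = 0

    def _set_cookie(self, host, name, value):
        self.cookies[(host, name)] = value

    def _login(self, host):
        # 1. Nothing to do if a previous call already logged in.
        if self._logged_in:
            return
        # 2. Set the age-verification cookie(s) before any request, so that
        #    every caller of _login() (video, playlist) gets the same setup.
        self._set_cookie(host, 'age_verified', '1')
        # 3. The real login request would follow here (omitted in this sketch).
        self.login_attempts += 1
        self._logged_in = True


ie = AgeGateSketch()
ie._login('www.example.com')
ie._login('www.example.com')  # second call returns early
print(ie.cookies, ie.login_attempts)
```

Placing the cookie setup in the shared `_login()` path, rather than in the single-video code, is what would let playlist extraction benefit from it as well.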

@arobase-che
Contributor Author

Good idea! I will do it today.

@arobase-che
Contributor Author

I can confirm it works even for playlists and model pages (from a French IP).

@arobase-che
Contributor Author

The TZ cookie doesn't seem to be needed, as stated in the yt-dlp patch.

Can we merge it?

Co-authored-by: bashonly <bashonly@bashonly.com>
Co-authored-by: Noah <nkempers@outlook.de>
@arobase-che
Contributor Author

I adjusted the commit to match the yt-dlp version and added two co-authors, since they contributed at least as much as I did to this PR.

The change covers a new location (Amsterdam?) that uses another cookie name.

It has been merged on the yt-dlp side.

@arobase-che
Contributor Author

Any plans to merge this?

@arobase-che
Contributor Author

Hello?

If I can be of any help, don't hesitate to ask.

@arobase-che
Contributor Author

arobase-che commented May 31, 2023

One month old, no news after multiple comments.
I'm out.

@dirkf dirkf mentioned this pull request Aug 23, 2023
11 tasks