Update captions.py #1860

Open
wants to merge 2 commits into
base: master

Conversation

justSam13

This PR fixes the error mentioned in #1085

@opterix

opterix commented Feb 16, 2024

Excellent!

This change in the captions.py file definitely solved the problem for me:
###########
Python 3.11.8
pytube 15.0.0

@justSam13
Author

Glad I could help!

@opterix

opterix commented Feb 18, 2024

There are still some issues with automatically generated captions.

For instance, this one works quite well:

from pytube import YouTube

video_url = 'https://www.youtube.com/watch?v=gRtjjtBHXxo'
yt = YouTube(video_url)
stream = yt.streams.first()
captions = yt.captions

caption = yt.captions['en']
print(caption.generate_srt_captions())

But this one still has issues:

from pytube import YouTube

video_url = 'https://www.youtube.com/watch?v=gRtjjtBHXxo'
yt = YouTube(video_url)
stream = yt.streams.first()
captions = yt.captions

caption = yt.captions['a.en'] # Automatically generated transcript
print(caption.generate_srt_captions())

@justSam13
Author

justSam13 commented Feb 19, 2024

The issue with auto-generated captions is that their XML format differs from that of the other captions.

For instance, the auto-generated captions have XML like this:
image

while the normal English captions have XML like this:
image

The code in captions.py is written to parse the normal captions, so it cannot parse the auto-generated ones :(
A new function needs to be written to parse those, along with an if condition: if the caption is auto-generated, use the new function; otherwise use the original one.
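
A minimal sketch of that kind of dispatch, not the actual change in this PR. The two XML layouts are assumptions, since the screenshots are not reproduced here: a flat layout with `<text start="..." dur="...">` elements timed in seconds, and a nested layout with `<body><p t="..." d="...">` elements timed in milliseconds and split into `<s>` word segments. The names `srt_time`, `parse_flat`, `parse_nested`, and `xml_to_srt` are hypothetical and are not part of pytube.

```python
import xml.etree.ElementTree as ET
from html import unescape


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def parse_flat(root):
    """Assumed 'normal' layout: <text start dur> elements, times in seconds."""
    for node in root.iter("text"):
        start = float(node.attrib["start"])
        dur = float(node.attrib.get("dur", 0.0))
        yield start, start + dur, node.text or ""


def parse_nested(root):
    """Assumed auto-generated layout: <body><p t d> elements, times in ms."""
    for p in root.iter("p"):
        text = "".join(p.itertext())  # joins the text of nested <s> segments
        if not text.strip():
            continue  # skip paragraphs that carry no text
        start = int(p.attrib["t"]) / 1000
        dur = int(p.attrib.get("d", 0)) / 1000
        yield start, start + dur, text


def xml_to_srt(xml_captions: str) -> str:
    root = ET.fromstring(xml_captions)
    # The "if condition": choose the parser from the shape of the XML.
    parser = parse_nested if root.find("body") is not None else parse_flat
    blocks = []
    for i, (start, end, text) in enumerate(parser(root), start=1):
        line = unescape(" ".join(text.split()))  # collapse whitespace
        blocks.append(f"{i}\n{srt_time(start)} --> {srt_time(end)}\n{line}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hand-written sample inputs for the two assumed layouts.
    flat = '<transcript><text start="0.5" dur="1.5">Hello there</text></transcript>'
    nested = (
        '<timedtext format="3"><body>'
        '<p t="2000" d="1500"><s>auto</s><s t="400"> generated</s></p>'
        "</body></timedtext>"
    )
    print(xml_to_srt(flat))
    print(xml_to_srt(nested))
```

Detecting the layout from the XML itself, rather than from the caption code (such as 'a.en'), is a design choice of this sketch; checking the code would work too if the two layouts map cleanly onto manual and auto-generated tracks.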

Add a bit code to parse auto-generated captions also.
@justSam13
Author

Nvm, two lines of code did the job!
Try it and update me :D

@opterix

opterix commented Feb 22, 2024

Great. It worked nicely.

@justSam13
Author

Glad to know! 😁
