Some websites put meta tags outside the head. #192

Open
paul-rchds opened this issue Apr 13, 2022 · 2 comments

Comments

@paul-rchds

On some pages, meta tags are placed outside the head tag. One example is the YouTube channel page: https://www.youtube.com/c/Freecodecamp

Because the opengraph extractor only looks in the head tag, all the og:* meta properties are missed.
In my fork, I changed the extractor to look in the body instead.

If I get permission, can I open a PR?

Here is a link to where I made the change:

for head in document.xpath('//head'):
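
To make the difference concrete, here is a minimal sketch of head-only versus whole-document lookup. It is not extruct's code: it swaps in the standard library's html.parser instead of lxml/XPath so that it runs without dependencies, and the names OpenGraphMetaCollector, collect_og, and SAMPLE_HTML are hypothetical, invented for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample page: the og:* meta tags sit in <body>, not <head>.
SAMPLE_HTML = """
<html>
  <head><title>Example channel</title></head>
  <body>
    <meta property="og:title" content="Example channel">
    <meta property="og:type" content="profile">
  </body>
</html>
"""


class OpenGraphMetaCollector(HTMLParser):
    """Collects og:* meta tags and records whether each was inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (property, content, was_inside_head)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta":
            attributes = dict(attrs)
            prop = attributes.get("property") or ""
            content = attributes.get("content")
            if prop.startswith("og:") and content is not None:
                self.found.append((prop, content, self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def collect_og(html_text, head_only=False):
    """Return (property, content) pairs; optionally keep only tags in <head>."""
    parser = OpenGraphMetaCollector()
    parser.feed(html_text)
    parser.close()
    return [
        (prop, content)
        for prop, content, was_in_head in parser.found
        if was_in_head or not head_only
    ]


if __name__ == "__main__":
    print("head only:     ", collect_og(SAMPLE_HTML, head_only=True))
    print("whole document:", collect_og(SAMPLE_HTML))
```

With head_only=True the sketch mimics a lookup restricted to the head tag; with the default it scans every meta tag in the document, which is the behaviour proposed here.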

@lopuhin
Member

lopuhin commented Apr 14, 2022

Hi @paul-rchds, yes, that would be great. I noticed the same issue myself but didn't get to implement everything required. Here is a link to PR #129; feel free to start a new one.

@frostrot

I have changed the extract_item function in the OpengraphExtractor class so that it also picks up meta tags outside the head. I have tested it on the link shared by @paul-rchds. Please review my PR to check that it works. Thanks.
