YouTube: creator field sometimes wrong #3155

Open
lhoenig opened this issue Oct 6, 2023 · 2 comments

@lhoenig

lhoenig commented Oct 6, 2023

I noticed that when saving YouTube videos to Zotero on desktop, the Creator field is sometimes populated with the wrong channel name. I traced this to the selectors for the author: #meta-contents #text-container .ytd-channel-name and, as a fallback, #text-container .ytd-channel-name. There are many different matches for these on the page, and one of them seems to get selected at random, but only in certain situations that I have not been able to reliably reproduce. When it happens, reloading the page and saving again picks up the correct channel metadata.

@manu-torres

Also facing the same issue. Wondering if it has anything to do with the anti-adblock changes, since for me the problem began on the same day I started seeing the error pages. Maybe just a coincidence...

@hartman

hartman commented Dec 8, 2023

The proper creator is in the head's metadata as itemprop attributes and in the application/ld+json block. There is a note in the translator at:
https://github.com/zotero/translators/blob/master/YouTube.js#L106
which mentions that this information is not updated when people click from video to video, and that this is why the translator scrapes it from the rendered page.

However, YouTube has changed its page layout, so #meta-contents #text-container .ytd-channel-name no longer matches any content (#meta-contents is basically the metadata area underneath the primary video). The area is still in the page, probably scheduled to be removed once everyone has the 'new' UI with the new metadata area. The translator thus falls back to #text-container .ytd-channel-name, which is WAY too generic and matches every channel name on the entire page (and apparently Google engineers don't know when to use ids and when to use classes, sigh...).

The current correct selector would be #below #text-container .ytd-channel-name or, a bit more targeted, .watch-active-metadata #text-container .ytd-channel-name, I think. But this is still pretty shaky and in danger of breaking. I believe Zotero translators no longer run inside the browser? If that is the case, it would make more sense to properly extract the application/ld+json data and use that, or another microformat's metadata.
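To make the selector suggestion concrete, here is a minimal sketch of trying the scoped selectors in order and never falling back to the generic #text-container .ytd-channel-name. The helper name pickChannelName and the ordering are my own assumptions, not what the translator currently does, and the selectors are only as reliable as YouTube's current markup.

```javascript
// Hypothetical helper, not part of YouTube.js.
// Selectors are the ones proposed in this thread, most specific first.
// The generic "#text-container .ytd-channel-name" is deliberately left out,
// because it matches every channel name on the page.
const CHANNEL_SELECTORS = [
  ".watch-active-metadata #text-container .ytd-channel-name",
  "#below #text-container .ytd-channel-name",
  "#meta-contents #text-container .ytd-channel-name",
];

// `doc` is anything with a DOM-style querySelector method.
function pickChannelName(doc) {
  for (const selector of CHANNEL_SELECTORS) {
    const el = doc.querySelector(selector);
    const name = el && el.textContent ? el.textContent.trim() : "";
    // Skip elements that exist but are still empty (e.g. the old metadata area).
    if (name) return name;
  }
  // Returning null lets the caller use another source instead of guessing.
  return null;
}
```

Returning null rather than a generic match means a wrong channel name is never saved silently; the caller can then try the embedded metadata instead.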

For application/ld+json, which the translator also parses, it should be noted that YT now has TWO such sections on the page, which might also cause some problems for the current translator.
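As a rough sketch of coping with more than one application/ld+json section, the function below takes the raw text of every such block and returns the author of the first one typed as a schema.org VideoObject. The function name authorFromJsonLd and the assumed shape (an author given either as a string or as an object with a name) are assumptions on my part; I have not verified what YouTube actually emits.

```javascript
// Hypothetical helper, not part of YouTube.js.
function asArray(value) {
  return Array.isArray(value) ? value : [value];
}

// `blocks` is an array of strings, one per <script type="application/ld+json">.
function authorFromJsonLd(blocks) {
  for (const text of blocks) {
    let data;
    try {
      data = JSON.parse(text);
    } catch (e) {
      continue; // ignore malformed blocks instead of failing the whole save
    }
    // A block may hold a single object or an array of objects.
    for (const node of asArray(data)) {
      if (!node || typeof node !== "object") continue;
      // Assumption: the video is described by a schema.org VideoObject.
      if (!asArray(node["@type"]).includes("VideoObject")) continue;
      const author = node.author;
      if (typeof author === "string") return author;
      if (author && typeof author.name === "string") return author.name;
    }
  }
  return null;
}
```

In a page context the input could be collected with something like Array.from(doc.querySelectorAll('script[type="application/ld+json"]'), s => s.textContent). The caveat from the translator note still applies: this data may be stale after clicking from video to video.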

> but only in certain situations that I have not been able to reliably reproduce.

This is likely a race condition in page loading. Almost the entire YouTube page is loaded via JavaScript these days, and because of the async execution not every .ytd-channel-name element is available at the same time.
