FYI this isn't working very well with youtube at the moment #355

Open
fusir opened this issue Mar 20, 2024 · 6 comments

fusir commented Mar 20, 2024

In particular, it does a poor job of finding a reasonable title, and the results are inconsistent.

I have tried it both in NodeJS and on the demo page of the website and got the same kinds of bad titles back.

There are two main kinds of errant titles. One just returns "YouTube" when a specific video is requested. The other just returns the pathname of the URL.

Example URL: https://www.youtube.com/watch?v=p24KbTBR3QE

Should result in: "POV: you return to office [outtakes]"

You can reproduce the bug by running this code:

const microlink = require('@microlink/mql');
const url = "https://www.youtube.com/watch?v=p24KbTBR3QE";

// Top-level await is not available in a CommonJS script, so wrap the call.
(async () => {
  const {data} = await microlink(url);
  console.log({title: data.title});
})();

You can also see the same bug on the website here: https://microlink.io/meta

The bug is not consistent.

One possible solution would deviate from the headless-browser-as-a-service concept: in the case of YouTube, you could just use their API. That's what I'll be doing in the meantime.
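For anyone who wants a similar stopgap, here is a minimal sketch of one way to get a title straight from YouTube. It assumes YouTube's public oEmbed endpoint rather than the YouTube Data API (the post above doesn't say which API is meant), and it assumes Node 18+ for the global `fetch`. The helper names `youtubeOEmbedUrl` and `youtubeTitle` are made up for illustration.

```javascript
// Build the oEmbed lookup URL for a YouTube video page.
// Assumption: the public endpoint https://www.youtube.com/oembed is used,
// not the key-based YouTube Data API.
function youtubeOEmbedUrl (videoUrl) {
  const endpoint = new URL('https://www.youtube.com/oembed');
  endpoint.searchParams.set('url', videoUrl);
  endpoint.searchParams.set('format', 'json');
  return endpoint.toString();
}

// Fetch the video title. fetchImpl is injectable so the function can be
// exercised without network access; it defaults to the global fetch.
async function youtubeTitle (videoUrl, fetchImpl = fetch) {
  const res = await fetchImpl(youtubeOEmbedUrl(videoUrl));
  if (!res.ok) {
    throw new Error(`oEmbed request failed with status ${res.status}`);
  }
  const body = await res.json();
  return body.title;
}

// Usage (performs a real network request):
// youtubeTitle('https://www.youtube.com/watch?v=p24KbTBR3QE').then(console.log);
```

This only covers the title; other fields that Microlink returns (description, date, and so on) would still need another source.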

@Kikobeats
Member

Hello, thanks for reaching out. Microlink uses a headless browser under the hood. I think this is happening because we are sending too much traffic to YouTube and they shadow-ban us for a while.

It needs to be investigated. Thanks for reporting; I will report back once I have a better understanding of what's happening.

@Kikobeats
Member

@fusir is it working better these days? We implemented changes 🙂

@coffeewithdonut

@Kikobeats I'm still seeing issues with this; it seems to pull video titles/descriptions only sporadically. Otherwise, it falls back to "– YouTube" / "Share your videos with friends, family, and the world".

@Kikobeats
Member

@coffeewithdonut did you check whether passing the prerender parameter works better?

That way, it is guaranteed to use a headless browser:
https://microlink.io/docs/api/parameters/prerender
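A rough sketch of how the parameter could be passed from Node, assuming `@microlink/mql` forwards its second argument to the API as query parameters. `fetchTitle` is a hypothetical helper, and its `client` argument exists only so the function can be tried with a stub instead of the real package.

```javascript
// Fetch a page title with an explicit prerender setting.
// prerender: true forces the headless browser, false skips it.
// client defaults to the real package; it is only loaded when no stub is passed.
async function fetchTitle (url, prerender, client = require('@microlink/mql')) {
  const { data } = await client(url, { prerender });
  return data.title;
}

// Usage (performs real API requests), comparing both modes for one video:
// const url = 'https://www.youtube.com/watch?v=p24KbTBR3QE';
// fetchTitle(url, true).then((title) => console.log({ prerender: true, title }));
// fetchTitle(url, false).then((title) => console.log({ prerender: false, title }));
```

Running both modes against the same URL makes it easier to report which one returns the wrong title.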

coffeewithdonut commented Apr 22, 2024

@Kikobeats Yes, I've had more success with prerender set to false for YouTube links, because Google seems to intercept the headless browser requests. It also seems like there is a cache layer that prerender false fetches from: popular videos tend to work, as do videos I've fetched previously, but new videos initially return responses like the ones below.

Here are some example responses:

Prerender false:

{
  lang: null,
  author: null,
  title: 'watch?v=Y25LDO6OLzQ',
  publisher: 'YouTube',
  image: {
    url: 'https://img.youtube.com/vi/Y25LDO6OLzQ/maxresdefault.jpg',
    type: 'jpg',
    size: 44575,
    height: 720,
    width: 1280,
    size_pretty: '44.6 kB'
  },
  date: '2024-04-22T01:05:06.000Z',
  description: null,
  url: 'https://www.youtube.com/watch?v=Y25LDO6OLzQ',
  logo: {
    url: 'https://www.youtube.com/favicon.ico',
    type: 'ico',
    size: 1150,
    height: 16,
    width: 16,
    size_pretty: '1.15 kB'
  }
}

Prerender true:

{
  lang: null,
  author: null,
  title: 'index?continue=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DY25LDO6OLzQ&q=EgSG0aUUGIvulrEGIjAJjJxx2b-Y1a2_sb7ANF15KR3j8TezYD32d6H29gmpBoNh0pu6xWEuejAFvHcRpfsyAXJaAUM',
  publisher: 'YouTube',
  image: null,
  date: '2024-04-22T01:04:33.000Z',
  video: null,
  description: null,
  url: 'https://www.google.com/sorry/index?continue=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DY25LDO6OLzQ&q=EgSG0aUUGIvulrEGIjAJjJxx2b-Y1a2_sb7ANF15KR3j8TezYD32d6H29gmpBoNh0pu6xWEuejAFvHcRpfsyAXJaAUM',
  logo: {
    url: 'https://www.google.com/favicon.ico',
    type: 'ico',
    size: 5430,
    height: 16,
    width: 16,
    size_pretty: '5.43 kB'
  }
}
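Until this is fixed upstream, a caller can at least detect these bad responses before using them. The sketch below is a heuristic based only on the two responses above and the generic "YouTube" title reported earlier in the thread; `isSuspectMetadata` is a hypothetical helper, not part of Microlink.

```javascript
// Return true when a metadata object looks like one of the failure modes
// reported in this thread, so the caller can retry or fall back.
function isSuspectMetadata (data) {
  const title = (data.title || '').trim();
  if (title === '') return true;

  let finalUrl;
  try {
    finalUrl = new URL(data.url);
  } catch (err) {
    return true; // a missing or invalid URL is treated as suspect
  }

  // Failure mode 1: the request was redirected to Google's "sorry" page.
  const redirectedToSorry =
    finalUrl.hostname === 'www.google.com' &&
    finalUrl.pathname.startsWith('/sorry');

  // Failure mode 2: only the generic site title came back ("YouTube", "– YouTube").
  const genericTitle = /^[\s\-–]*youtube$/i.test(title);

  // Failure mode 3: the title is just the last path segment plus the query string.
  const lastSegment = finalUrl.pathname.split('/').pop() + finalUrl.search;
  const titleIsPath = title === lastSegment;

  return redirectedToSorry || genericTitle || titleIsPath;
}
```

A response flagged this way could be retried later or replaced with data from another source.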

@Kikobeats
Member

@fusir @coffeewithdonut We're continuing our efforts to fix this. Can you check now and let us know if it's working better?


3 participants