Throw error when redirecting to authwall #134

Open
acanimal opened this issue Sep 10, 2020 · 2 comments · May be fixed by #136

@acanimal

It is possible for your cookie credentials to become invalid, in which case LinkedIn redirects to the "authwall", where you need to log in again.

The current code simply returns an empty profile object, which leads to an error like Cannot read property 'name' of undefined at module.exports (xxx/node_modules/scrapedin/src/profile/cleanProfileData.js:5:23)

At least for me, in those cases it's necessary to know whether the profile failed due to an auth error, so I have slightly modified the profile.js file with the following lines:

module.exports = async (browser, cookies, url, waitTimeToScrapMs = 500, hasToGetContactInfo = false, puppeteerAuthenticate = undefined) => {
  ...
  const page = await openPage({ browser, cookies, url, puppeteerAuthenticate })

  let authwall = false;
  page.on('response', response => {
    const status = response.status()
    if ((status >= 300) && (status <= 399)) {
      // The location header may be missing, so fall back to an empty string
      const location = response.headers()['location'] || '';
      if (location.includes('authwall')) {
        authwall = true;
      }
    }
  })

  const profilePageIndicatorSelector = '.pv-profile-section'
  await page.waitFor(profilePageIndicatorSelector, { timeout: 5000 })
    .catch(() => {
      // Why not throw an error here instead of continuing to scrape?
      // Because it can be a false negative: LinkedIn may only have changed that selector while everything else is fine :)
      logger.warn('profile selector was not found')
    })

  // If a redirect to the authwall was detected, throw an error
  if (authwall) {
    const msg = 'Redirected to authwall :( You need new credentials';
    logger.warn(msg);
    throw new Error(msg);
  }

  ...
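
The redirect check above could also be pulled out into a small pure function so it can be tested without a browser. This is only a sketch: `isAuthwallRedirect` and `AuthwallError` are hypothetical names, not part of scrapedin, and the only assumption about LinkedIn is the one already made above, that the redirect target contains the string 'authwall'.

```javascript
// Hypothetical helper (not part of scrapedin): returns true when a response
// is a 3xx redirect whose location header points at the authwall.
function isAuthwallRedirect(status, headers) {
  const isRedirect = status >= 300 && status <= 399
  // Puppeteer lower-cases header names; the header may be missing entirely.
  const location = (headers && headers.location) || ''
  return isRedirect && location.includes('authwall')
}

// Hypothetical error type so callers can tell an auth failure apart
// from other scraping errors.
class AuthwallError extends Error {
  constructor(message = 'Redirected to authwall: new credentials are needed') {
    super(message)
    this.name = 'AuthwallError'
  }
}
```

Inside the `page.on('response', ...)` listener this would be called as `isAuthwallRedirect(response.status(), response.headers())`, and a caller could then check `err instanceof AuthwallError` to decide whether to refresh the cookie.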

I don't know if this is something you want to integrate in the project. If so, let me know and I will send a PR.

Thanks in advance.

@leonardiwagner
Member

Send a PR for sure. I'm very busy relocating right now, but there are other people who can review and approve it; once that's done I'll just publish the npm package.

Thank you.

@acanimal acanimal linked a pull request Sep 21, 2020 that will close this issue
@acanimal
Author

Done #136
