Long Response Time - Get Profile #63

Open

0xAskar opened this issue Sep 27, 2023 · 4 comments

Comments
0xAskar commented Sep 27, 2023

As explained by the title, over the past 24 hours, getProfile has started taking a long time to respond. It doesn't error out, but it responds after 5-10 minutes, if not longer. This was working fine 48 hours ago, and there have been no changes to my authentication.
I will also add my scraper code below; any help would be greatly appreciated because my system depends heavily on the scraper working well. This is happening both on my local network and on my Heroku servers, so it's not a network issue. I've also swapped the local cookies for fresh ones from the browser, but to no avail.

I also checked the params being set, and they include the usernames. Sometimes it takes 24 seconds; other times it never responds, even after 10 minutes. The URL is still consistent with the one that shows in the browser's network tab:
https://twitter.com/i/api/graphql/G3KGOASz96M-Qu0nwmGXNg/UserByScreenName?variables

As you can see with the code below, it logs this response: finishing getting scraper info and it took 24.2456 seconds, and that's the quick case, haha.

                console.log("getting scraper info")
                let startTime = new Date().getTime()
                twitterData = await scraper.getProfile(user.twitterUsername)
                let endTime = new Date().getTime()
                console.log(`finishing getting scraper info and it took ${(endTime - startTime) / 1000} seconds`)
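
One way to keep a call like this from blocking forever (a sketch, not something the library provides; the 30-second cutoff is an arbitrary choice) is to race it against a timeout so it fails fast instead of hanging:

// Fail fast instead of hanging when getProfile never resolves.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage:
// twitterData = await withTimeout(scraper.getProfile(user.twitterUsername), 30000);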

The scraper code, with cookie values omitted:

import dotenv from 'dotenv';

dotenv.config({ path: '../.env'});
import { HttpsProxyAgent } from 'https-proxy-agent';
import { Scraper } from '@the-convocation/twitter-scraper'

export default async function getScraper(options = { authMethod: 'cookies' }) {
    // const username = process.env['TWITTER_USERNAME'];
    const username = 'omitted';
    // const password = process.env['TWITTER_PASSWORD'];
    const password = 'omitted';
    // const email = process.env['TWITTER_EMAIL'];
    const email = 'omitted';
    let cookies = [
      {"name": "lang", "value": "en"},
      {"name": "guest_id", "value": "omitted"},
      {"name": "_twitter_sess", "value": "omitted"},
      {"name": "auth_token", "value": "omitted"},
      {"name": "ct0", "value": "omitted"},
      {"name": "guest_id_ads", "value": "omitted"},
      {"name": "guest_id_marketing", "value": "omitted"},
      {"name": "twid", "value": "omitted"},
      {"name": "personalization_id", "value": "omitted"}
    ]
    const proxyUrl = null;
    let agent;
  
    if (options.authMethod === 'cookies' && !cookies) {
      console.warn(
        'TWITTER_COOKIES variable is not defined, reverting to password auth (not recommended)',
      );
      options.authMethod = 'password';
    }
  
    if (options.authMethod === 'password' && !(username && password)) {
      throw new Error(
        'TWITTER_USERNAME and TWITTER_PASSWORD variables must be defined.',
      );
    }
  
    if (proxyUrl) {
      agent = new HttpsProxyAgent(proxyUrl, {
        rejectUnauthorized: false,
      });
    }
  
    const scraper = new Scraper({
      transform: {
        request: (input, init) => {
          if (agent) {
            return [input, { ...init, agent }];
          }
          return [input, init];
        },
      },
    });
  
    if (options.authMethod === 'password') {
      await scraper.login(username, password, email);
    } else if (options.authMethod === 'cookies') {
        const cookieStrings = cookies.map(cookie => `${cookie.name}=${cookie.value}`);
        await scraper.setCookies(cookieStrings);
    }
  
    return scraper;
}
0xAskar (Author) commented Sep 27, 2023

I created a new account and got new credentials. The problem is that it's obviously unsustainable to do that manually each time. Does anyone have any solutions?

karashiiro (Collaborator) commented

The tests still complete in the same time as a couple of weeks ago, and given that getting new credentials at least temporarily fixed the problem, I think you're just getting rate-limited. There's nothing this library can do about that, to the best of my knowledge, but if you find something it can do here, I'd be happy to add it. Twitter's servers are effectively a black box to us, so while they may have changed something recently, it's just as likely that the account you were using was flagged and now has a stricter rate limit (maybe? I don't know if that's actually a thing).

Short of that, I'd suggest implementing a secondary throttler on your end. If your application is making rapid-fire requests until it gets rate-limited, spacing requests out from each other might help. I don't know what the safest or most efficient way of doing that is; it'll probably be specific to your application, but something along the lines of the sketch below.
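
A minimal sketch of such a throttler (not part of this library; the 2-second gap is an arbitrary guess you'd have to tune):

// Serialize calls and enforce a minimum gap between requests.
class Throttler {
  constructor(minIntervalMs) {
    this.minIntervalMs = minIntervalMs;
    this.queue = Promise.resolve();
  }

  schedule(task) {
    const result = this.queue.then(task);
    // Chain a delay after each task so the next one waits out the interval.
    this.queue = result
      .catch(() => {}) // keep the chain alive if a task rejects
      .then(() => new Promise(resolve => setTimeout(resolve, this.minIntervalMs)));
    return result;
  }
}

// Usage:
// const throttler = new Throttler(2000);
// const profile = await throttler.schedule(() => scraper.getProfile('someUser'));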

0xAskar (Author) commented Sep 28, 2023

@karashiiro Hmm, yeah, that makes sense. I figured rate-limiting was the reason too, and I couldn't think of a good way to work around it. For my specific use case, time sensitivity is important. I wonder if there's a way to create Twitter accounts and retrieve their cookies automatically, making a new account every time we hit that threshold (I've been querying a lot, to be fair). I'll keep this open for a little longer and then close it, whether or not I think of a better approach.

karashiiro (Collaborator) commented Sep 28, 2023

Automated account creation is a challenge because of the email (and often phone) verification requirements, but it might be possible with a sophisticated enough system. With the current requirements, even creating an account manually is a chore, though.

In a different vein, you might be able to load-balance across multiple scrapers logged into different accounts, but that might also get flagged more easily unless you proxy each one through a different server so they don't all share the exact same IP (of course, then you might run into the login location verification check). A rough sketch of that idea is below.
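
For illustration only (a minimal sketch assuming scrapers built via something like the getScraper helper above, each with its own account and proxy; none of this is provided by the library):

// Round-robin across several logged-in scrapers so no single
// account takes every request.
class ScraperPool {
  constructor(scrapers) {
    this.scrapers = scrapers;
    this.next = 0;
  }

  // Hand out scrapers in rotation.
  acquire() {
    const scraper = this.scrapers[this.next];
    this.next = (this.next + 1) % this.scrapers.length;
    return scraper;
  }
}

// Usage:
// const pool = new ScraperPool([await getScraper(), await getScraper()]);
// const profile = await pool.acquire().getProfile('someUser');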
