Not scraping every tweet from a user #52

wjd157 · 2022-11-28T15:57:18Z

Hello, I am trying to scrape every tweet from a user. From their Twitter page, I can see that they have tweeted more than 5000 times. However, even when I set my tweets_count to 5000, I get fewer than 1000 tweets from that user.

My code is below:

from twitter_scraper_selenium import scrape_profile

scrape_profile(twitter_username="elonmusk", output_format="csv", tweets_count=6000, browser="chrome", filename="elonmusk")

(Note that @ElonMusk is just a stand-in example)

shaikhsajid1111 · 2022-12-03T04:17:14Z

Hey @wjd157, that method uses browser automation for scraping, and because your tweet count is large, it is probably getting blocked partway through. I suggest using the scrape_keyword_with_api() method instead. Try the code below, then check elon.json after scraping; it should contain the data you want.

from twitter_scraper_selenium import scrape_keyword_with_api

scrape_keyword_with_api('from:elonmusk', output_filename='elon')

wjd157 · 2022-12-13T19:03:30Z

This appears to generate a JSON file with no data in it. Further, the console tells me I have only scraped 24 tweets, even though the account I am now trying has more than 200 tweets.
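One way to confirm what actually landed in the output file is to load it and count its entries. The following is a minimal sketch, not part of the library: the helper name count_scraped_tweets is hypothetical, and it assumes the output is a JSON object keyed by tweet ID (or a list of tweets), which this thread does not confirm for every scraping method.

```python
import json
import os


def count_scraped_tweets(path):
    """Return the number of top-level entries in a scraper output file.

    Returns 0 when the file is missing, empty, or not valid JSON.
    """
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return 0
    with open(path, encoding="utf-8") as f:
        try:
            data = json.load(f)
        except json.JSONDecodeError:
            return 0
    # Assumption: the file holds a dict keyed by tweet ID, or a list of tweets.
    return len(data) if isinstance(data, (dict, list)) else 0
```

Calling count_scraped_tweets("elon.json") after a run gives a number that can be compared with the count the console reports.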

shaikhsajid1111 · 2022-12-14T02:40:34Z

Okay, I think this Twitter feature only returns a few tweets. I have not yet added a feature to scrape a Twitter account through Twitter's API, and the browser-automation method gets blocked. I will add a new feature to scrape a Twitter profile from the API in a couple of weeks.

christianmettri · 2022-12-24T19:18:52Z

I am also very much looking forward to this feature. Please let us know once you have had time to implement it. Thanks a lot.

shaikhsajid1111 · 2022-12-31T06:05:20Z

Hi @christianmettri @wjd157, just an update on this, in case you are still looking for a solution. You can now try:

from twitter_scraper_selenium import scrape_profile_with_api

scrape_profile_with_api('elonmusk', output_filename='musk', tweets_count=100)

Then check the musk.json file, where the output will be saved.

SenninOne · 2023-02-28T08:39:26Z

Hello @shaikhsajid1111, I tried this code and it gives me this error:

2023-02-28 02:33:09,836 - WARNING - Failed to make request!

The code:

from twitter_scraper_selenium import scrape_profile_with_api
import json

scrape_profile_with_api(username="NASA", output_filename="NASA", browser="firefox", tweets_count=50, output_dir="C:/Users/Braulio/Desktop/web scraping python")


with open('NASA.json') as f:
    NASA = json.load(f)


with open('NASAimages.html', 'w') as f:
    f.write('<html>\n')
    f.write('<head>\n')
    f.write('<title>Imágenes</title>\n')
    f.write('</head>\n')
    f.write('<body>\n')
    for tweet_id, tweet_data in NASA.items():
        if tweet_data['username'] == 'NASA':
            for imagen in tweet_data['images']:
                # Assumes the image URL has no query string yet
                f.write('<img src="{}?format=jpg&name=medium" alt="">\n'.format(imagen))
    f.write('</body>\n')
    f.write('</html>\n')

print("HTML READY")

I also tried the scrape_keyword_with_api function. Here is the code:


from twitter_scraper_selenium import scrape_keyword_with_api
import json

scrape_keyword_with_api(query="from:NASA", output_filename="NASA", tweets_count=50, output_dir="C:/Users/Braulio/Desktop/web scraping python")


with open('NASA.json') as f:
    NASA = json.load(f)


with open('imagenes.html', 'w') as f:
    f.write('<html>\n')
    f.write('<head>\n')
    f.write('<title>Imágenes</title>\n')
    f.write('</head>\n')
    f.write('<body>\n')
    for tweet_id, tweet_data in NASA.items():
        if tweet_data['username'] == 'NASA':
            for imagen in tweet_data['images']:
                # Assumes the image URL has no query string yet
                f.write('<img src="{}?format=jpg&name=medium" alt="">\n'.format(imagen))
    f.write('</body>\n')
    f.write('</html>\n')

print("HTML READY")

It shows this error:

2023-02-28 02:37:18,021 - twitter_scraper_selenium.keyword_api - WARNING - Failed to make request!
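Both scripts above pass output_dir to the scraper but then open NASA.json relative to the current working directory, and they call json.load even when the request failed. The sketch below shows one way to guard the post-processing step. It assumes the library writes <output_filename>.json inside output_dir, which the thread suggests but does not confirm; load_tweets and build_image_page are hypothetical helper names, and the "username" and "images" keys are taken from the scripts above.

```python
import html
import json
import os


def load_tweets(output_dir, output_filename):
    """Load <output_dir>/<output_filename>.json, or return {} if unusable."""
    path = os.path.join(output_dir, output_filename + ".json")
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return {}
    with open(path, encoding="utf-8") as f:
        try:
            data = json.load(f)
        except json.JSONDecodeError:
            return {}
    # Assumption: tweets are stored as a dict keyed by tweet ID.
    return data if isinstance(data, dict) else {}


def build_image_page(tweets, username, title="Images"):
    """Return an HTML page with one <img> per image posted by `username`."""
    lines = [
        "<html>",
        "<head>",
        "<title>{}</title>".format(html.escape(title)),
        "</head>",
        "<body>",
    ]
    for tweet in tweets.values():
        if tweet.get("username") != username:
            continue
        # Treat a missing or null "images" field as an empty list.
        for url in tweet.get("images") or []:
            lines.append('<img src="{}" alt="">'.format(html.escape(url)))
    lines += ["</body>", "</html>"]
    return "\n".join(lines) + "\n"
```

With these helpers, a script can call load_tweets with the same output_dir it gave the scraper and skip writing the HTML file when the result is empty, which separates a failed request from a bug in the HTML step.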

shaikhsajid1111 mentioned this issue Dec 31, 2022

V4.1.2 Updates #54

Merged
