
Scrape not working? #163

Open
JarJarBeatyourattitude opened this issue Apr 21, 2023 · 16 comments

Comments

@JarJarBeatyourattitude

I wasn't getting any results from scrape, so I tried with headless=False and noticed that the search wasn't returning any results; I assume that's because you now need an account to search. I confirmed that the links work in my browser, where I'm signed in. Will the script be fixed, or am I missing something? Thanks.

@fjj-088

fjj-088 commented Apr 24, 2023

I also encountered the same problem.

@BradKML

BradKML commented Apr 27, 2023

Is the same thing happening to other scrapers? We might want to keep an eye on this.

@NicerWang

It's Twitter's new restriction: you now need to log in before searching.

  1. Call utils.init_driver to get a driver.
  2. Call utils.log_in to log in.
  3. Pass the driver to scrape().
     (You need to modify scrape() in scweet.py to use the passed driver instead of initializing a new one.)

@yisyed

yisyed commented Apr 29, 2023

It's Twitter's new restriction: you now need to log in before searching.

1. Call utils.init_driver to get a `driver`.

2. Call utils.log_in to log in.

3. Pass the `driver` to scrape().
   (**You need to modify [scrape() in scweet.py](https://github.com/Altimis/Scweet/blob/76e7086a725980dbd5cf8d46bfc27bd4c1d6816f/Scweet/scweet.py#L71)** to use the passed `driver` instead of initializing a new one.)

Can you explain a bit more about how and what we are supposed to change?

@NicerWang

In Your Code (Add Your Twitter Account to .env File In Advance)

from Scweet.scweet import scrape
from Scweet.utils import init_driver, log_in
driver = init_driver(headless=True, show_images=False, proxy="your_proxy_setting")
log_in(driver, env=".env")
data = scrape(..., driver=driver)

In scrape() of scweet.py

def scrape(..., driver=None):
    ......
    # Remove This Line (71)
    # driver = init_driver(headless, proxy, show_images)

@yisyed

yisyed commented Apr 30, 2023


It works! Thanks.

@MykhailoYampolskyi


Hi, I am new to this. Could you tell me where I should add the .env file? Thanks.

@yisyed

yisyed commented May 2, 2023


Hi, I am new to this. Could you tell me where I should add the .env file? Thanks.

It should be in your project's folder (note: the file name must be '.env').

Your '.env' should be in the format given below:

SCWEET_EMAIL = "example@email.com"
SCWEET_PASSWORD = "password"
SCWEET_USERNAME = "username"

Below are the steps and changes I have made:

  1. I have added 'env=".env"'
    data = scrape(..., env=".env")

  2. In scrape() of 'scweet.py':

def scrape(..., env=None):    # Add this 'env=None'
    ......
    # And add this line after line (71)
    log_in(driver, env)

NOTE: My method is not robust. If you can find a better way to scrape tweets, let us know.
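If you want a quick sanity check that the three SCWEET_* variables are readable before running, here is a minimal stdlib sketch of a '.env' reader (illustration only; Scweet loads the file itself, and real projects normally use python-dotenv):

```python
import os
import tempfile

def load_env(path=".env"):
    """Minimal reader for KEY = "value" lines (illustration only;
    Scweet itself loads the file via python-dotenv)."""
    env = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip().strip('"')
    return env

# Quick self-check with a throwaway file in the format shown above.
with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, ".env")
    with open(p, "w") as f:
        f.write('SCWEET_EMAIL = "example@email.com"\n'
                'SCWEET_PASSWORD = "password"\n'
                'SCWEET_USERNAME = "username"\n')
    print(load_env(p)["SCWEET_USERNAME"])  # username
```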

@yisyed

yisyed commented May 2, 2023


One more change in 'scweet.py': edit the import on line (9) and add 'log_in':
from .utils import ..., log_in

@Wish-s

Wish-s commented May 7, 2023


Hello, I am new to this too. Could you tell me where I can obtain "your_proxy_setting"? Thanks very much!

@yisyed

yisyed commented May 7, 2023


Hello, I am new to this too. Could you tell me where I can obtain "your_proxy_setting"? Thanks very much!

Try following the method I have given above; it works for me.
I kept everything the same in scrape() of scweet.py on line (71) (the proxy is None by default).
If it still doesn't work, let me know what the error is. Thanks.

Note: I have to restart VS Code every time I make a change in the Scweet library.

@NicerWang

@Wish-s
If you do not need a proxy (or VPN) to connect to twitter.com, just remove this parameter.

@Wish-s

Wish-s commented May 9, 2023

@Wish-s If you do not need a proxy (or VPN) to connect to twitter.com, just remove this parameter.

Thank you for your reply. I do need a proxy (or VPN) to connect to twitter.com, but I can't find where to obtain the parameter.

@NicerWang

NicerWang commented May 9, 2023

@Wish-s
It's determined by your proxy software and has the format "PROTOCOL://IP:PORT".
Clash, for example, uses "http://127.0.0.1:7890" by default.
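If you want to sanity-check that string before launching the browser, a small stdlib helper can verify the PROTOCOL://IP:PORT shape (the function name is mine, not part of Scweet):

```python
from urllib.parse import urlparse

def looks_like_proxy(proxy: str) -> bool:
    """Sanity-check a PROTOCOL://IP:PORT string before passing it to
    init_driver (hypothetical helper, not part of Scweet)."""
    try:
        p = urlparse(proxy)
        return (p.scheme in {"http", "https", "socks5"}
                and bool(p.hostname)
                and p.port is not None)
    except ValueError:  # e.g. a non-numeric port
        return False

print(looks_like_proxy("http://127.0.0.1:7890"))  # True  (Clash default)
print(looks_like_proxy("127.0.0.1:7890"))         # False (scheme missing)
```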

@ihabpalamino

Hello guys, this is my code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from Scweet.scweet import scrape

# Specify the parameters for scraping
username = "2MInteractive"
since_date = "2023-07-01"
until_date = "2023-07-11"
headless = True

# Set up the ChromeDriver service (replace with the actual path to chromedriver)
service = Service("C:/Users/HP Probook/Downloads/chromedriver.exe")

# Set up the ChromeOptions
options = webdriver.ChromeOptions()
options.headless = headless

# Create the WebDriver
driver = webdriver.Chrome(service=service, options=options)

# Scrape the tweets by username
data = scrape(from_account=username, since=since_date, until=until_date, headless=headless, driver=driver)

# Print the scraped data
print(data)

# Close the WebDriver
driver.quit()

and I am getting an empty data list:

looking for tweets between 2023-07-01 and 2023-07-06 ...
path : https://twitter.com/search?q=(from%3A2MInteractive)%20until%3A2023-07-06%20since%3A2023-07-01%20&src=typed_query
scroll 1
scroll 2
looking for tweets between 2023-07-06 and 2023-07-11 ...
path : https://twitter.com/search?q=(from%3A2MInteractive)%20until%3A2023-07-11%20since%3A2023-07-06%20&src=typed_query
scroll 1
scroll 2
Empty DataFrame
Columns: [UserScreenName, UserName, Timestamp, Text, Embedded_text, Emojis, Comments, Likes, Retweets, Image link, Tweet URL]
Index: []
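As an aside, the "looking for tweets between ... and ..." lines show scrape() walking the date range in slices; a rough sketch of that chunking follows (the 5-day width is inferred from the log above, not taken from Scweet's source):

```python
from datetime import date, timedelta

def date_chunks(since: str, until: str, days: int = 5):
    """Split [since, until) into consecutive sub-ranges, mirroring the
    'looking for tweets between ... and ...' lines in Scweet's log output.
    (Illustration only; the chunk width here is an assumption.)"""
    start, end = date.fromisoformat(since), date.fromisoformat(until)
    out = []
    while start < end:
        nxt = min(start + timedelta(days=days), end)
        out.append((start.isoformat(), nxt.isoformat()))
        start = nxt
    return out

print(date_chunks("2023-07-01", "2023-07-11"))
# [('2023-07-01', '2023-07-06'), ('2023-07-06', '2023-07-11')]
```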

@baqachadil

baqachadil commented Jul 18, 2023

Check this solution; it might work if none of the others did: #169 (comment)
