Skip to content

goutomroy/facebook_scrape_live_gaming_page

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Task

Scrap facebook live gaming page and extracts live users and push them to faktory worker to parse their detail(name, uid, username, number of follower, number of likes, contact details - email, social links) and their posts(post-id, text, datetime, hashtags, links, images) and finally save them in MongoDB.

Setup

  • Install faktory server.
  • Install MongoDB and create database aggero_fb and two collection user_details and posts.
  • pip install -r requirements.txt
  • Change URL_FACTORY password in utils.py file.

Run

  • Navigate to main package.

  • First run python3 consumer.py

  • Then python3 producer.py -nup 10 -nps 3. You must need to provide -nup and -nps

    • -nup - Number of live users to parse
      • Choices : int value greater than 0 or str value all
    • -nps - Number of scroll while parsing user posts. Every scroll gives about 18 posts.
      • Choices : int value greater than 0 or str value all
    • -nup 10 -nps 3 works best for testing, all may be used in production.
  • Scraping should be started now.

Todo

  • Configure Tor for proxy.
  • Build a error database and send daily error report email to admin.
  • By default I am running one worker process but you can modify it in consumer.py file as number of cores in your pc.
  • Create index for MongoDB.
  • Write tests.