Get People Also Ask (PAA) questions from Google SERPs with Puppeteer (2019)

jpigla/PAAs-from-SERPs

Get PAAs from Google SERPs

⚠ Disclaimer

This software is not authorized by Google and doesn't follow Google's robots.txt. Scraping without Google's explicit written permission violates its terms and conditions on scraping and could lead to a lawsuit.

Requirements

Local Environment

NPM-Packages

Installation

  1. Download the latest project release, extract it, and (if desired) move the folder to your home directory
  2. Check whether Node and NPM are already installed. Open Terminal and ...
  • type node -v in Terminal to check the NodeJS version number (and whether it is installed)
  • type npm -v in Terminal to check the NPM version number (and whether it is installed)
  • if not, install Homebrew (from https://brew.sh/index_de; Mac) and then NodeJS with brew update && brew install node
  3. In Terminal, move to the project folder (type cd folder/ if you named the project folder "folder")
  4. Install the required NPM packages: type npm install in Terminal

Usage

Type npm run scraper -- --help for help (or read on).

Run the script with arguments using one of the following commands:

  • npm run scraper -- --clicks=[0-2/max] --kw=[...] --lang=[de/en] (--output=csv)
  • node get_paas.js --clicks=[0-2/max] --kw=[...] --lang=[de/en] (--output=csv)

Arguments

  • --clicks=[0-2/max] : how many times to click on newly appearing questions [0-2/max] (be patient when using max, it takes ~3min)
  • --kw=[...] : the keyword (search term), or "keywords" for batch mode (reads keywords line by line from keywords.txt)
  • --lang=[de/en] : language of the Google search [de/en]
  • --output=csv : (optional) export the list of questions to a CSV file
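
The argument rules above can be illustrated with a small parser. This is only a sketch, not the code in get_paas.js: the function name parseArgs, the default of 0 clicks, and the error messages are assumptions made for illustration (the "de" default follows the default URL mentioned in this README).

```javascript
// Hypothetical parser for the documented flags; NOT the actual get_paas.js code.
function parseArgs(argv) {
  // Assumed defaults: 0 clicks, German search, no CSV export.
  const options = { clicks: 0, kw: null, lang: "de", output: null };

  for (const arg of argv) {
    const match = /^--([^=]+)=(.*)$/.exec(arg);
    if (!match) continue; // skip anything that is not --name=value
    const [, name, value] = match;
    if (Object.prototype.hasOwnProperty.call(options, name)) {
      options[name] = value;
    }
  }

  // --clicks accepts 0, 1, 2 or the literal "max".
  if (options.clicks !== "max") {
    const n = Number(options.clicks);
    if (!Number.isInteger(n) || n < 0 || n > 2) {
      throw new Error(`--clicks must be 0-2 or max, got "${options.clicks}"`);
    }
    options.clicks = n;
  }
  if (!["de", "en"].includes(options.lang)) {
    throw new Error(`--lang must be de or en, got "${options.lang}"`);
  }
  if (options.output !== null && options.output !== "csv") {
    throw new Error(`--output only supports csv, got "${options.output}"`);
  }
  if (!options.kw) {
    throw new Error("--kw is required");
  }
  return options;
}
```

In a script it could be called as parseArgs(process.argv.slice(2)).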

Examples

  • npm run scraper
    • -- --clicks=max --kw=firefox --output=csv --lang=en
    • -- --clicks=0 --kw=angela+merkel --lang=de
    • -- --clicks=0 --output=csv --kw=keywords --lang=en (batch mode)
  • node get_paas.js
    • --clicks=max --kw=firefox --output=csv --lang=en
    • --clicks=0 --kw=angela+merkel --lang=de
    • --clicks=0 --output=csv --kw=keywords --lang=en (batch mode)
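
Batch mode (--kw=keywords) reads the search terms line by line from keywords.txt. A minimal sketch of that lookup might look like the following; the helper name resolveKeywords and the skipping of blank lines are assumptions, not confirmed behavior of the script.

```javascript
const fs = require("fs");

// Hypothetical helper: turn the --kw value into a list of search terms.
// "keywords" switches to batch mode and reads one term per line from a file.
function resolveKeywords(kw, file = "keywords.txt") {
  if (kw !== "keywords") return [kw];
  return fs
    .readFileSync(file, "utf8")
    .split(/\r?\n/) // handle both Unix and Windows line endings
    .map((line) => line.trim())
    .filter((line) => line.length > 0); // assumption: blank lines are ignored
}
```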

What happens here

  1. The browser opens https://www.google.com/search?hl=de&gl=DE&ie=utf-8&oe=utf-8&no_sw_cr=1&pws=0&q=[KEYWORD] (default/de)
  2. If clicks is set to 0, the initially found questions are returned
  3. If clicks is set > 0, each newly appearing set of questions is clicked, N times in total (first set = the 4 initial questions)
  4. Once clicking is done, all questions are extracted from the SERP
  5. The questions are printed to the CLI and written to a CSV file (if the csv argument is given)
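
Steps 1 and 5 can be sketched without a browser. The helpers below are hypothetical (buildSearchUrl and toCsv are not names from the project): the hl=en/gl=US values for English are an assumption, as is the single-column CSV layout without a header row. Keywords are treated as "+"-separated words, as in the examples above.

```javascript
// Hypothetical helper mirroring the URL from step 1.
// Assumption: "en" maps to hl=en&gl=US; only the "de" URL is given in this README.
function buildSearchUrl(keyword, lang = "de") {
  const locale = lang === "en" ? { hl: "en", gl: "US" } : { hl: "de", gl: "DE" };
  const params = new URLSearchParams({
    hl: locale.hl,
    gl: locale.gl,
    ie: "utf-8",
    oe: "utf-8",
    no_sw_cr: "1",
    pws: "0",
  });
  // Keep "+" as the word separator and percent-encode each word on its own.
  const query = keyword.split("+").map(encodeURIComponent).join("+");
  return `https://www.google.com/search?${params.toString()}&q=${query}`;
}

// Hypothetical helper for step 5: one quoted question per line.
// Double quotes inside a question are escaped by doubling them.
function toCsv(questions) {
  const quote = (field) => `"${String(field).replace(/"/g, '""')}"`;
  return questions.map(quote).join("\n") + "\n";
}
```

Quoting every field keeps questions that contain commas or quotes on a single CSV line.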

Help & Information

  • If something breaks or errors occur during runtime, please ask Philipp at hello@jpigla.de.

Changelog

Version 1.1 (15.10.2019)

  • Add npm script
  • Optimize performance
  • Add --help argument
  • Add --lang (language) argument [de/en]
  • Edit readme

Version 1.0 (07.10.2019)

  • Initial upload
  • Working version

License

All assets and code are under the GPL v3 License unless specified otherwise.