ReagentX/gtrends

gtrends: A Google Trends Analytics Package

Inspired by this reddit post, I wanted to build a simple platform to pull data from Google Trends. The author made the interesting choice not to normalize the data; that is, a given popularity value does not represent the same amount of search volume for each keyword.

While this means the chart does not compare the actual popularity of the social networks, it does show when each network's popularity spikes relative to the others. I built in an option that lets you specify whether to normalize.
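To make the difference concrete, here is a minimal sketch of the two scalings. It assumes Google Trends' usual behaviour of scaling each chart so that its highest point is 100. The raw volumes are hypothetical numbers chosen for illustration (Google does not expose raw search counts), and `scale_to_100` is an illustrative helper, not part of this package.

```python
def scale_to_100(series, peak):
    """Scale a series so that `peak` maps to 100."""
    return [round(100 * value / peak) for value in series]


# Hypothetical raw search volumes for two keywords (illustration only).
raw = {
    "A": [10, 20, 40],
    "B": [100, 200, 150],
}

# Not normalized: each keyword is scaled against its own peak,
# so every series reaches 100 somewhere.
not_normalized = {k: scale_to_100(v, max(v)) for k, v in raw.items()}

# Normalized: all keywords share one scale, set by the overall peak.
overall_peak = max(max(v) for v in raw.values())
normalized = {k: scale_to_100(v, overall_peak) for k, v in raw.items()}

print(not_normalized)  # {'A': [25, 50, 100], 'B': [50, 100, 75]}
print(normalized)      # {'A': [5, 10, 20], 'B': [50, 100, 75]}
```

In the non-normalized output both keywords appear equally "popular" at their peaks, which is why that view is only useful for seeing when spikes happen, not how large they are.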

The get() function is multiprocessed, so multiple keywords are handled concurrently to speed up data collection. The number of processes depends on the number of keywords passed; however, Google may rate limit overly ambitious requests. The script checks whether the requested data already exists in the output folder before making new queries.
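The idea of one worker per keyword can be sketched as follows. This is not the package's implementation: for portability the sketch substitutes a thread pool for the process pool described above, and `fetch_keyword` is a hypothetical stand-in for a real Google Trends request.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_keyword(keyword):
    """Hypothetical stand-in for a single Google Trends request."""
    return keyword, f"data for {keyword}"


def fetch_all(keywords, fetch=fetch_keyword, max_workers=None):
    """Fetch every keyword concurrently, one worker per keyword by default."""
    workers = max_workers or max(1, len(keywords))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of `keywords`.
        return dict(pool.map(fetch, keywords))


results = fetch_all(["Facebook", "Reddit"])
print(results)
```

Passing a smaller `max_workers` is one way to stay below the rate limit when the keyword list is long.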

Normalized

[Figure: Normalized]

Not Normalized

[Figure: Not Normalized]

Usage

For an example of how to use this script, see run.py in the scripts folder. In short, define a list of keywords:

keywords = ['Facebook', 'Instagram', 'Twitter', 'Google Plus', 'Reddit']

Determine if you want to normalize:

normalize = False

Generate a GoogleTrendsData object:

gt = generate.GoogleTrendsData(keywords, normalize)

Run gt.get():

data = gt.get()

This will return a Pandas DataFrame. Happy analyzing!
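As a starting point for analysis, the sketch below finds the date on which each keyword peaks. It assumes the returned DataFrame has a date index and one column per keyword; that layout is an assumption, and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by gt.get().
data = pd.DataFrame(
    {"Facebook": [80, 100, 90], "Reddit": [10, 15, 30]},
    index=pd.to_datetime(["2017-01-01", "2017-02-01", "2017-03-01"]),
)

# Date of the highest value in each column.
peak_dates = data.idxmax()

# Change from one period to the next, per keyword.
period_change = data.diff()

print(peak_dates)
```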

Built-in Functions

gt.save(data)

Exports the data to a CSV file. This is useful because the script checks the output folder before making new queries, which reduces the likelihood of hitting the rate limit.
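A check-the-folder-first cache of this kind might look like the sketch below. The file-naming scheme and the `cache_path` and `load_or_fetch` helpers are hypothetical, and the sketch uses the standard csv module rather than pandas to stay self-contained.

```python
import csv
from pathlib import Path


def cache_path(keywords, normalize, folder="output"):
    """Build a file name from the request (hypothetical naming scheme)."""
    name = "_".join(k.replace(" ", "-") for k in sorted(keywords))
    suffix = "normalized" if normalize else "raw"
    return Path(folder) / f"{name}_{suffix}.csv"


def load_or_fetch(keywords, normalize, fetch, folder="output"):
    """Return cached rows if the CSV exists; otherwise fetch and save them."""
    path = cache_path(keywords, normalize, folder)
    if path.exists():
        with path.open(newline="") as handle:
            return list(csv.reader(handle))
    rows = fetch(keywords)  # only reached when nothing is cached
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return rows
```

Calling `load_or_fetch` twice with the same keywords runs `fetch` only once; the second call reads the CSV instead of issuing a new query.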

gt.graph(data, file_name)

Saves an image of the line graph, drawn with pandas' DataFrame.plot().


Notes

  • Google only allows you to normalize up to five keywords at once. Trying to normalize more than five keywords will raise a ValueError.
  • Google may rate limit you if you make too many requests, which can lead to TypeErrors or ResponseErrors.
  • This package depends on pandas and pytrends, both of which can be installed via pip.
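
The first two notes can be sketched in code. The helpers below are hypothetical and not part of this package: `check_keywords` mirrors the five-keyword limit, and `with_retries` retries a failing request with exponential backoff. The sketch catches every exception for brevity; in practice you would narrow this to the specific errors you see when rate limited.

```python
import time

MAX_NORMALIZED_KEYWORDS = 5


def check_keywords(keywords, normalize):
    """Raise ValueError if more than five keywords are to be normalized."""
    if normalize and len(keywords) > MAX_NORMALIZED_KEYWORDS:
        raise ValueError(
            f"Cannot normalize {len(keywords)} keywords; "
            f"the limit is {MAX_NORMALIZED_KEYWORDS}."
        )
    return list(keywords)


def with_retries(request, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `request`, waiting 1x, 2x, 4x... base_delay between failures."""
    for attempt in range(attempts):
        try:
            return request()
        except Exception:  # assumption: narrow this to rate-limit errors
            if attempt == attempts - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the waiting behaviour can be replaced in tests.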