Tweet Analysis 2023

If there is any last-minute Twitter data collection we can do before API access shuts off, we will do it.

This repo expands upon the previous approach, with improvements such as storing each tweet's language and its matching rule identifiers.

Installation

Make a copy of this repo. Clone or download your copy onto your local computer (e.g. the Desktop), then navigate there from the command line:

cd ~/Desktop/tweet-analysis-2023
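The clone step, for example, might look like this (the "YOUR_USERNAME" fork address is hypothetical):

git clone https://github.com/YOUR_USERNAME/tweet-analysis-2023.git ~/Desktop/tweet-analysis-2023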

Set up a virtual environment:

conda create -n tweets-2023 python=3.10

Activate the virtual environment:

conda activate tweets-2023

Install packages:

pip install -r requirements.txt

Services Setup

Google Credentials

Create your own Google APIs project, obtain a JSON credentials file for a service account, and download it to the root directory of this repo, naming it specifically "google-credentials.json".
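As a quick sanity check, you can confirm the downloaded file loads, using the google-auth package's service account helper (a minimal sketch):

from google.oauth2.service_account import Credentials

# load the service account key downloaded to the repo root
creds = Credentials.from_service_account_file("google-credentials.json")
print("Authenticated as:", creds.service_account_email)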

Twitter API Credentials

Obtain Twitter API credentials from the Twitter Developer Portal (i.e. TWITTER_BEARER_TOKEN below). Ideally, obtain research-level access.
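A minimal sketch of fetching tweets with the bearer token, assuming the tweepy package (the repo's own app.twitter_service may be structured differently):

import os
import tweepy

# authenticate with the bearer token from the environment
client = tweepy.Client(bearer_token=os.getenv("TWITTER_BEARER_TOKEN"))

# search recent tweets (hypothetical query)
response = client.search_recent_tweets(query="python", max_results=10)
for tweet in response.data:
    print(tweet.id, tweet.text)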

SendGrid API Credentials

Obtain SendGrid API credentials from the SendGrid website (i.e. SENDGRID_API_KEY below). Create a sender identity and verify it (i.e. SENDER_ADDRESS below).
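A minimal send sketch using the official sendgrid package (the recipient address is hypothetical):

import os
from sendgrid import SendGridAPIClient
from sendgrid.helpers.mail import Mail

message = Mail(
    from_email=os.getenv("SENDER_ADDRESS"),  # your verified sender identity
    to_emails="recipient@example.com",       # hypothetical recipient
    subject="Test from tweet-analysis-2023",
    html_content="<p>Hello from the collection job!</p>",
)
client = SendGridAPIClient(os.getenv("SENDGRID_API_KEY"))
response = client.send(message)
print(response.status_code)  # 202 means the message was accepted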

Google Cloud Storage

If you would like to save files to cloud storage, create a new bucket or gain access to an existing bucket, and set the BUCKET_NAME environment variable accordingly (see environment variable setup below).
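A minimal upload sketch with the google-cloud-storage package (the file paths are hypothetical):

import os
from google.cloud import storage

client = storage.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS
bucket = client.bucket(os.getenv("BUCKET_NAME"))

# upload a local file to the bucket
blob = bucket.blob("exports/tweets.csv")
blob.upload_from_filename("data/tweets.csv")
print("Uploaded to:", blob.public_url)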

Database Setup

You can use either a SQLite database (see the sketch below) or a BigQuery database.
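For local development, the SQLite option needs no cloud setup at all (a minimal sketch; the filename and schema are illustrative only):

import sqlite3

# a local file-based database, created on first connect
conn = sqlite3.connect("tweets_development.db")
conn.execute("CREATE TABLE IF NOT EXISTS tweets (id TEXT PRIMARY KEY, text TEXT, lang TEXT)")
conn.commit()
conn.close()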

BigQuery Setup

If you want to use a BigQuery database, set up a new BigQuery dataset in the respective Google APIs project for each new collection effort. Consider creating two datasets: one for development and one for production.

The DATASET_ADDRESS environment variable will be a namespaced combination of the Google project name and the dataset name (e.g. "my-project.my_dataset_development").
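A minimal sketch of querying that dataset with the google-cloud-bigquery package (the "tweets" table name is hypothetical):

import os
from google.cloud import bigquery

client = bigquery.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS
dataset_address = os.getenv("DATASET_ADDRESS")  # e.g. "my-project.my_dataset_development"

# count rows in a hypothetical "tweets" table
sql = f"SELECT count(*) AS row_count FROM `{dataset_address}.tweets`"
for row in client.query(sql).result():
    print(row.row_count)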

Configuration

Create a new local ".env" file and set environment variables to configure services, as desired:

# this is the ".env" file

#
# GOOGLE APIS
#
# path to the google credentials file you downloaded
GOOGLE_APPLICATION_CREDENTIALS="/Users/path/to/tweet-analysis-2023/google-credentials.json"

#
# GOOGLE BIGQUERY
#
DATASET_ADDRESS="my-project.my_database_name"

#
# GOOGLE CLOUD STORAGE
#
BUCKET_NAME="my-bucket"

#
# TWITTER API
#
TWITTER_BEARER_TOKEN="..."

#
# SENDGRID API
#
SENDGRID_API_KEY="SG.___________"
SENDER_ADDRESS="example@gmail.com"
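The app presumably loads these variables with a package like python-dotenv; a minimal sketch of how a module could pick them up:

import os
from dotenv import load_dotenv

load_dotenv()  # reads the ".env" file in the current directory

DATASET_ADDRESS = os.getenv("DATASET_ADDRESS")
BUCKET_NAME = os.getenv("BUCKET_NAME")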

Usage

Services

Demonstrate the ability to fetch data from BigQuery, as desired:

python -m app.bq_service

Demonstrate the ability to fetch data from the Twitter API, as desired:

python -m app.twitter_service

Demonstrate the ability to send email, as desired:

python -m app.email_service

Demonstrate the ability to connect to cloud storage, as desired:

python -m app.cloud_storage

Jobs

Search:

Streaming:
