Skip to content

InnovateEDU-NYC/google_classroom

 
 

Repository files navigation

googleclassroom

Data pipeline for Google Classroom API to BigQuery

Dependencies:

Getting Started

Setup Environment

  1. Clone this repo
git clone https://github.com/InnovateEDU-NYC/google_classroom.git
  1. Install dependencies
  • Docker can be installed directly from the website at docker.com.
  1. Create .env file with project secrets
# Basic Configuration Info
ACCOUNT_EMAIL=Email of admin account that will be used to pull data.
STUDENT_ORG_UNIT=Name of the Google Admin organizational unit for students (optional — filters student usage reports to that organization)
SCHOOL_YEAR_START=YYYY-MM-DD

# Database variables
DB_DIALECT=The type of database you are using. Current options: mssql, postgres, sqlite, and bigquery
DB_SERVER=
DB=
DB_USER=
DB_PWD=
DB_SCHEMA=
DB_CREDENTIALS=service.json

# (Optional) Data Pulls To Enable. Set to "YES" to include that pull.
# These can be left out in favor of command line arguments.
PULL_USAGE=
PULL_COURSES=
PULL_TOPICS=
PULL_COURSEWORK=
PULL_STUDENTS=
PULL_TEACHERS=
PULL_GUARDIANS=
PULL_SUBMISSIONS=
PULL_GUARDIAN_INVITES=
PULL_ALIASES=
PULL_INVITATIONS=
PULL_ANNOUNCEMENTS=

# (Optional) Batch parameters. Can be configured and changed to optimize performance.
# *_BATCH_SIZE is the number of dates or courses to batch at a time. MAX: 1000
# Lower batch sizes are useful for high volume or slow endpoints to avoid timeouts.
ORG_UNIT_BATCH_SIZE=
USAGE_BATCH_SIZE=
COURSES_BATCH_SIZE=
TOPICS_BATCH_SIZE=
COURSEWORK_BATCH_SIZE=
STUDENTS_BATCH_SIZE=
TEACHERS_BATCH_SIZE=
GUARDIANS_BATCH_SIZE=
SUBMISSIONS_BATCH_SIZE=
GUARDIAN_INVITES_BATCH_SIZE=
ALIASES_BATCH_SIZE=
INVITATIONS_BATCH_SIZE=
ANNOUNCEMENTS_BATCH_SIZE=
PAGE_SIZE=

# Email notification variables
# Set DISABLE_MAILER to "YES" if you do not want email notifications to be sent.
DISABLE_MAILER=YES
SENDER_EMAIL=
SENDER_PWD=
RECIPIENT_EMAIL=
# If using a standard Gmail account you can set these to smtp.gmail.com on port 465
EMAIL_SERVER=
EMAIL_PORT=
  1. Enable APIs in Developer Console
  • Navigate to the API library in the developer console.
  • Search for Google Classroom, and Enable it.
  • Search for Admin SDK, and Enable it.
  1. Create a service account.
  • In the Google Developer Console (console.developers.google.com), go to Credentials.
  • Click on "Create Credentials -> Service Account"
  • Create a name for your service account.
  • Select the "Owner" role for the service account.
  • Create a key, saving the result file as service.json.
  • Check the box for "Enable G Suite Domain-Wide Delegation"
  • Click "Done".
  1. Add scopes for the service account.
  • In the Google Admin Console (admin.google.com), go to Security.
  • Click on "Advanced Settings -> Manage API client access"
  • For the client name, use the Unique ID of the service account.
  • In the API Scopes, add the following scopes and click "Authorize".
https://www.googleapis.com/auth/admin.directory.orgunit,
https://www.googleapis.com/auth/admin.reports.usage.readonly,
https://www.googleapis.com/auth/classroom.announcements,
https://www.googleapis.com/auth/classroom.courses,
https://www.googleapis.com/auth/classroom.coursework.students,
https://www.googleapis.com/auth/classroom.guardianlinks.students,
https://www.googleapis.com/auth/classroom.profile.emails,
https://www.googleapis.com/auth/classroom.rosters,
https://www.googleapis.com/auth/classroom.student-submissions.students.readonly,
https://www.googleapis.com/auth/classroom.topics

Running the job

Build the Docker image

docker build -t google_classroom .

Run the job

docker run --rm -it google_classroom

Run the job using a database on localhost

docker run --rm -it --network host google_classroom

Run the job locally

pipenv install --skip-lock (first time)
pipenv run python main.py

Optional flags will include different types of pulls (can also be done via env variables):

  • --all (for pulling all data)
  • --usage
  • --courses
  • --topics
  • --coursework
  • --students
  • --teachers
  • --guardians
  • --submissions
  • --invites
  • --aliases
  • --invitations
  • --announcements

Use the flag --debug to turn on debug logging. Use the flag --debugfile to save raw json to a file for backup / auditing.

Running Tests

Tests are located in tests,py, and can be run with either of the following commands:

Locally:

pipenv run py.test -s -v tests.py

On Docker:

docker build -t google_classroom .
docker run --rm -it google_classroom --test

When making changes, please run tests to make sure you have not broken anything.

Yearly maintenance

  1. Confirm the org unit ID (used to get Student Usage) in the .env.

Releases

No releases published

Packages

No packages published

Languages

  • Python 99.0%
  • Other 1.0%