opensource-observer/oso

Open Source Observer is a free analytics suite that helps funders measure the impact of open source software contributions to the health of their ecosystem.

opensource.observer

Organization

  • /apps: The OSO apps
    • /docs: documentation (Docusaurus)
    • /frontend: frontend application (Next.js)
  • /warehouse: All code specific to the data warehouse
    • /dbt: dbt configuration
    • /cloudquery-*: cloudquery plugins (there are many)
    • Also contains other tools to manage warehouse pipelines
  • /ops: Our ops related code (mostly terraform)

Frontend Quickstart

Set up and build the frontend

First, make sure the environment variables are set for ./apps/frontend. Take a look at ./apps/frontend/.env.local.example for the complete list.

  • You can either set these yourself (e.g. in CI/CD)
  • or copy the file to .env.local and populate it.
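The copy-and-populate route from the second bullet can be sketched as follows (paths taken from this README; the guard just makes the snippet a no-op if run outside the repo):

```shell
# From the repository root: copy the example env file if it's present.
# -n avoids clobbering an existing .env.local.
if [ -f apps/frontend/.env.local.example ]; then
  cp -n apps/frontend/.env.local.example apps/frontend/.env.local
fi
# then edit apps/frontend/.env.local and fill in each variable
```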

Then, to do a turbo build of all apps, run the following:

pnpm install
pnpm build

The resulting static site can be found in ./build/.

Running the prod server

If you've already run the build, you can serve the built files with:

pnpm serve

Running the frontend dev server

To run a dev server that watches for changes across code and Plasmic, run:

pnpm dev:frontend

Playbooks

For setup and common operations for each subproject, navigate into the respective directory and check out the README.md.

dbt Start

Our datasets are public! If you'd like to use them as opposed to adding to our dbt models, check out our docs!

Setting up

Prerequisites

  • Python >=3.11
  • poetry >= 1.8
    • Install with pipx: pipx install poetry

Install dependencies

From inside the root directory, run poetry to install the dependencies.

$ poetry install

Using the poetry environment

Once installation has completed, you can enter the poetry environment.

$ poetry shell

From here you should have dbt on your path.

$ which dbt

This should return something like opensource-observer/oso/.venv/bin/dbt

Authenticating to BigQuery

If you have write access to the dataset then you can connect to it by setting the opensource_observer profile in dbt. Inside ~/.dbt/profiles.yml (create it if it isn't there), add the following:

opensource_observer:
  outputs:
    production:
      type: bigquery
      dataset: oso
      job_execution_time_seconds: 300
      job_retries: 1
      location: US
      method: oauth
      project: opensource-observer
      threads: 32
    playground:
      type: bigquery
      dataset: oso_playground
      job_execution_time_seconds: 300
      job_retries: 1
      location: US
      method: oauth
      project: opensource-observer
      threads: 32
  # By default we target the playground. It's less costly and also safer to
  # write there while developing.
  target: playground

If you don't have gcloud installed, you'll need to install it as well. The instructions are here.

For macOS users: the instructions can be a bit clunky, so we suggest using Homebrew like this:

$ brew install --cask google-cloud-sdk

Finally, to authenticate to Google, run the following (a browser window will pop up after this, so be sure to come back to the docs after you've completed the login):

$ gcloud auth application-default login

You'll need to do this once an hour. This is the simplest to set up, but it can be a pain as you need to reauthenticate regularly. If you need longer-lived access, you can set up a service account in GCP, but these docs will not cover that for now.

You should now be logged into BigQuery!
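As a quick sanity check, you can look for the application-default credentials file that the login command writes (the path below is gcloud's standard ADC location on Linux/macOS; it differs if CLOUDSDK_CONFIG is set). Running dbt debug from inside the poetry environment is another way to confirm the connection end to end.

```shell
# Sketch: check whether application-default credentials exist locally
ADC="${HOME}/.config/gcloud/application_default_credentials.json"
if [ -f "$ADC" ]; then
  echo "ADC found at $ADC"
else
  echo "No ADC found; run: gcloud auth application-default login"
fi
```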

Setting up VS Code

The Power User for dbt core extension is pretty helpful.

You'll need the path to your poetry environment, which you can get by running

poetry env info --path

Then in VS Code:

  • Install the extension
  • Open the command palette and enter "Python: Select Interpreter"
  • Select "Enter interpreter path..."
  • Enter the path from the poetry command above

Check that you have a little check mark next to "dbt" in the bottom bar.

Usage

Once you've updated any models you can run dbt within the poetry environment by simply calling:

$ dbt run

Note: If you configured the dbt profile as shown in this document, this dbt run will write to the opensource-observer.oso_playground dataset.

It is usually best to target a specific model so runs don't take too long on some of our materializations:

$ dbt run --select {name_of_the_model}
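dbt's node selection syntax also lets you pull a model's dependencies into the run; the model name below is a placeholder, not a real model in this repo:

```shell
# Run a single model (substitute a real model name from the dbt project)
dbt run --select stg_example_model

# Also build everything upstream of it
dbt run --select +stg_example_model

# Or everything downstream of it
dbt run --select stg_example_model+
```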