Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor API data-access pattern to only load what is necessary; use prepared statements #12

Merged
merged 6 commits into from Apr 17, 2024

Conversation

ohaibbq
Copy link

@ohaibbq ohaibbq commented Apr 11, 2024

This PR greatly improves the performance of the emulator API endpoints.

We no longer load the entire BigQuery project (jobs, datasets, tables, etc) on each request. This was slow for a number of reasons as outlined in goccy#294

We also now utilize unformatted SQLite queries. Previously, go-zetasqlite would rewrite the metadata repository queries to use functions like zetasqlite_equals instead of operators like =, which meant that SQLite would need to do full table scans when it could be doing more performant query plans.

Many endpoints now take tens of microseconds to return instead of 100+ms. Table creation takes roughly ~10ms whereas before we were seeing upwards of 300ms.

POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs   {"query": "prettyPrint=false"}
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs took 16.821342ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views       {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views took 287.595µs
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs   {"query": "prettyPrint=false"}
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs took 19.376201ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/c0150a30-f7c6-4bbf-b9e3-a00dd494f14d    {"query": "maxResults=0&location=US&prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/c0150a30-f7c6-4bbf-b9e3-a00dd494f14d took 791.9µs
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/d1e45499-8eb2-4417-bf5d-af0c1266503a    {"query": "maxResults=0&location=US&prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/d1e45499-8eb2-4417-bf5d-af0c1266503a took 645.539µs
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/outliers_staff_count_percent_change_materialized       {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/outliers_staff_count_percent_change_materialized took 1.660463ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/current_supervision_staff_missing_district_materialized        {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/current_supervision_staff_missing_district_materialized took 1.354469ms
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables       {"query": "prettyPrint=false"}
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables took 13.399834ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views       {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views took 298.365µs
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs   {"query": "prettyPrint=false"}
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs took 16.414274ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/55f3fc1a-05c2-41a5-88dd-0f9f592f4a1c    {"query": "maxResults=0&location=US&prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/55f3fc1a-05c2-41a5-88dd-0f9f592f4a1c took 458.882µs
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/outliers_staff_count_percent_change_errors_materialized        {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/outliers_staff_count_percent_change_errors_materialized took 1.560456ms

We are now able to do a run of recidiviz.tools.deploy.deploy_empty_test_views creating & materializing 1,400 views
against the emulator in ~35 seconds.

@ohaibbq ohaibbq merged commit 61539ca into candidate/rb20240328 Apr 17, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants