[DOCS] Document BigQuery test dataset configuration #3273

Merged
Changes from all commits
1 change: 1 addition & 0 deletions docs/changelog.md
@@ -4,6 +4,7 @@ title: Changelog

 ### Develop
 * [BUGFIX] Fix deprecation warning for importing from collections (#3228)
+* [DOCS] Document BigQuery test dataset configuration (#3273)

 ### 0.13.28
 * [FEATURE] Implement ColumnPairValuesInSet metric for PandasExecutionEngine
7 changes: 3 additions & 4 deletions docs/contributing/contributing_test.md
@@ -23,17 +23,16 @@ In order to run BigQuery tests, you first need to go through the following steps

 1. [Select or create a Cloud Platform project](https://console.cloud.google.com/project).
 2. [Setup Authentication](https://googleapis.dev/python/google-api-core/latest/auth.html).
-3. In your project, [create a BigQuery dataset](https://cloud.google.com/bigquery/docs/datasets) named `test_ci` and [set the dataset default table expiration](https://cloud.google.com/bigquery/docs/updating-datasets#table-expiration) to `.1` days
+3. In your project, [create a BigQuery dataset](https://cloud.google.com/bigquery/docs/datasets) (e.g. named `test_ci`) and [set the dataset default table expiration](https://cloud.google.com/bigquery/docs/updating-datasets#table-expiration) to `.1` days

-After setting up authentication, you can run with your project using the environment variable `GE_TEST_BIGQUERY_PROJECT`, e.g.
+After setting up authentication, you can run with your project using the environment variables `GE_TEST_BIGQUERY_PROJECT` and `GE_TEST_BIGQUERY_DATASET`, e.g.

 ```bash
 GE_TEST_BIGQUERY_PROJECT=<YOUR_GOOGLE_CLOUD_PROJECT> \
+GE_TEST_BIGQUERY_DATASET=test_ci \
 pytest tests/test_definitions/test_expectations_cfe.py --bigquery --no-spark --no-postgresql
 ```

-Note that if you prefer to use a different dataset besides "test_ci", you can specify a different dataset with `GE_TEST_BIGQUERY_DATASET`.

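The two environment variables shown above combine into the SQLAlchemy URL that the test helper builds (`bigquery://<project>/<dataset>`). A minimal sketch of that string construction only — the helper name is illustrative and no real BigQuery connection is made:

```python
import os


def bigquery_connection_url() -> str:
    """Build the SQLAlchemy-style BigQuery URL from the test env vars."""
    project = os.getenv("GE_TEST_BIGQUERY_PROJECT")
    dataset = os.getenv("GE_TEST_BIGQUERY_DATASET")
    if not project or not dataset:
        # Fail fast rather than leaking None into the connection string.
        raise ValueError(
            "GE_TEST_BIGQUERY_PROJECT and GE_TEST_BIGQUERY_DATASET must both be set"
        )
    return f"bigquery://{project}/{dataset}"


os.environ["GE_TEST_BIGQUERY_PROJECT"] = "my-project"
os.environ["GE_TEST_BIGQUERY_DATASET"] = "test_ci"
print(bigquery_connection_url())  # bigquery://my-project/test_ci
```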
### Writing unit and integration tests

Production code in Great Expectations must be thoroughly tested. In general, we insist on unit tests for all branches of every method, including likely error states. Most new feature contributions should include several unit tests. Contributions that modify or extend existing features should include a test of the new behavior.
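As a concrete illustration of the "all branches, including likely error states" guideline, here is a sketch of a unit test covering both the happy path and the error branch of an environment-variable check, in the `pytest` style this repo uses (`required_env` is a stand-in helper, not Great Expectations code):

```python
import os

import pytest


def required_env(name: str) -> str:
    """Return the value of an environment variable, or raise if unset/empty."""
    value = os.getenv(name)
    if not value:
        raise ValueError(f"Environment Variable {name} is required")
    return value


def test_required_env_present(monkeypatch):
    # Happy path: the variable is set and its value is returned.
    monkeypatch.setenv("GE_TEST_BIGQUERY_DATASET", "test_ci")
    assert required_env("GE_TEST_BIGQUERY_DATASET") == "test_ci"


def test_required_env_missing(monkeypatch):
    # Error branch: the variable is absent, so a ValueError is raised.
    monkeypatch.delenv("GE_TEST_BIGQUERY_DATASET", raising=False)
    with pytest.raises(ValueError):
        required_env("GE_TEST_BIGQUERY_DATASET")
```

`monkeypatch` keeps the environment mutation scoped to each test, so the suite does not leak state between cases.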
9 changes: 4 additions & 5 deletions docs_rtd/contributing/testing.rst
@@ -41,20 +41,19 @@ In order to run BigQuery tests, you first need to go through the following steps

 1. `Select or create a Cloud Platform project.`_
 2. `Setup Authentication.`_
-3. `In your project, create a BigQuery dataset named "test_ci"`_ and `set the dataset default table expiration to .1 days`_
+3. `In your project, create a BigQuery dataset (e.g. named "test_ci")`_ and `set the dataset default table expiration to .1 days`_

 .. _Select or create a Cloud Platform project.: https://console.cloud.google.com/project
 .. _Setup Authentication.: https://googleapis.dev/python/google-api-core/latest/auth.html
-.. _`In your project, create a BigQuery dataset named "test_ci"`: https://cloud.google.com/bigquery/docs/datasets
+.. _`In your project, create a BigQuery dataset (e.g. named "test_ci")`: https://cloud.google.com/bigquery/docs/datasets
 .. _`set the dataset default table expiration to .1 days`: https://cloud.google.com/bigquery/docs/updating-datasets#table-expiration

-After setting up authentication, you can run with your project using the environment variable `GE_TEST_BIGQUERY_PROJECT`, e.g.
+After setting up authentication, you can run with your project using the environment variables `GE_TEST_BIGQUERY_PROJECT` and `GE_TEST_BIGQUERY_DATASET`, e.g.

 .. code-block::

-    GE_TEST_BIGQUERY_PROJECT=<YOUR_GOOGLE_CLOUD_PROJECT> pytest tests/test_definitions/test_expectations_cfe.py --bigquery --no-spark --no-postgresql -k bigquery
+    GE_TEST_BIGQUERY_PROJECT=<YOUR_GOOGLE_CLOUD_PROJECT> GE_TEST_BIGQUERY_DATASET=test_ci pytest tests/test_definitions/test_expectations_cfe.py --bigquery --no-spark --no-postgresql -k bigquery

-Note that if you prefer to use a different dataset besides "test_ci", you can specify a different dataset with `GE_TEST_BIGQUERY_DATASET`.

Writing unit and integration tests
----------------------------------
9 changes: 7 additions & 2 deletions great_expectations/self_check/util.py
@@ -2006,10 +2006,15 @@ def _create_bigquery_engine() -> Engine:
     gcp_project = os.getenv("GE_TEST_BIGQUERY_PROJECT")
     if not gcp_project:
         raise ValueError(
-            "Environment Variable GE_TEST_BIGQUERY_PROJECT is required to run expectation tests"
+            "Environment Variable GE_TEST_BIGQUERY_PROJECT is required to run BigQuery expectation tests"
         )
     return create_engine(f"bigquery://{gcp_project}/{_bigquery_dataset()}")


 def _bigquery_dataset() -> str:
-    return os.getenv("GE_TEST_BIGQUERY_DATASET")
+    dataset = os.getenv("GE_TEST_BIGQUERY_DATASET")
+    if not dataset:
+        raise ValueError(
+            "Environment Variable GE_TEST_BIGQUERY_DATASET is required to run BigQuery expectation tests"
+        )
+    return dataset
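The motivation for the new guard in `_bigquery_dataset` is that `os.getenv` returns `None` when the variable is unset, which previously leaked into the connection string. A standalone sketch contrasting the old and new behavior (not Great Expectations code; the project name is a placeholder):

```python
import os

os.environ.pop("GE_TEST_BIGQUERY_DATASET", None)

# Old behavior: None silently leaks into a malformed URL.
dataset = os.getenv("GE_TEST_BIGQUERY_DATASET")
url = f"bigquery://my-project/{dataset}"
print(url)  # bigquery://my-project/None


# New behavior: fail fast with an actionable message.
def bigquery_dataset() -> str:
    dataset = os.getenv("GE_TEST_BIGQUERY_DATASET")
    if not dataset:
        raise ValueError(
            "Environment Variable GE_TEST_BIGQUERY_DATASET is required to run BigQuery expectation tests"
        )
    return dataset


try:
    bigquery_dataset()
except ValueError as exc:
    print(exc)
```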