docs: add code samples for Jupyter/IPython magics #1013
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,34 @@ | ||
IPython Magics for BigQuery | ||
=========================== | ||
|
||
To use these magics, you must first register them. Run the ``%load_ext`` magic | ||
in a Jupyter notebook cell. | ||
|
||
.. code:: | ||
|
||
    %load_ext google.cloud.bigquery | ||
|
||
This makes the ``%%bigquery`` magic available. | ||
|
||
Code Samples | ||
------------ | ||
|
||
Running a query: | ||
|
||
.. literalinclude:: ./samples/magics/query.py | ||
   :dedent: 4 | ||
   :start-after: [START bigquery_jupyter_query] | ||
   :end-before: [END bigquery_jupyter_query] | ||
|
||
Running a parameterized query: | ||
|
||
.. literalinclude:: ./samples/magics/query_params_scalars.py | ||
   :dedent: 4 | ||
   :start-after: [START bigquery_jupyter_query_params_scalars] | ||
   :end-before: [END bigquery_jupyter_query_params_scalars] | ||
|
||
API Reference | ||
------------- | ||
|
||
.. automodule:: google.cloud.bigquery.magics.magics | ||
   :members: |
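The `literalinclude` directives in this file pull only the tagged region out of each sample, then strip the leading indentation. The sketch below approximates that selection in plain Python to show what ends up in the rendered docs. It is not Sphinx's implementation; the function name `extract_region`, the tag `hypothetical_region`, and the sample text are all made up for illustration.

```python
def extract_region(text, start_after, end_before, dedent=0):
    """Return the lines strictly between two marker lines.

    Rough approximation of the ``:start-after:``, ``:end-before:`` and
    ``:dedent:`` options; not Sphinx's actual code.
    """
    selected = []
    inside = False
    for line in text.splitlines():
        if not inside:
            # Skip everything up to and including the start marker line.
            if start_after in line:
                inside = True
            continue
        if end_before in line:
            # Stop before the end marker line.
            break
        # Drop up to `dedent` leading spaces from the line.
        prefix = line[:dedent]
        leading = len(prefix) - len(prefix.lstrip(" "))
        selected.append(line[leading:])
    return "\n".join(selected)


# Hypothetical sample file content, indented as it would be inside a function.
SAMPLE = """\
def show_query():
    # [START hypothetical_region]
    %%bigquery
    SELECT 1 AS x
    # [END hypothetical_region]
"""

print(
    extract_region(
        SAMPLE,
        "[START hypothetical_region]",
        "[END hypothetical_region]",
        dedent=4,
    )
)
```

With `dedent=4`, the two lines between the markers come out flush left, which is the form a reader would paste into a notebook cell.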
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Copyright 2021 Google LLC | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# https://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
# Copyright 2021 Google LLC | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# https://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
|
||
def strip_region_tags(sample_text): | ||
    """Remove blank lines and region tags from sample text""" | ||
    magic_lines = [ | ||
        line for line in sample_text.split("\n") if len(line) > 0 and "# [" not in line | ||
    ] | ||
    return "\n".join(magic_lines) |
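To make the helper's behavior concrete, here is a small standalone run. The function body is copied from the diff above; the tag name `hypothetical_tag` and the query are invented for the example.

```python
def strip_region_tags(sample_text):
    """Remove blank lines and region tags from sample text"""
    magic_lines = [
        line for line in sample_text.split("\n") if len(line) > 0 and "# [" not in line
    ]
    return "\n".join(magic_lines)


# Hypothetical sample in the same shape as the ones used in the tests.
sample = """
# [START hypothetical_tag]
%%bigquery
SELECT 1 AS x
# [END hypothetical_tag]
"""

cleaned = strip_region_tags(sample)
print(cleaned)
```

After stripping, `%%bigquery` is the first line of the text, which matters because a cell magic has to start the cell. Two limitations are worth noting: any line containing `# [` is dropped, not only region tags, and lines that contain only whitespace are kept because their length is non-zero.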
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# Copyright 2021 Google LLC | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# https://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
import pytest | ||
|
||
interactiveshell = pytest.importorskip("IPython.terminal.interactiveshell") | ||
tools = pytest.importorskip("IPython.testing.tools") | ||
|
||
|
||
@pytest.fixture(scope="session") | ||
def ipython(): | ||
    config = tools.default_config() | ||
    config.TerminalInteractiveShell.simple_prompt = True | ||
    shell = interactiveshell.TerminalInteractiveShell.instance(config=config) | ||
    return shell | ||
|
||
|
||
@pytest.fixture(autouse=True) | ||
def ipython_interactive(ipython): | ||
    """Activate IPython's builtin hooks | ||
|
||
    for the duration of the test scope. | ||
    """ | ||
    with ipython.builtin_trap: | ||
        yield ipython |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,147 @@ | ||
# Copyright 2018 Google Inc. All Rights Reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
"""All of the samples used in the Jupyter notebooks tutorial. | ||
|
||
Written as a test to save on boilerplate, since this sample has to simulate | ||
This seems like a good workaround for this kind of doc, but I would be interested in converting to a notebook, as writing a sample as a test seems counterintuitive to maintain and fairly different from other samples of this type.
+1 on eventually converting to a notebook and integrating into a notebook->docs pipeline |
||
running code from Jupyter notebook cells. | ||
""" | ||
|
||
|
||
import pytest | ||
|
||
from . import _helpers | ||
|
||
IPython = pytest.importorskip("IPython") | ||
matplotlib = pytest.importorskip("matplotlib") | ||
|
||
# Ignore semicolon lint warning because semicolons are used in notebooks | ||
# flake8: noqa E703 | ||
|
||
|
||
def test_jupyter_tutorial(ipython): | ||
matplotlib.use("agg") | ||
ip = IPython.get_ipython() | ||
ip.extension_manager.load_extension("google.cloud.bigquery") | ||
|
||
sample = """ | ||
# [START bigquery_jupyter_magic_gender_by_year] | ||
%%bigquery | ||
SELECT | ||
source_year AS year, | ||
COUNT(is_male) AS birth_count | ||
FROM `bigquery-public-data.samples.natality` | ||
GROUP BY year | ||
ORDER BY year DESC | ||
LIMIT 15 | ||
# [END bigquery_jupyter_magic_gender_by_year] | ||
""" | ||
result = ip.run_cell(_helpers.strip_region_tags(sample)) | ||
result.raise_error() # Throws an exception if the cell failed. | ||
|
||
sample = """ | ||
# [START bigquery_jupyter_magic_gender_by_year_var] | ||
%%bigquery total_births | ||
SELECT | ||
source_year AS year, | ||
COUNT(is_male) AS birth_count | ||
FROM `bigquery-public-data.samples.natality` | ||
GROUP BY year | ||
ORDER BY year DESC | ||
LIMIT 15 | ||
# [END bigquery_jupyter_magic_gender_by_year_var] | ||
""" | ||
result = ip.run_cell(_helpers.strip_region_tags(sample)) | ||
result.raise_error() # Throws an exception if the cell failed. | ||
|
||
assert "total_births" in ip.user_ns # verify that variable exists | ||
total_births = ip.user_ns["total_births"] | ||
# [START bigquery_jupyter_plot_births_by_year] | ||
total_births.plot(kind="bar", x="year", y="birth_count") | ||
# [END bigquery_jupyter_plot_births_by_year] | ||
|
||
sample = """ | ||
# [START bigquery_jupyter_magic_gender_by_weekday] | ||
%%bigquery births_by_weekday | ||
SELECT | ||
wday, | ||
SUM(CASE WHEN is_male THEN 1 ELSE 0 END) AS male_births, | ||
SUM(CASE WHEN is_male THEN 0 ELSE 1 END) AS female_births | ||
FROM `bigquery-public-data.samples.natality` | ||
WHERE wday IS NOT NULL | ||
GROUP BY wday | ||
ORDER BY wday ASC | ||
# [END bigquery_jupyter_magic_gender_by_weekday] | ||
""" | ||
result = ip.run_cell(_helpers.strip_region_tags(sample)) | ||
result.raise_error() # Throws an exception if the cell failed. | ||
|
||
assert "births_by_weekday" in ip.user_ns # verify that variable exists | ||
births_by_weekday = ip.user_ns["births_by_weekday"] | ||
# [START bigquery_jupyter_plot_births_by_weekday] | ||
births_by_weekday.plot(x="wday") | ||
# [END bigquery_jupyter_plot_births_by_weekday] | ||
|
||
# [START bigquery_jupyter_import_and_client] | ||
from google.cloud import bigquery | ||
|
||
client = bigquery.Client() | ||
# [END bigquery_jupyter_import_and_client] | ||
|
||
# [START bigquery_jupyter_query_plurality_by_year] | ||
sql = """ | ||
SELECT | ||
plurality, | ||
COUNT(1) AS count, | ||
year | ||
FROM | ||
`bigquery-public-data.samples.natality` | ||
WHERE | ||
NOT IS_NAN(plurality) AND plurality > 1 | ||
GROUP BY | ||
plurality, year | ||
ORDER BY | ||
count DESC | ||
""" | ||
df = client.query(sql).to_dataframe() | ||
df.head() | ||
# [END bigquery_jupyter_query_plurality_by_year] | ||
|
||
# [START bigquery_jupyter_plot_plurality_by_year] | ||
pivot_table = df.pivot(index="year", columns="plurality", values="count") | ||
pivot_table.plot(kind="bar", stacked=True, figsize=(15, 7)) | ||
# [END bigquery_jupyter_plot_plurality_by_year] | ||
|
||
# [START bigquery_jupyter_query_births_by_gestation] | ||
sql = """ | ||
SELECT | ||
gestation_weeks, | ||
COUNT(1) AS count | ||
FROM | ||
`bigquery-public-data.samples.natality` | ||
WHERE | ||
NOT IS_NAN(gestation_weeks) AND gestation_weeks <> 99 | ||
GROUP BY | ||
gestation_weeks | ||
ORDER BY | ||
gestation_weeks | ||
""" | ||
df = client.query(sql).to_dataframe() | ||
# [END bigquery_jupyter_query_births_by_gestation] | ||
|
||
# [START bigquery_jupyter_plot_births_by_gestation] | ||
ax = df.plot(kind="bar", x="gestation_weeks", y="count", figsize=(15, 7)) | ||
ax.set_title("Count of Births by Gestation Weeks") | ||
ax.set_xlabel("Gestation Weeks") | ||
ax.set_ylabel("Count") | ||
# [END bigquery_jupyter_plot_births_by_gestation] |
I'm aware that this uses the natality dataset, which I think is on our list to migrate away from, but this is just to move the sample into a different directory from here: https://github.com/googleapis/python-bigquery/blob/main/samples/snippets/jupyter_tutorial_test.py
Samples are used here: https://cloud.google.com/bigquery/docs/visualize-jupyter
Possible we'll want to make this a regular notebook instead and put it here: https://github.com/GoogleCloudPlatform/bigquery-notebooks
I think we can go ahead with existing samples using natality as long as we circle back to it eventually.
I think that this would be a good candidate for a notebook when we have the embedded notebook pipeline set up. (Possibly Q4.)
@loferris Perhaps I should remove this file from this PR? I was moving it to this directory mostly so I can clean up the `requirements.txt` file in `samples/snippets`, but if we think we'll migrate this to a proper notebook soon-ish, then I can wait to clean that up until then.
I think it's fine as is!