docs: add code samples for Jupyter/IPython magics #1013

Merged
merged 7 commits into from Oct 29, 2021
Changes from 3 commits
29 changes: 29 additions & 0 deletions docs/magics.rst
@@ -1,5 +1,34 @@
IPython Magics for BigQuery
===========================

To use these magics, you must first register them. Run the ``%load_ext`` magic
in a Jupyter notebook cell.

.. code::

    %load_ext google.cloud.bigquery

This makes the ``%%bigquery`` magic available.

Code Samples
------------

Running a query:

.. literalinclude:: ./samples/magics/query.py
   :dedent: 4
   :start-after: [START bigquery_jupyter_query]
   :end-before: [END bigquery_jupyter_query]

Running a parameterized query:

.. literalinclude:: ./samples/magics/query_params_scalars.py
   :dedent: 4
   :start-after: [START bigquery_jupyter_query_params_scalars]
   :end-before: [END bigquery_jupyter_query_params_scalars]

API Reference
-------------

.. automodule:: google.cloud.bigquery.magics.magics
   :members:
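
The parameterized sample file itself is not among the lines shown here. As a minimal sketch, assuming ``--params`` accepts a JSON object on the magic line (the form used in the docstring example ``%%bigquery --params {"num": 17}``), the text of such a cell could be assembled as follows. ``build_bigquery_cell`` is a hypothetical helper written for illustration; it is not part of the library.

```python
import json


def build_bigquery_cell(sql, params=None):
    """Hypothetical helper: return the text of a %%bigquery cell.

    Assumes --params takes a JSON object, as in the docstring example.
    """
    header = "%%bigquery"
    if params:
        # json.dumps produces the JSON object form shown in the example.
        header += " --params " + json.dumps(params)
    return header + "\n" + sql


cell_text = build_bigquery_cell("SELECT @num AS num", {"num": 17})
print(cell_text)
```

The resulting string has the same shape as the cell text that the tutorial test passes to ``ip.run_cell``.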
66 changes: 0 additions & 66 deletions google/cloud/bigquery/magics/magics.py
@@ -14,15 +14,6 @@

"""IPython Magics

To use these magics, you must first register them. Run the ``%load_ext`` magic
in a Jupyter notebook cell.

.. code::

    %load_ext google.cloud.bigquery

This makes the ``%%bigquery`` magic available.

.. function:: %%bigquery

    IPython cell magic to run a query and display the result as a DataFrame
@@ -85,63 +76,6 @@
    .. note::
        All queries run using this magic will run using the context
        :attr:`~google.cloud.bigquery.magics.Context.credentials`.

    Examples:
        The following examples can be run in an IPython notebook after loading
        the bigquery IPython extension (see ``In[1]``) and setting up
        Application Default Credentials.

        .. code-block:: none

            In [1]: %load_ext google.cloud.bigquery

            In [2]: %%bigquery
               ...: SELECT name, SUM(number) as count
               ...: FROM `bigquery-public-data.usa_names.usa_1910_current`
               ...: GROUP BY name
               ...: ORDER BY count DESC
               ...: LIMIT 3

            Out[2]:   name    count
               ...: -------------------
               ...: 0 James   4987296
               ...: 1 John    4866302
               ...: 2 Robert  4738204

            In [3]: %%bigquery df --project my-alternate-project --verbose
               ...: SELECT name, SUM(number) as count
               ...: FROM `bigquery-public-data.usa_names.usa_1910_current`
               ...: WHERE gender = 'F'
               ...: GROUP BY name
               ...: ORDER BY count DESC
               ...: LIMIT 3
            Executing query with job ID: bf633912-af2c-4780-b568-5d868058632b
            Query executing: 2.61s
            Query complete after 2.92s

            In [4]: df

            Out[4]:   name       count
               ...: ----------------------
               ...: 0 Mary       3736239
               ...: 1 Patricia   1568495
               ...: 2 Elizabeth  1519946

            In [5]: %%bigquery --params {"num": 17}
               ...: SELECT @num AS num

            Out[5]:   num
               ...: -------
               ...: 0  17

            In [6]: params = {"num": 17}

            In [7]: %%bigquery --params $params
               ...: SELECT @num AS num

            Out[7]:   num
               ...: -------
               ...: 0  17
"""

from __future__ import print_function
3 changes: 2 additions & 1 deletion noxfile.py
@@ -186,8 +186,9 @@ def snippets(session):
     session.run(
         "py.test",
         "samples",
-        "--ignore=samples/snippets",
+        "--ignore=samples/magics",
         "--ignore=samples/geography",
+        "--ignore=samples/snippets",
         *session.posargs,
     )

13 changes: 13 additions & 0 deletions samples/magics/__init__.py
@@ -0,0 +1,13 @@
# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
21 changes: 21 additions & 0 deletions samples/magics/_helpers.py
@@ -0,0 +1,21 @@
# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


def strip_region_tags(sample_text):
    """Remove blank lines and region tags from sample text"""
    magic_lines = [
        line for line in sample_text.split("\n") if len(line) > 0 and "# [" not in line
    ]
    return "\n".join(magic_lines)
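
As a quick illustration of what this helper does, the sketch below repeats `strip_region_tags` and applies it to a small made-up cell. The sample text and the `bigquery_example` tag name are assumptions chosen for demonstration; they do not come from the PR.

```python
def strip_region_tags(sample_text):
    """Remove blank lines and region tags from sample text."""
    magic_lines = [
        line for line in sample_text.split("\n") if len(line) > 0 and "# [" not in line
    ]
    return "\n".join(magic_lines)


# Hypothetical cell text; the region tag name is made up for this example.
sample = """
# [START bigquery_example]
%%bigquery
SELECT 17 AS num
# [END bigquery_example]
"""

cleaned = strip_region_tags(sample)
print(cleaned)
```

Only empty lines and lines containing `# [` are dropped, so the cell magic line and the query survive unchanged. A line that holds only whitespace has a non-zero length and is therefore kept.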
36 changes: 36 additions & 0 deletions samples/magics/conftest.py
@@ -0,0 +1,36 @@
# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import pytest

interactiveshell = pytest.importorskip("IPython.terminal.interactiveshell")
tools = pytest.importorskip("IPython.testing.tools")


@pytest.fixture(scope="session")
def ipython():
    config = tools.default_config()
    config.TerminalInteractiveShell.simple_prompt = True
    shell = interactiveshell.TerminalInteractiveShell.instance(config=config)
    return shell


@pytest.fixture(autouse=True)
def ipython_interactive(ipython):
    """Activate IPython's builtin hooks

    for the duration of the test scope.
    """
    with ipython.builtin_trap:
        yield ipython
147 changes: 147 additions & 0 deletions samples/magics/jupyter_tutorial_test.py
@@ -0,0 +1,147 @@
# Copyright 2018 Google Inc. All Rights Reserved.
Contributor Author

I'm aware that this uses the natality dataset, which I think is on our list to migrate away from, but this is just to move the sample into a different directory from here: https://github.com/googleapis/python-bigquery/blob/main/samples/snippets/jupyter_tutorial_test.py

Samples are used here: https://cloud.google.com/bigquery/docs/visualize-jupyter

Possibly we'll want to make this a regular notebook instead and put it here: https://github.com/GoogleCloudPlatform/bigquery-notebooks

Contributor

I think we can go ahead with existing samples using natality as long as we circle back to it eventually.

I think that this would be a good candidate for a notebook when we have the embedded notebook pipeline set up. (Possibly Q4.)

Contributor Author

@loferris Perhaps I should remove this file from this PR? I was moving it to this directory mostly so I can clean up the requirements.txt file in samples/snippets, but if we think we'll migrate this to a proper notebook soon-ish, then I can wait to clean that up until then.

Contributor

I think it's fine as is!

#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""All of the samples used in the Jupyter notebooks tutorial.

Written as a test to save on boilerplate, since this sample has to simulate
Contributor

This seems like a good workaround for this kind of doc, but I would be interested in converting to a notebook as writing a sample as a test seems counterintuitive to maintain and fairly different from other samples of this type.

Contributor Author

+1 on eventually converting to a notebook and integrating into a notebook->docs pipeline

running code from Jupyter notebook cells.
"""


import pytest

from . import _helpers

IPython = pytest.importorskip("IPython")
matplotlib = pytest.importorskip("matplotlib")

# Ignore semicolon lint warning because semicolons are used in notebooks
# flake8: noqa E703


def test_jupyter_tutorial(ipython):
    matplotlib.use("agg")
    ip = IPython.get_ipython()
    ip.extension_manager.load_extension("google.cloud.bigquery")

    sample = """
    # [START bigquery_jupyter_magic_gender_by_year]
    %%bigquery
    SELECT
        source_year AS year,
        COUNT(is_male) AS birth_count
    FROM `bigquery-public-data.samples.natality`
    GROUP BY year
    ORDER BY year DESC
    LIMIT 15
    # [END bigquery_jupyter_magic_gender_by_year]
    """
    result = ip.run_cell(_helpers.strip_region_tags(sample))
    result.raise_error()  # Throws an exception if the cell failed.

    sample = """
    # [START bigquery_jupyter_magic_gender_by_year_var]
    %%bigquery total_births
    SELECT
        source_year AS year,
        COUNT(is_male) AS birth_count
    FROM `bigquery-public-data.samples.natality`
    GROUP BY year
    ORDER BY year DESC
    LIMIT 15
    # [END bigquery_jupyter_magic_gender_by_year_var]
    """
    result = ip.run_cell(_helpers.strip_region_tags(sample))
    result.raise_error()  # Throws an exception if the cell failed.

    assert "total_births" in ip.user_ns  # verify that variable exists
    total_births = ip.user_ns["total_births"]
    # [START bigquery_jupyter_plot_births_by_year]
    total_births.plot(kind="bar", x="year", y="birth_count")
    # [END bigquery_jupyter_plot_births_by_year]

    sample = """
    # [START bigquery_jupyter_magic_gender_by_weekday]
    %%bigquery births_by_weekday
    SELECT
        wday,
        SUM(CASE WHEN is_male THEN 1 ELSE 0 END) AS male_births,
        SUM(CASE WHEN is_male THEN 0 ELSE 1 END) AS female_births
    FROM `bigquery-public-data.samples.natality`
    WHERE wday IS NOT NULL
    GROUP BY wday
    ORDER BY wday ASC
    # [END bigquery_jupyter_magic_gender_by_weekday]
    """
    result = ip.run_cell(_helpers.strip_region_tags(sample))
    result.raise_error()  # Throws an exception if the cell failed.

    assert "births_by_weekday" in ip.user_ns  # verify that variable exists
    births_by_weekday = ip.user_ns["births_by_weekday"]
    # [START bigquery_jupyter_plot_births_by_weekday]
    births_by_weekday.plot(x="wday")
    # [END bigquery_jupyter_plot_births_by_weekday]

    # [START bigquery_jupyter_import_and_client]
    from google.cloud import bigquery

    client = bigquery.Client()
    # [END bigquery_jupyter_import_and_client]

    # [START bigquery_jupyter_query_plurality_by_year]
    sql = """
    SELECT
        plurality,
        COUNT(1) AS count,
        year
    FROM
        `bigquery-public-data.samples.natality`
    WHERE
        NOT IS_NAN(plurality) AND plurality > 1
    GROUP BY
        plurality, year
    ORDER BY
        count DESC
    """
    df = client.query(sql).to_dataframe()
    df.head()
    # [END bigquery_jupyter_query_plurality_by_year]

    # [START bigquery_jupyter_plot_plurality_by_year]
    pivot_table = df.pivot(index="year", columns="plurality", values="count")
    pivot_table.plot(kind="bar", stacked=True, figsize=(15, 7))
    # [END bigquery_jupyter_plot_plurality_by_year]

    # [START bigquery_jupyter_query_births_by_gestation]
    sql = """
    SELECT
        gestation_weeks,
        COUNT(1) AS count
    FROM
        `bigquery-public-data.samples.natality`
    WHERE
        NOT IS_NAN(gestation_weeks) AND gestation_weeks <> 99
    GROUP BY
        gestation_weeks
    ORDER BY
        gestation_weeks
    """
    df = client.query(sql).to_dataframe()
    # [END bigquery_jupyter_query_births_by_gestation]

    # [START bigquery_jupyter_plot_births_by_gestation]
    ax = df.plot(kind="bar", x="gestation_weeks", y="count", figsize=(15, 7))
    ax.set_title("Count of Births by Gestation Weeks")
    ax.set_xlabel("Gestation Weeks")
    ax.set_ylabel("Count")
    # [END bigquery_jupyter_plot_births_by_gestation]