Writing a Query Snippet

Shahin Saadati edited this page Apr 12, 2022 · 10 revisions

Query Snippets are a great way to uncover hidden patterns and insights in a dataset, while showing different features and capabilities of BigQuery.

The steps to add a query snippet to the repository are pretty straightforward:

Choose a Dataset

Make sure the dataset you choose is tabular and onboarded by our team. There should be a directory available for that dataset here.

Write your Query Snippet

Your query snippet can be anything you want, as long as it shows something interesting about the dataset. So, a simple SELECT * FROM table is not an ideal snippet. Some ideas to explore may be:

  • Correlation/Causation Analysis
  • Time Series Analysis
  • BQML
  • Trends

Development Tips

  • Write and edit your query in BigQuery Console.
  • For large queries, please use Table Sampling.
  • Once your query is ready, save it in a .sql file and add some comments above your query (see this example).

Provide Metadata

For each query snippet, you'll need to provide some metadata in an artifact.yaml file. Here is a sample of what the file should look like:

artifact:
  title: The Title for the Query Snippet
  description: A brief description of what the query snippet shows.
  tier: free
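
A key written without a space after the colon is easy to miss and changes how YAML reads the line, so a quick local check before submitting can help. The sketch below uses only the Python standard library and is not a full YAML parser: it assumes the flat layout of the sample above. The helper name `check_artifact` is hypothetical, and treating `title`, `description`, and `tier` as required is an assumption drawn from the sample, not a documented repository rule.

```python
# Keys taken from the sample artifact.yaml; treating all three as required is an assumption.
REQUIRED_KEYS = ("title", "description", "tier")


def check_artifact(text: str) -> list[str]:
    """Return a list of problems found in the text of an artifact.yaml file."""
    problems = []
    # Ignore blank lines and comment lines.
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.lstrip().startswith("#")]
    if not lines or lines[0].strip() != "artifact:":
        problems.append("first line should be 'artifact:'")

    found = {}
    for ln in lines[1:]:
        # Split at the first "colon + space"; a line without it is not a valid key/value pair here.
        key, sep, value = ln.strip().partition(": ")
        if not sep:
            problems.append(f"expected 'key: value' (with a space after the colon): {ln.strip()!r}")
            continue
        found[key] = value.strip()

    for key in REQUIRED_KEYS:
        if not found.get(key):
            problems.append(f"missing or empty field: {key}")
    return problems
```

Calling `check_artifact(open("artifact.yaml").read())` returns an empty list when the file matches the sample layout, and one message per problem otherwise.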

[Optional] Detailed Description

Optionally, provide an overview.md file with more details about your query snippet and place it alongside the snippet's other files.

Submit the Code

Your code needs to be submitted for review via a Pull Request. Here is a guideline that shows how to do it.

Example

Let's assume you wrote a query snippet for the austin_bikeshare dataset that finds the high bike traffic areas in Austin, and saved it in a file called austin_bike_traffic.sql.

For each dataset, every query snippet should be placed in its own new subdirectory under .../docs/queries. For this example, create a subdirectory called bike_traffic and place both austin_bike_traffic.sql and artifact.yaml inside it. The final tree structure should look something like this:

└── datasets
    └── austin_bikeshare
        └── docs
            └── queries
                └── bike_traffic
                    ├── austin_bike_traffic.sql
                    └── artifact.yaml
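
The layout for this example can also be created with a few lines of Python. This is only a sketch: the helper name `scaffold_snippet` is hypothetical, the path segments are read off the example tree and assume that austin_bikeshare sits inside datasets relative to the repository root, and the metadata values are the placeholders from the sample artifact.yaml.

```python
from pathlib import Path


def scaffold_snippet(repo_root: Path, dataset: str, snippet_dir: str, sql_name: str) -> Path:
    """Create the query-snippet directory with placeholder files and return its path."""
    # Path segments follow the example tree (an assumption); adjust if the repository differs.
    target = repo_root / "datasets" / dataset / "docs" / "queries" / snippet_dir
    target.mkdir(parents=True, exist_ok=True)

    sql_file = target / sql_name
    if not sql_file.exists():
        # Placeholder only: replace with the commented query written in the BigQuery Console.
        sql_file.write_text(
            "-- TODO: describe the query here, then add the query below.\n",
            encoding="utf-8",
        )

    artifact = target / "artifact.yaml"
    if not artifact.exists():
        # Placeholder metadata copied from the sample; edit before submitting.
        artifact.write_text(
            "artifact:\n"
            "  title: The Title for the Query Snippet\n"
            "  description: A brief description of what the query snippet shows.\n"
            "  tier: free\n",
            encoding="utf-8",
        )
    return target
```

For the example here, the call would be `scaffold_snippet(Path("."), "austin_bikeshare", "bike_traffic", "austin_bike_traffic.sql")`, run from the repository root. Existing files are left untouched, so running it twice does not overwrite a query you have already written.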