Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for multi statement select #1047

Open
aczire opened this issue Mar 22, 2024 · 1 comment
Open

Support for multi statement select #1047

aczire opened this issue Mar 22, 2024 · 1 comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API.

Comments

@aczire
Copy link

aczire commented Mar 22, 2024

Please find below minimum reproducible code to run a muti statement select. The problem here is, Big Query runs the select statement as multiple statements, and the result contains dataset from the the first dataset only, everything else is discarded.

You can run this in a Big Query console and see the difference.

How can I get the result for all the statements executed.

from google.cloud import bigquery


def run_query(sql_query):
    client = bigquery.Client()
    query_job = client.query(sql_query)

    # Wait for the job to complete
    results = query_job.result()
    print("Query complete!")
    print(f"Results: {results.total_rows}")

    for row in results:
        print(row)


if __name__ == "__main__":

    sql_query = """
        BEGIN
            FOR record IN 
            (
            SELECT num FROM UNNEST(GENERATE_ARRAY(1, 5)) AS num
            )
            DO
            WITH numbers AS (
                SELECT num
                FROM UNNEST(GENERATE_ARRAY(1, 100)) AS num  -- Adjust the range as needed
            )
            SELECT num
            FROM numbers
            LIMIT 10;
            END FOR;
        END
    """

    # Execute the demo query
    run_query(sql_query)


Note: Th equery is just for demonstrative purpose of multi statement execution only and the correctness of its functionality is never intended.

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. label Mar 22, 2024
@aczire
Copy link
Author

aczire commented Mar 22, 2024

It seems the child statements are child jobs and the result for them are available from the child job result object.

Now the question is how to make it work with python-bigquery-sqlalchemy?

    client = bigquery.Client()
    query_job = client.query(sql_query)

    # Wait for the job to complete
    results = query_job.result()
    print("Query complete!")
    print(f"Results: {results.total_rows}")

    for row in results:
        print(row)

    print("Parent Job ID: ", query_job.job_id)

    all_statements = client.list_jobs(parent_job=query_job.job_id)

    for child_job in all_statements:
        job: QueryJob = child_job
        print(f"{job.job_id}, {job.state}")
        results = query_job.result()
        print(f"Child job results: {results.total_rows}")

        for row in results:
            print(row)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API.
Projects
None yet
Development

No branches or pull requests

1 participant