Switch to new BigQuery dataset access API #378

Closed
spbnick opened this issue Mar 7, 2023 · 16 comments · Fixed by #403
Labels
good second issue Good for newcomers who're ready to try something harder

Comments

@spbnick
Collaborator

spbnick commented Mar 7, 2023

We started receiving the following warning from Google Cloud APIs:

/home/nkondras/projects/github.com/kernelci/kcidb/kcidb/db/bigquery/v04_00.py:71: PendingDeprecationWarning: Client.dataset is deprecated and will be removed in a future version. Use a string like 'my_project.my_dataset' or a cloud.google.bigquery.DatasetReference object, instead.
self.dataset_ref = self.client.dataset(dataset_name)

Update the code to silence the warning and prepare for the removal of the old API.

This will very likely require us to make the name of the Google Cloud project mandatory in "database specifications" in the kcidb.db package, and in all the code using it. In that case, notify the package users of the impending change (or ask @spbnick to do that), change the code to require it, and then switch to the new API, getting rid of the warning.
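
A minimal sketch of what the switch might look like, assuming the database specification takes the form PROJECT.DATASET. That format and the helper names `split_dataset_spec` and `make_dataset_ref` are hypothetical and not taken from kcidb. `DatasetReference` is the class named in the warning; it is imported lazily so the parsing part runs without google-cloud-bigquery installed.

```python
def split_dataset_spec(spec):
    """
    Split a "PROJECT.DATASET" string into its two parts.

    Hypothetical helper: the spec format is an assumption, not kcidb's API.
    Raises ValueError if either part is missing.
    """
    # Dataset names cannot contain dots, so the last dot is the separator
    project, sep, dataset = spec.rpartition(".")
    if not sep or not project or not dataset:
        raise ValueError(
            f"Expected 'PROJECT.DATASET', got {spec!r}: "
            "the project name is mandatory"
        )
    return project, dataset


def make_dataset_ref(spec):
    """Build a DatasetReference without calling the deprecated Client.dataset()."""
    # Imported here so split_dataset_spec() works without the library installed
    from google.cloud import bigquery
    project, dataset = split_dataset_spec(spec)
    return bigquery.DatasetReference(project, dataset)
```

Under these assumptions, the line from the warning could become something like `self.dataset_ref = make_dataset_ref(spec)`, with no call to `Client.dataset` left.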

@spbnick added the "good second issue" label Mar 7, 2023
@shivam-Purohit

hey, can I work on this?

@spbnick
Collaborator Author

spbnick commented Mar 9, 2023

Sure, go ahead, Shivam!

@shivam-Purohit

Hey, where do I check the changes I make? I still do not know where to look for debugging purposes.

@spbnick
Collaborator Author

spbnick commented Mar 10, 2023

Can you register a Google Cloud account, and create a project for experiments? It's free for the first six months, but you would need a credit card. You can't really run this code easily without access to BigQuery, and that's only available in Google Cloud.

@shivam-Purohit

I guess that will be required. I tried a couple of methods, but sadly they didn't work. Also, I do not own a credit card.

@spbnick
Collaborator Author

spbnick commented Mar 13, 2023

OK, I created a Google Cloud project for you, called kcidb-shivam. Create a service account with "owner" role for the project, and then create and download a key for it. Try to use the Submitter Guide for help and as reference. Then you should be able to deploy KCIDB to that project.

@shivam-Purohit

I have downloaded the GCP key and added the GCP credentials. What should I do next? Should I create a playground, as specified in the Submitter Guide?

@spbnick
Collaborator Author

spbnick commented Mar 16, 2023

Just try deploying to your new project. Make sure you've exported the GOOGLE_APPLICATION_CREDENTIALS variable, pointing to your credentials file, then execute something like this: ./cloud deploy kcidb-shivam "" 1 -v --smtp-mocked --test --heavy-asserts
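
Before deploying, it may help to confirm that the variable really points at a usable key. The following is only a sketch of such a pre-flight check; the function name `check_credentials` is hypothetical and not part of KCIDB. It relies on service account key files being JSON with a "type" of "service_account" and a "project_id" field.

```python
import json
import os


def check_credentials(environ=os.environ):
    """
    Return the project ID stored in the service account key file that
    GOOGLE_APPLICATION_CREDENTIALS points to (None if the field is absent).
    Raise RuntimeError if the variable or the file is unusable.
    """
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    try:
        with open(path, encoding="utf-8") as key_file:
            key = json.load(key_file)
    except (OSError, ValueError) as exc:
        # ValueError covers both invalid JSON and undecodable bytes
        raise RuntimeError(f"Cannot read key file {path!r}: {exc}") from exc
    if not isinstance(key, dict) or key.get("type") != "service_account":
        raise RuntimeError(f"{path!r} is not a service account key")
    return key.get("project_id")
```

Calling `check_credentials()` with no arguments reads the real environment; for the project in this thread, the returned ID would be expected to match the project name passed to ./cloud.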

@shivam-Purohit

@spbnick is it deployed? What should be my next course of action?
[Screenshot: ubuntu (Running) - Oracle VM VirtualBox, 18-03-2023 13_08_31]

@spbnick
Collaborator Author

spbnick commented Mar 20, 2023

Looks like it did deploy fine, indeed! Congrats 😀

OK, now you need to reproduce the warning. IIRC, the warnings should pop up when you'd run some database tests. Try this, and see if you get the warning the issue is talking about:

./cloud env kcidb-shivam "" 1 --smtp-mocked --test --heavy-asserts -- \
    pytest --tb=native --verbosity=2 kcidb/test_db.py

This runs the database tests emulating the cloud environment, and using the clean/empty test databases in the deployment shown in your screenshot.

Let me know if you don't get the warnings, and I'll try to find a way to reproduce them.

If you get them, read the Google Cloud docs regarding that warning and the interface in particular. Search the web and browse the BigQuery Python API docs for details.

I was able to find the issue deprecating the old interface: googleapis/google-cloud-python#8989

It links to what seems to be a change updating the samples to the new interface: googleapis/python-bigquery#309

And here's the documentation of the class we're supposed to be using instead: https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.dataset.Dataset

@shivam-Purohit

Do the commands you specified need to be run separately or together? I ran the command and it gave me this error:


(env) shivam@ubuntu:~/Desktop/project-cloudkcidb/kcidb$ ./cloud env kcidb-shivam "" 1 --smtp-mocked --test --heavy-asserts --     pytest --tb=native --verbosity=2 kcidb/test_db.py
Invalid number of positional arguments
Usage: cloud env [OPTION...] PROJECT NAMESPACE [VERSION]
Output environment YAML used by KCIDB Cloud Functions.

If I run it separately, all the tests pass.

kcidb/test_db.py::test_schemas_main <- ../../project/kcidb/kcidb/test_db.py PASSED                                                                              [  7%]
kcidb/test_db.py::test_init_main <- ../../project/kcidb/kcidb/test_db.py PASSED                                                                                 [ 15%]
kcidb/test_db.py::test_cleanup_main <- ../../project/kcidb/kcidb/test_db.py PASSED                                                                              [ 23%]
kcidb/test_db.py::test_empty_main <- ../../project/kcidb/kcidb/test_db.py PASSED                                                                                [ 30%]
kcidb/test_db.py::test_dump_main <- ../../project/kcidb/kcidb/test_db.py PASSED                                                                                 [ 38%]
kcidb/test_db.py::test_query_main <- ../../project/kcidb/kcidb/test_db.py PASSED                                                                                [ 46%]
kcidb/test_db.py::test_load_main <- ../../project/kcidb/kcidb/test_db.py PASSED                                                                                 [ 53%]
kcidb/test_db.py::test_get_last_modified[sqlite:/tmp/tmpw_mzyns1.sqlite3] <- ../../project/kcidb/kcidb/test_db.py PASSED                                        [ 61%]
kcidb/test_db.py::test_all_fields[sqlite:/tmp/tmpw_mzyns1.sqlite3] <- ../../project/kcidb/kcidb/test_db.py PASSED                                               [ 69%]
kcidb/test_db.py::test_query[sqlite:/tmp/tmpw_mzyns1.sqlite3] <- ../../project/kcidb/kcidb/test_db.py PASSED                                                    [ 76%]
kcidb/test_db.py::test_empty[sqlite:/tmp/tmpw_mzyns1.sqlite3] <- ../../project/kcidb/kcidb/test_db.py PASSED                                                    [ 84%]
kcidb/test_db.py::test_upgrade[sqlite:/tmp/tmpqf109q2j.sqlite3] <- ../../project/kcidb/kcidb/test_db.py PASSED                                                  [ 92%]
kcidb/test_db.py::test_cleanup[sqlite:/tmp/tmpqf109q2j.sqlite3] <- ../../project/kcidb/kcidb/test_db.py PASSED      

@spbnick
Collaborator Author

spbnick commented Mar 20, 2023

Do the commands you specified need to be run separately or together? I ran the command and it gave me this error:

Ah, yep, I made a mistake in the command. Please replace env with shell there.

If I run it separately, all the tests pass.

The tests should pass in any case, but if you run them with deployed test databases, they will also run against BigQuery, and should produce the warnings.
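
To check for the warning programmatically rather than by reading the test summary, Python's standard `warnings` module can record it. The sketch below uses a stand-in function, `old_dataset`, which only imitates the deprecated call; it is hypothetical and does not touch the BigQuery client. `collect_pending_deprecations` is likewise an illustrative name.

```python
import warnings


def old_dataset(name):
    """Stand-in for a deprecated call; emits the same warning category."""
    warnings.warn(
        "Client.dataset is deprecated and will be removed in a future version.",
        PendingDeprecationWarning,
        stacklevel=2,
    )
    return name


def collect_pending_deprecations(func, *args, **kwargs):
    """Call func and return the PendingDeprecationWarning messages it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # PendingDeprecationWarning is ignored by default; record every one
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return [
        str(w.message) for w in caught
        if issubclass(w.category, PendingDeprecationWarning)
    ]
```

Once the code no longer calls the deprecated method, the returned list should be empty. With pytest, passing `-W error::PendingDeprecationWarning` should have a similar effect by turning the warning into a test failure.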

@shivam-Purohit

Got a ton of these, including the BigQuery one. I am happy the error occurred 😀

../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:121
  /home/shivam/.local/lib/python3.10/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
    warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)

../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: 14 warnings
  /home/shivam/.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
  /home/shivam/.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google.cloud')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2349
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2349
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2349
../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2349
  /home/shivam/.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2349: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(parent)

../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
  /home/shivam/.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google.logging')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../../.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870
  /home/shivam/.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google.iam')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../../.local/lib/python3.10/site-packages/google/rpc/__init__.py:20
  /home/shivam/.local/lib/python3.10/site-packages/google/rpc/__init__.py:20: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google.rpc')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    pkg_resources.declare_namespace(__name__)

kcidb/db/bigquery/v04_00.py:73
kcidb/db/bigquery/v04_00.py:73
kcidb/db/bigquery/v04_00.py:73
kcidb/db/bigquery/v04_00.py:73
  /home/shivam/Desktop/pre-commit/kcidb/kcidb/db/bigquery/v04_00.py:73: PendingDeprecationWarning: Client.dataset is deprecated and will be removed in a future version. Use a string like 'my_project.my_dataset' or a cloud.google.bigquery.DatasetReference object, instead.
    self.dataset_ref = self.client.dataset(dataset_name)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

@spbnick
Collaborator Author

spbnick commented Mar 20, 2023

Awesome! Please ignore the "Deprecated call to pkg_resources.declare_namespace('google.rpc')", this is not our problem, it seems.

Good luck figuring out what to do with the warning. Don't hesitate to reach out here in the comments, or on Slack, with any questions you might have.
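
If the third-party noise gets in the way, it can be filtered out while keeping the relevant warning visible. This is a sketch using only the standard `warnings` module; `warn_all` merely simulates the two kinds of messages from the output above, and both function names are hypothetical.

```python
import warnings

# Matches the start of the third-party messages seen in the output above
NOISE = (
    r"(Deprecated call to `pkg_resources\.declare_namespace"
    r"|pkg_resources is deprecated)"
)


def warn_all():
    """Emit one noisy third-party-style warning and one relevant one."""
    warnings.warn(
        "Deprecated call to `pkg_resources.declare_namespace('google')`.",
        DeprecationWarning,
    )
    warnings.warn("Client.dataset is deprecated", PendingDeprecationWarning)


def relevant_warnings(func):
    """Run func, hiding the pkg_resources noise, and return what remains."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Filters added later are checked first, so this overrides "always"
        warnings.filterwarnings(
            "ignore", message=NOISE, category=DeprecationWarning
        )
        func()
    return [str(w.message) for w in caught]
```

Here `relevant_warnings(warn_all)` leaves only the `Client.dataset` message. pytest offers a comparable `filterwarnings` configuration option, should the project prefer to set this up once for the whole test suite.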

@spbnick
Collaborator Author

spbnick commented Mar 20, 2023

Please post the output of ./cloud env kcidb-shivam "" 1 --smtp-mocked --test --heavy-asserts on Slack (yes, env is correct this time) and I'll give you a faster command to run the tests reproducing the problem.

@shivam-Purohit

I have shared the output. I will try figuring out things in the meantime.

2 participants