Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: convert StreamedResultSet to Pandas Dataframe #226

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

Hardikr23
Copy link

@Hardikr23 Hardikr23 commented Feb 8, 2021

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #155 🦕

@Hardikr23 Hardikr23 requested a review from a team as a code owner February 8, 2021 07:22
@google-cla
Copy link

google-cla bot commented Feb 8, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no This human has *not* signed the Contributor License Agreement. label Feb 8, 2021
@product-auto-label product-auto-label bot added the api: spanner Issues related to the googleapis/python-spanner API. label Feb 8, 2021
@Hardikr23 Hardikr23 changed the title Feature to Convert StreamedResultSet to Pandas Dataframe feat :Convert StreamedResultSet to Pandas Dataframe Feb 8, 2021
@Hardikr23 Hardikr23 marked this pull request as draft February 8, 2021 07:29
@Hardikr23
Copy link
Author

@tswast Please review the code and let us know your feedback

@Hardikr23 Hardikr23 marked this pull request as ready for review February 8, 2021 07:33
@larkee larkee requested a review from tswast February 15, 2021 11:20
@larkee larkee changed the title feat :Convert StreamedResultSet to Pandas Dataframe feat: convert StreamedResultSet to Pandas Dataframe Feb 17, 2021
Copy link
Contributor

@tswast tswast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing the following:

  • Unit tests -- show a DataFrame can be created successfully with 0 rows, 1 rows, many rows
  • System tests -- show that a DataFrame can be created from a real query. (Also with the emulator)
  • Update setup.py to document the "extras" (optional dependency) on pandas, including which versions of pandas are supported.

google/cloud/spanner_v1/_pandas_helpers.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/_pandas_helpers.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/_pandas_helpers.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/_pandas_helpers.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/_pandas_helpers.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/_pandas_helpers.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/streamed.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/streamed.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/_pandas_helpers.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/_pandas_helpers.py Outdated Show resolved Hide resolved
@google-cla
Copy link

google-cla bot commented Feb 26, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Feb 26, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Feb 26, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@Hardikr23
Copy link
Author

@googlebot I fixed it.

@google-cla
Copy link

google-cla bot commented Mar 2, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

Copy link
Contributor

@tswast tswast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nits re: license headers

tests/system/test_system_pandasDf.py Outdated Show resolved Hide resolved
tests/unit/test__pandas_helpers.py Outdated Show resolved Hide resolved
Co-authored-by: Tim Swast <swast@google.com>
@google-cla
Copy link

google-cla bot commented Mar 8, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

1 similar comment
@google-cla
Copy link

google-cla bot commented Mar 8, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

Co-authored-by: Tim Swast <swast@google.com>
@google-cla
Copy link

google-cla bot commented Mar 8, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

1 similar comment
@google-cla
Copy link

google-cla bot commented Mar 8, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Mar 10, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Mar 18, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

# See the License for the specific language governing permissions and
# limitations under the License.

from google.cloud import spanner_v1_mod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are failing because this import is incorrect.

Suggested change
from google.cloud import spanner_v1_mod
from google.cloud import spanner_v1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Changed

if v.name == "TIMESTAMP" or v.name == "DATE":
try:
df[k] = df[k].to_datetime()
except Exception as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please use a more specific exception? Exception is way too broad.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please add some comments about why we have these two different methods of converting to datetime. I don't understand it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also also, please add back in the "localize" logic to the TIMESTAMP type. TIMESTAMP is intended to represent UTC datetimes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also also also, we need tests for the datetime64[ns, UTC] data type. I want to make sure the data type is the one we expect in the test (not just that we got a row and it didn't error)

For example, in pandas-gbq we test that the dataframe we got matches one that we manually created and localized in the test.

https://github.com/pydata/pandas-gbq/blob/46c579ac21879b431c8568b49e68624f4a5ea25e/tests/unit/test_timestamp.py

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also also, please add back in the "localize" logic to the TIMESTAMP type. TIMESTAMP is intended to represent UTC datetimes.

added

TABLE_NAME_1 = 'Functional_Alltypes'
COLUMNS_1 = ['id','bool_col','date','float_col','string_col','timestamp_col']
VALUES_1 = [
[1,True,'2016-02-09',2.2,'David','2002-02-10T15:30:00.45Z'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation looks off. Can you run nox -s blacken ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

VALUES = [[1, 'Alice'], [2, 'Bob']]


TABLE_NAME_1 = 'Functional_Alltypes'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the name from the BigQuery Ibis tests. Please use the existing all_types table from

CREATE TABLE all_types (
if possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Optional] In fact, we may be able to import test data from here:

ALL_TYPES_ROWDATA = (

For example, the DB-API tests import some useful test fixtures.

from .test_system import (
CREATE_INSTANCE,
EXISTING_INSTANCES,
INSTANCE_ID,
USE_EMULATOR,
_list_instances,
Config,
)

import pytest

# for referrence
TABLE_NAME = 'testTable'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reusing the contacts table would be more appropriate than creating our own.

CREATE TABLE contacts (

google/cloud/spanner_v1/_pandas_helpers.py Show resolved Hide resolved
"""Creates a pandas DataFrame of all rows in the result set

:rtype: pandas.DataFrame

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@google-cla
Copy link

google-cla bot commented Mar 25, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Mar 25, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@tswast tswast added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Mar 25, 2021
@google-cla
Copy link

google-cla bot commented Mar 25, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@tswast tswast added kokoro:force-run Add this label to force Kokoro to re-run the tests. and removed kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Mar 25, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Mar 25, 2021
@google-cla
Copy link

google-cla bot commented Mar 25, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

1 similar comment
@google-cla
Copy link

google-cla bot commented Mar 25, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Apr 7, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2021
@google-cla
Copy link

google-cla bot commented Apr 7, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@tswast tswast added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 7, 2021
@google-cla
Copy link

google-cla bot commented Apr 8, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@yaozhang09
Copy link

any update on this feature? ^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the googleapis/python-spanner API. cla: no This human has *not* signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feature request: pandas connector
8 participants