Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue161/BigQuery BIGNUMERIC column error #162

Merged
merged 22 commits into from Jan 6, 2021

Conversation

dhercher
Copy link
Collaborator

@dhercher dhercher commented Dec 9, 2020

The BIGNUMERIC datatype causes an error to be thrown in Ibis when getting schema. We can easily resolve this with the code supplied here to add Ibis support for BIGNUMERIC.

However, ideally the field should be defined as a numeric type. This would allow numeric aggregations and functions o be applied in Ibis. When we attempt to use dt.Numeric(38,9) then we get an error from the BigQuery client when executing the query.

We are suggesting a temporary dt.string type to unblock schema conversion. However, this does not resolve the issue which will still crop up

data_validation/clients.py Outdated Show resolved Hide resolved
.coveragerc Outdated Show resolved Hide resolved
data_validation/clients.py Outdated Show resolved Hide resolved
third_party/ibis/ibis_addon/datatypes.py Outdated Show resolved Hide resolved
third_party/ibis/ibis_addon/datatypes.py Show resolved Hide resolved
# TODO(dhercher): DataType customizations shoould be moved to Ibis/third_party
# BigQuery BIGNUMERIC support needs to be pushed to Ibis
bigquery._pandas_helpers.BQ_TO_ARROW_SCALARS["BIGNUMERIC"] = pyarrow.string
_DTYPE_TO_IBIS_TYPE["BIGNUMERIC"] = dt.string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you describe why BIGNUMERIC and NUMERIC are handled differently?

Also, I suspect you want float64 because Ibis has more aggregate functions implemented there than they do for the decimal data types? Could you comment on why we need to remap it here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An error occurs with float when it is attempted to be read to pandas. I believe this needs to wait until pyarrow. and full support is added too

@dhercher dhercher requested a review from tswast January 3, 2021 22:10
@classmethod
def to_ibis_from_I2(cls, col_data, return_ibis_type=True):
if return_ibis_type:
return dt.int8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the difference between to_ibis_from_I1 and to_ibis_from_I2

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I1 and I2 are the teradata datatypes for different int values, both are encapsulated by int8 though

@dhercher dhercher merged commit 357927e into develop Jan 6, 2021
@dhercher dhercher deleted the issue161-support-bigquery-bignumeric branch January 6, 2021 22:43
@freedomofnet freedomofnet added Beta priority: p0 Highest priority. Critical issue. Will be fixed prior to next release. labels Jan 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: p0 Highest priority. Critical issue. Will be fixed prior to next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants