
Still problem with struct type? #140

Closed
imartynetz opened this issue Jun 18, 2020 · 4 comments
imartynetz commented Jun 18, 2020

This line blocks sending DataFrames with struct-type columns through `load_table_from_dataframe`, even though the GitHub issue linked there says the problem is fixed in newer pyarrow versions. I hit this when uploading data pulled from MongoDB with pymongo, creating a schema with a nested dictionary in one of the columns. If I comment those lines out, the code runs, but the upload to BigQuery then fails with a 500 error.

`pyarrow=='0.18.0.dev434'` (I tried other versions; same problem)
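
For context, a minimal repro sketch of the failing path, assuming hypothetical project, dataset, table, and column names (none of these are from the original report):

```python
import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()

# One column holds nested dictionaries, e.g. documents pulled from MongoDB.
df = pd.DataFrame(
    {
        "id": [1, 2],
        "payload": [
            {"user": {"name": "alice", "age": 30}},
            {"user": {"name": "bob", "age": 25}},
        ],
    }
)

# Explicit schema: the nested dict maps to a RECORD field with sub-fields.
schema = [
    bigquery.SchemaField("id", "INTEGER"),
    bigquery.SchemaField(
        "payload",
        "RECORD",
        fields=[
            bigquery.SchemaField(
                "user",
                "RECORD",
                fields=[
                    bigquery.SchemaField("name", "STRING"),
                    bigquery.SchemaField("age", "INTEGER"),
                ],
            ),
        ],
    ),
]

job_config = bigquery.LoadJobConfig(schema=schema)

# With client/pyarrow versions current at the time of this issue, this call
# either raised for struct columns or the backend returned a 500.
job = client.load_table_from_dataframe(
    df, "my-project.my_dataset.my_table", job_config=job_config
)
job.result()
```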

product-auto-label bot added the api: bigquery (Issues related to the googleapis/python-bigquery API.) label on Jun 18, 2020
plamut added the type: question (Request for information or clarification. Not an issue.) label on Jun 19, 2020
imartynetz (Author) commented

I managed to upload the Mongo data using pymongo, with one of the columns containing a nested dictionary. Initially I still got `google.api_core.exceptions.InternalServerError: 500 An internal error occurred and the request could not be completed. Error: 3144498`, so I tried sending in chunks of 15k rows (see the sketch below). The DataFrame has shape (90755 x 41); it sent 45k rows, and the rest still fails with the same 500 error. I tried Parquet, and `job_config.source_format = bigquery.SourceFormat.AVRO`, both with the same result. Any hint what's causing this error?
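
A sketch of that chunked attempt; the table name is hypothetical and the stand-in DataFrame only mimics the row count, not the 41 real columns:

```python
import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()
df = pd.DataFrame({"id": range(90755)})  # stand-in for the real (90755 x 41) frame

chunk_size = 15000
for start in range(0, len(df), chunk_size):
    chunk = df.iloc[start : start + chunk_size]
    job = client.load_table_from_dataframe(
        chunk, "my-project.my_dataset.my_table"  # hypothetical table name
    )
    # result() surfaces the 500 as google.api_core.exceptions.InternalServerError.
    job.result()
```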

HemangChothani self-assigned this on Jul 2, 2020
HemangChothani (Contributor) commented

@imartynetz The issue was related to pyarrow and has been resolved in version 0.17.0, but that release doesn't support Python 2, which bigquery still supports (though it will drop Python 2 support very soon). PR #146 addresses this and will be merged once bigquery ends Python 2 support.

HemangChothani (Contributor) commented

Closing this issue as resolved by PR #146

imartynetz (Author) commented Aug 4, 2020

@HemangChothani Sorry, I forgot to answer; I'm using Python 3.5.4 for this. I was able to upload the nested dictionary if I created a local .parquet file, submitted it to Google Cloud Storage, and then used that file in Storage to create the BigQuery table. Going directly through load_table_from_dataframe did create the table, but with all the nested dictionaries empty.
One other curious thing: a simple dictionary uploads fine, but a nested dictionary doesn't; I first need to wrap the nested dictionary in a list, and then it uploads (see the sketch below).
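
A sketch of that workaround, assuming hypothetical bucket and table names, and wrapping the nested dictionary in a list per the observation above:

```python
import pandas as pd
from google.cloud import bigquery, storage

df = pd.DataFrame(
    {
        "id": [1],
        # Wrapping the nested dict in a list is what made it uploadable.
        "payload": [[{"user": {"name": "alice", "age": 30}}]],
    }
)

# 1. Write the DataFrame to a local Parquet file (pandas delegates to pyarrow).
df.to_parquet("data.parquet")

# 2. Upload the file to a Cloud Storage bucket.
storage.Client().bucket("my-bucket").blob("data.parquet").upload_from_filename(
    "data.parquet"
)

# 3. Create the BigQuery table from the storage file instead of the DataFrame.
client = bigquery.Client()
job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.PARQUET)
job = client.load_table_from_uri(
    "gs://my-bucket/data.parquet",
    "my-project.my_dataset.my_table",
    job_config=job_config,
)
job.result()
```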
