
Utilize Bigquery Storage API #71

Open
smith-m opened this issue Feb 23, 2019 · 2 comments

Comments

smith-m commented Feb 23, 2019

With the beta officially announced today, there is an opportunity to leverage the BigQuery Storage API for reading tables from BigQuery. In theory it should have lower latency than GCS dumps, and it can also take advantage of predicate pushdown and column projection while remaining Avro-based.

Are there any plans to integrate the Storage API with this or another Spark DataFrame project?
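For context on what "predicate pushdown and column projection" buy here, a minimal, library-free sketch (the sample data and function names are invented for illustration, not part of the Storage API): the server applies the row filter and returns only the requested columns, so far less data crosses the wire than with a full GCS export.

```python
# Toy "table" standing in for a BigQuery table (rows are dicts here;
# the real Storage API streams Avro/Arrow records).
rows = [
    {"user_id": 1, "country": "DE", "clicks": 10, "payload": "..."},
    {"user_id": 2, "country": "US", "clicks": 3,  "payload": "..."},
    {"user_id": 3, "country": "DE", "clicks": 7,  "payload": "..."},
]

def server_side_read(table, selected_fields, row_restriction):
    """Mimic a read session: filter rows, then project columns, at the source."""
    for row in table:
        if row_restriction(row):                        # predicate pushdown
            yield {f: row[f] for f in selected_fields}  # column projection

result = list(server_side_read(rows, ["user_id", "clicks"],
                               lambda r: r["country"] == "DE"))
# -> [{"user_id": 1, "clicks": 10}, {"user_id": 3, "clicks": 7}]
```

Note that the wide `payload` column never leaves the "server" at all, which is where the latency win over a full table export comes from.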

smith-m commented Feb 23, 2019

cloud.google.com/bigquery/docs/reference/storage/

samelamin (Owner) commented:

This is really interesting, thanks for sharing.

Yeah, it'll need a separate branch while it's in beta, but it's certainly worth looking into.

Or better yet, exposing it via an option while keeping GCS dumps as the default.

I'll have a look over the coming weeks
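A hypothetical sketch of the option-based approach suggested above, with the existing GCS-dump export kept as the default; the option name `readMethod` and its values are assumptions, not the connector's actual API.

```python
# Default read path stays the current GCS-export behaviour; the Storage API
# is opt-in via an explicit option.
DEFAULT_READ_METHOD = "gcs-export"

def choose_read_path(options):
    """Pick the read path from user-supplied options (names are illustrative)."""
    method = options.get("readMethod", DEFAULT_READ_METHOD)
    if method not in ("gcs-export", "storage-api"):
        raise ValueError(f"unknown readMethod: {method}")
    return method

choose_read_path({})                              # -> "gcs-export"
choose_read_path({"readMethod": "storage-api"})   # -> "storage-api"
```

Keeping the default unchanged means existing users see no behaviour change while the beta API is trialled behind the option.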
