Expand databases benchmarked #3

Closed
russellpierce opened this issue Oct 9, 2017 · 3 comments

@russellpierce

Athena and Redshift Spectrum are BigQuery-like. One wonders how they would stack up in this comparison.

@georgewfraser
Contributor

I definitely want to add these. The tricky part of this is that you have to make a bunch of choices when you put the data in S3. Even if we just decide to go with Parquet, we have to decide how big to make the files and the blocks within the files. So it will take some fiddling around to be sure we're being fair to Athena/Spectrum.

@russellpierce
Author

I agree. An apples-to-apples comparison is hard, if not impossible. Maybe I'm atypical, but as far as I'm concerned, using the defaults or a (very) limited parameter search is reasonable. Unlike with Redshift's sort keys, where Amazon makes a big deal about optimizing for your query type, Athena and Spectrum both seem to be advertised more or less as 'drop on S3 and go'. It seems fair to take them more or less at their word for the benchmark, or at most to run a very limited search of the parameter space.
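
To make "a very limited search of the parameter space" concrete, here is a rough sketch of what a small grid over the two Parquet choices mentioned above (file size and block, i.e. row-group, size) might look like. Everything in it is an assumption for illustration: the `layout_grid` helper, the 100 GiB dataset size, and the candidate sizes are hypothetical, not values from this benchmark or recommendations from AWS.

```python
from itertools import product
from math import ceil

MiB = 1024 ** 2
GiB = 1024 ** 3


def layout_grid(dataset_bytes, file_sizes, row_group_sizes):
    """Enumerate (file size, row-group size) pairs and the counts they imply.

    All sizes are in bytes. This only does the arithmetic; it does not
    write any Parquet files.
    """
    candidates = []
    for file_bytes, group_bytes in product(file_sizes, row_group_sizes):
        if group_bytes > file_bytes:
            continue  # a row group cannot be larger than the file holding it
        candidates.append({
            "file_bytes": file_bytes,
            "row_group_bytes": group_bytes,
            # number of files needed to hold the whole dataset
            "files": ceil(dataset_bytes / file_bytes),
            # row groups in a full-size file (the last file may hold fewer)
            "row_groups_per_file": ceil(file_bytes / group_bytes),
        })
    return candidates


# Hypothetical search: two file sizes x two row-group sizes for 100 GiB.
grid = layout_grid(100 * GiB, [128 * MiB, 1 * GiB], [64 * MiB, 128 * MiB])
for c in grid:
    print(c["file_bytes"] // MiB, "MiB files,",
          c["row_group_bytes"] // MiB, "MiB row groups:",
          c["files"], "files,", c["row_groups_per_file"], "row groups each")
```

Even a 2 x 2 grid like this means loading and querying the dataset four times per engine, which is why keeping the search small matters.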

@mike-weinberg
Collaborator

@russellpierce as I mentioned in #13, Athena and Spectrum have been shown elsewhere to be so sensitive to file types that getting their performance to approach that of normal warehouses requires significant work. Even after all that optimization is done, they are at best on par with BigQuery (as you were hinting at when you said "BigQuery-like").

For that reason I will be closing this issue (for now!)
