Implement feedback on BQWT #231

Open
nagarkar opened this issue Apr 8, 2019 · 0 comments
Labels
BQWorkloadTester issues related to the bigquery workload tester

Comments

Collaborator

nagarkar commented Apr 8, 2019

First of all, thanks for writing this tool. I tried it today while working on the performance section of the BigQuery book we are writing. Here's how I used it:
https://github.com/GoogleCloudPlatform/bigquery-oreilly-book/blob/master/07_perf/time_bqwt.sh

A few suggestions:

(1) Please provide a way for the user to specify how many times to run each test. Typically, we want to average the measurements and report the 10th and 90th percentiles. (This repetition count is separate from the concurrency level.)
(2) The samples on GitHub show non-standard SQL, but the code sets legacySQL to false. Standard SQL is the right choice, so please update the genomics query in the sample config. This took a while to chase down because it turns out that I also had to escape the backslash in the project name.
(3) The output dir setting in the sample config was tricky to get right. It appears that different jobs get started in different directories, so specifying a relative directory (as in the sample) didn't work. I had to specify an absolute directory. (Again, an absolute directory is the right choice, but please update the sample.)
(4) The results JSON would be more helpful if it aggregated the per-run results and reported arrays of wallTime, runTime, etc. for each concurrency level. This would make it easier to import the data into plotting libraries.
(5) The query file had to be a single line of text. This is quite unfriendly, since real queries tend to be quite long. I used tr to retain readability, but it would be good if you treated each queryFile as a whole file instead of reading it line by line.
(6) There is a typo in the second word of this log message: "Finished bechmarking phase" (it should read "benchmarking").
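
To make suggestions (1) and (4) concrete, here is a rough Python sketch of the kind of post-processing I have in mind. It is not the tool's code, and the record shape is an assumption: I am pretending each run is a dict with a hypothetical "concurrency" key plus the wallTime and runTime fields, which may not match the actual results JSON. The function names are made up for illustration. It needs Python 3.8+ for statistics.quantiles and at least two samples per level.

```python
import json
import statistics
from collections import defaultdict


def aggregate_runs(runs):
    """Group per-run records into arrays keyed by concurrency level.

    Assumes each record looks like
    {"concurrency": 2, "wallTime": 1.3, "runTime": 1.1} (hypothetical shape).
    """
    grouped = defaultdict(lambda: {"wallTime": [], "runTime": []})
    for run in runs:
        level = grouped[run["concurrency"]]
        level["wallTime"].append(run["wallTime"])
        level["runTime"].append(run["runTime"])
    return dict(grouped)


def summarize(samples):
    """Return the mean and the 10th/90th percentiles of a list of timings."""
    # quantiles(n=10) yields nine cut points; the first is p10, the last is p90.
    deciles = statistics.quantiles(samples, n=10, method="inclusive")
    return {
        "mean": statistics.mean(samples),
        "p10": deciles[0],
        "p90": deciles[-1],
    }


if __name__ == "__main__":
    # Made-up timings, only to show the output shape.
    demo_runs = [
        {"concurrency": 1, "wallTime": 1.0, "runTime": 0.8},
        {"concurrency": 1, "wallTime": 1.4, "runTime": 1.1},
        {"concurrency": 2, "wallTime": 2.1, "runTime": 1.9},
        {"concurrency": 2, "wallTime": 2.5, "runTime": 2.0},
    ]
    by_level = aggregate_runs(demo_runs)
    report = {
        level: {metric: summarize(values) for metric, values in metrics.items()}
        for level, metrics in by_level.items()
    }
    print(json.dumps(report, indent=2))
```

With the arrays grouped per concurrency level like this, a plotting library can take the lists directly, and the repeat count from (1) simply controls how many samples land in each list.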

nagarkar assigned nagarkar and ldanielmadariaga and unassigned nagarkar Apr 8, 2019
nagarkar added the BQWorkloadTester issues related to the bigquery workload tester label Apr 8, 2019
2 participants