
TPC-H Benchmark: item run limitation does not work in shuffled mode #2389

Open
mweisgut opened this issue Jul 30, 2021 · 8 comments

@mweisgut (Collaborator) commented Jul 30, 2021

Executing the TPC-H benchmark in shuffled mode (-m Shuffled) with a runs-per-item limit of x and a time limit high enough that each query item could be executed x times results in an incorrect number of executions.


Steps to Reproduce

Execute ./hyriseBenchmarkTPCH -t 9999999 -r 10 -m Shuffled -s 1 -o output.json

Expected Behavior

Each query item is executed 10 times.

Actual Behavior

The total number of query item executions is 10, i.e., each item is executed at most once. With a higher runs-per-item limit, individual items can also be executed more than once.
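For illustration only (this is not Hyrise code; the loop structure and names below are invented): a minimal standalone sketch of what the reported numbers are consistent with, namely the -r value effectively capping the total number of runs across all items rather than the runs of each individual item. With 22 items and a cap of 10, each item is then executed at most once.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
  const std::size_t item_count = 22;  // TPC-H queries 1-22
  const std::size_t max_runs = 10;    // value passed via -r

  std::vector<std::size_t> runs_per_item(item_count, 0);
  std::size_t total_runs = 0;

  // If the limit is checked against a global counter (total_runs) instead of
  // runs_per_item[item], the whole benchmark stops after max_runs executions
  // in total, producing the "executed 0/1 times" pattern shown in the log.
  for (std::size_t item = 0; total_runs < max_runs; item = (item + 1) % item_count) {
    ++runs_per_item[item];
    ++total_runs;
  }

  for (std::size_t item = 0; item < item_count; ++item) {
    std::cout << "TPC-H " << (item + 1) << ": executed " << runs_per_item[item] << " times\n";
  }
  return 0;
}
```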

Log
- Writing benchmark results to 'output.json'
- Running in single-threaded mode
- 1 simulated client is scheduling items
- Running benchmark in 'Shuffled' mode
- Encoding is 'Dictionary'
- Chunk size is 65535
- Max runs per item is 10
- Max duration per item is 9999999 seconds
- No warmup runs are performed
- Caching tables as binary files
- Not tracking SQL metrics
- Benchmarking Queries: [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22 ]
- TPC-H scale factor is 1
- Using prepared statements: no
- Loading/Generating tables
-  Loading table 'supplier' from cached binary "tpch_cached_tables/sf-1.000000/supplier.bin" (3 ms 20 µs)
-  Loading table 'lineitem' from cached binary "tpch_cached_tables/sf-1.000000/lineitem.bin" (559 ms 440 µs)
-  Loading table 'orders' from cached binary "tpch_cached_tables/sf-1.000000/orders.bin" (157 ms 628 µs)
-  Loading table 'region' from cached binary "tpch_cached_tables/sf-1.000000/region.bin" (34 µs 209 ns)
-  Loading table 'part' from cached binary "tpch_cached_tables/sf-1.000000/part.bin" (24 ms 364 µs)
-  Loading table 'customer' from cached binary "tpch_cached_tables/sf-1.000000/customer.bin" (40 ms 356 µs)
-  Loading table 'nation' from cached binary "tpch_cached_tables/sf-1.000000/nation.bin" (64 µs 127 ns)
-  Loading table 'partsupp' from cached binary "tpch_cached_tables/sf-1.000000/partsupp.bin" (123 ms 406 µs)
- Loading/Generating tables done (908 ms 616 µs)
- Encoding tables (if necessary) and generating pruning statistics
-  Encoding 'nation' - no encoding necessary (1 ms 178 µs)
-  Encoding 'region' - no encoding necessary (2 ms 982 µs)
-  Encoding 'supplier' - no encoding necessary (5 ms 707 µs)
-  Encoding 'part' - no encoding necessary (19 ms 369 µs)
-  Encoding 'customer' - no encoding necessary (21 ms 407 µs)
-  Encoding 'partsupp' - no encoding necessary (38 ms 36 µs)
-  Encoding 'orders' - no encoding necessary (151 ms 94 µs)
-  Encoding 'lineitem' - no encoding necessary (473 ms 42 µs)
- Encoding tables and generating pruning statistic done (474 ms 89 µs)
- Writing tables into binary files if necessary
- Writing tables into binary files done (33 µs 780 ns)
- Adding tables to StorageManager and generating table statistics
-  Added 'nation' (5 ms 17 µs)
-  Added 'region' (7 ms 503 µs)
-  Added 'supplier' (105 ms 547 µs)
-  Added 'customer' (348 ms 481 µs)
-  Added 'part' (406 ms 82 µs)
-  Added 'partsupp' (1 s 338 ms)
-  Added 'orders' (2 s 166 ms)
-  Added 'lineitem' (8 s 104 ms)
- Adding tables to StorageManager and generating table statistics done (8 s 105 ms)
- No indexes created as --indexes was not specified or set to false
- Starting Benchmark...
[PERF] Unresolved iterator created for AbstractPosList at src/lib/storage/pos_lists/abstract_pos_list.cpp:5
	Performance can be affected. This warning is only shown once.

[PERF] ColumnVsColumnTableScan using type-erased iterators at src/lib/operators/table_scan/column_vs_column_table_scan_impl.cpp:113
	Performance can be affected. This warning is only shown once.

[PERF] Using type-erased accessor as the ReferenceSegmentIterable is type-erased itself at src/lib/storage/reference_segment/reference_segment_iterable.hpp:93
	Performance can be affected. This warning is only shown once.

- Results for TPC-H 01
  -> Executed 1 times
- Results for TPC-H 02
  -> Executed 0 times
- Results for TPC-H 03
  -> Executed 1 times
- Results for TPC-H 04
  -> Executed 0 times
- Results for TPC-H 05
  -> Executed 1 times
- Results for TPC-H 06
  -> Executed 0 times
- Results for TPC-H 07
  -> Executed 1 times
- Results for TPC-H 08
  -> Executed 0 times
- Results for TPC-H 09
  -> Executed 1 times
- Results for TPC-H 10
  -> Executed 0 times
- Results for TPC-H 11
  -> Executed 1 times
- Results for TPC-H 12
  -> Executed 1 times
- Results for TPC-H 13
  -> Executed 0 times
- Results for TPC-H 14
  -> Executed 0 times
- Results for TPC-H 15
  -> Executed 1 times
- Results for TPC-H 16
  -> Executed 0 times
- Results for TPC-H 17
  -> Executed 1 times
- Results for TPC-H 18
  -> Executed 0 times
- Results for TPC-H 19
  -> Executed 1 times
- Results for TPC-H 20
  -> Executed 0 times
- Results for TPC-H 21
  -> Executed 0 times
- Results for TPC-H 22
  -> Executed 0 times

JSON output: output.json.log

Build Information

CMake command

cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -GNinja -DCMAKE_CXX_FLAGS=-fcolor-diagnostics

LLVM

➜  ~ brew info llvm
llvm: stable 12.0.1 (bottled), HEAD [keg-only]
@mweisgut added the Bug label Jul 30, 2021
@mweisgut (Collaborator, Author)

Or is it actually expected behavior?

@Bouncner (Collaborator)

> Or is it actually expected behavior?

I would definitely say no. I am not even sure if we have ever considered using runs with multiple clients.

@Bensk1 (Collaborator) commented Aug 2, 2021

> Or is it actually expected behavior?

Even if it were, the documentation would be wrong: "Maximum number of runs per item".

> I am not even sure if we have ever considered using runs with multiple clients.

This issue is independent of multiple clients but rather an issue of the shuffled mode, isn't it?

@mweisgut (Collaborator, Author) commented Aug 2, 2021

> This issue is independent of multiple clients but rather an issue of the shuffled mode, isn't it?

Right

@Bouncner (Collaborator) commented Aug 3, 2021

Yes, but not counting it as runs per client doesn’t make sense imo.

@Bouncner (Collaborator)

What about not supporting runs when the shuffled mode is used? Simply an assert and a message such as: "The shuffled mode does not support limiting the number of benchmark runs. Use --time to set a time limit for the benchmark run."
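A minimal, self-contained sketch of such a guard. The BenchmarkMode enum, the BenchmarkConfig struct with its member names, and the Assert helper below only mimic Hyrise's benchmark configuration and its Assert macro (utils/assert.hpp) for illustration; they are assumptions, not the actual types or the place where such a check would live.

```cpp
#include <stdexcept>
#include <string>

// Stand-ins for Hyrise's types: assumptions for illustration only.
enum class BenchmarkMode { Ordered, Shuffled };

struct BenchmarkConfig {
  BenchmarkMode benchmark_mode{BenchmarkMode::Ordered};
  bool max_runs_specified{false};  // assumed flag: true if the user passed -r/--runs
};

// Stand-in for Hyrise's Assert macro: fail with a message if the condition does not hold.
void Assert(bool condition, const std::string& message) {
  if (!condition) throw std::logic_error(message);
}

// Reject the unsupported combination early, with the message proposed above.
void validate_benchmark_config(const BenchmarkConfig& config) {
  Assert(config.benchmark_mode != BenchmarkMode::Shuffled || !config.max_runs_specified,
         "The shuffled mode does not support limiting the number of benchmark runs. "
         "Use --time to set a time limit for the benchmark run.");
}
```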

@Bouncner (Collaborator)

And while we're at it: for shuffled runs, we could also change the following output: "- Max duration per item is 2400 seconds".

@Bensk1 (Collaborator) commented Aug 13, 2021

> What about not supporting runs when the shuffled mode is used?

Sounds good to me.
