Memory benchmark #417

Open, 1 commit into base: master

@pizhenwei pizhenwei commented Jun 9, 2021

memory benchmark: support auto-detect CPU L3 cache

When writing to a small memory block, the CPU only writes to the L3
cache, so using a memory block size larger than the L3 cache size should
give a more realistic result. Allow --memory-block-size=0 to be
specified; sysbench then auto-detects the CPU L3 cache size and aligns
it up to a power of 2 to use as the block size for the test.

For example:
Originally, running this command on my PC gave a result of 47634.81 MiB/sec:
 # sysbench memory --memory-scope=local --threads=12 run

In fact, the real performance is about 15G/s, so the reported result is
roughly three times the real figure.
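
The block size selection described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the helper names `align_up_pow2`, `detect_l3_cache_size` and `pick_block_size` are hypothetical, and the detection relies on `sysconf(_SC_LEVEL3_CACHE_SIZE)`, a glibc extension that may be missing or return 0 on other platforms, in which case the caller-supplied fallback is used.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Round v up to the next power of two; exact powers of two are returned
   unchanged. Returns 0 for v == 0 or if the result would overflow. */
static size_t align_up_pow2(size_t v)
{
    size_t p = 1;

    if (v == 0)
        return 0;
    while (p < v) {
        if (p > SIZE_MAX / 2)
            return 0;
        p <<= 1;
    }
    return p;
}

/* Hypothetical helper: L3 cache size in bytes, or 0 if it is unknown.
   Assumes glibc, where _SC_LEVEL3_CACHE_SIZE is available as a macro. */
static size_t detect_l3_cache_size(void)
{
#ifdef _SC_LEVEL3_CACHE_SIZE
    long sz = sysconf(_SC_LEVEL3_CACHE_SIZE);

    if (sz > 0)
        return (size_t)sz;
#endif
    return 0;
}

/* A requested size of 0 means "auto-detect"; any other value is used as is.
   If detection fails, fall back to the size supplied by the caller. */
static size_t pick_block_size(size_t requested, size_t fallback)
{
    size_t aligned;

    if (requested != 0)
        return requested;
    aligned = align_up_pow2(detect_l3_cache_size());
    return aligned != 0 ? aligned : fallback;
}
```

With these helpers, a 12 MiB L3 cache would yield a 16 MiB block, while a cache whose size is already a power of two would yield a block of exactly that size.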

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>

A test case should be added to test_memory.t, but the GitHub CI failed
with it. I tried to reproduce the failure on an ARM server, but the test
works well there. Maybe this test case could be added in the future.

  $ sysbench memory --memory-scope=local --memory-oper=write --memory-total-size=1G --memory-block-size=0 --events=1 --time=0 --threads=2 run
  sysbench *.* * (glob)

  Running the test with following options:
  Number of threads: 2
  Initializing random number generator from current time

  Running memory speed test with the following options:
    block size: * (glob)
    total size: 1024MiB
    operation: write
    scope: local

  Initializing worker threads...

  Threads started!

  Total operations: * (* per second) (glob)

  1024.00 MiB transferred (* MiB/sec) (glob)

  Throughput:
      events/s (eps): * (glob)
      time elapsed:                        *s (glob)
      total number of events:              * (glob)

  Latency (ms):
           min:                              *.* (glob)
           avg:                              *.* (glob)
           max:                              *.* (glob)
           95th percentile:         *.* (glob)
           sum: *.* (glob)

  Threads fairness:
      events (avg/stddev):           */* (glob)
      execution time (avg/stddev):   */* (glob)

  $ sysbench $args cleanup
  sysbench *.* * (glob)

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
@pizhenwei

@akopytov PING!
