Why are the results the same on both CPU and GPU? #1180
Comments
There may be several potential issues with the CUDA run: if CUDA is not installed properly or fails on your system, ONNX Runtime may silently fall back to the CPU. I haven't seen such a case before, but I assume that is what is happening here. Also, we usually do not mix CPU and CUDA installations, so you need to clean the CM cache between such runs.
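One quick way to check this (a minimal sketch, assuming the onnxruntime Python package used by the reference implementation is on the path) is to ask ONNX Runtime which execution providers it can see:

# Prints the execution providers this onnxruntime build can use;
# if 'CUDAExecutionProvider' is missing, --device=cuda will silently run on the CPU.
python3 -c "import onnxruntime as ort; print(ort.get_available_providers())"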
Maybe you can clean the cache, rerun the above command with --device=cuda, and submit the full log?
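As a concrete sketch of that suggestion (reusing the exact commands from this thread), cleaning the CM cache between the CPU and CUDA runs would look like:

# Run on CPU first
cmr "run mlperf inference generate-run-cmds _submission" --quiet --submitter="MLCommons" --hw_name=default --model=bert-99 --implementation=reference --backend=onnxruntime --device=cpu --scenario=Offline --adr.compiler.tags=gcc --target_qps=1 --category=edge --division=open
# Clean the CM cache so CPU artifacts are not reused for the CUDA run
cm rm cache -f
# Then run on the GPU
cmr "run mlperf inference generate-run-cmds _submission" --quiet --submitter="MLCommons" --hw_name=default --model=bert-99 --implementation=reference --backend=onnxruntime --device=cuda --scenario=Offline --adr.compiler.tags=gcc --target_qps=1 --category=edge --division=open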
After cleaning the cache with cm rm cache -f, I ran:
cmr "run mlperf inference generate-run-cmds _submission" --quiet --submitter="MLCommons" --hw_name=default --model=bert-99 --implementation=reference --backend=onnxruntime --device=cuda --scenario=Offline --adr.compiler.tags=gcc --target_qps=1 --category=edge --division=open
The log shows:
GPU Device ID: 0
Generating SUT description file for default-onnxruntime
/home/zhaohc/CM/repos/local/cache/d071d1318a114521/inference/loadgen/logging.cc: In member function ‘void mlperf::logging::AsyncLog::RecordTokenCompletion(uint64_t, std::chrono::_V2::system_clock::time_point, mlperf::QuerySampleLatency)’:
SUT: default-reference-gpu-onnxruntime-v1.17.1-default_config, model: bert-99, scenario: Offline, target_qps updated as 44.1568
When I run
cmr "run mlperf inference generate-run-cmds _submission" --quiet --submitter="MLCommons" --hw_name=default --model=bert-99 --implementation=reference --backend=onnxruntime --device=cuda --scenario=Offline --adr.compiler.tags=gcc --target_qps=1 --category=edge --division=open
default-reference-gpu-onnxruntime-v1.17.1-default_config
+---------+----------+----------+--------+-----------------+---------------------------------+
| Model | Scenario | Accuracy | QPS | Latency (in ms) | Power Efficiency (in samples/J) |
+---------+----------+----------+--------+-----------------+---------------------------------+
| bert-99 | Offline | X () | 44.157 | - | |
+---------+----------+----------+--------+-----------------+---------------------------------+
When I run
cmr "run mlperf inference generate-run-cmds _submission" --quiet --submitter="MLCommons" --hw_name=default --model=bert-99 --implementation=reference --backend=onnxruntime --device=cpu --scenario=Offline --adr.compiler.tags=gcc --target_qps=1 --category=edge --division=open
default-reference-gpu-onnxruntime-v1.17.1-default_config
+---------+----------+----------+--------+-----------------+---------------------------------+
| Model | Scenario | Accuracy | QPS | Latency (in ms) | Power Efficiency (in samples/J) |
+---------+----------+----------+--------+-----------------+---------------------------------+
| bert-99 | Offline | X () | 44.157 | - | |
+---------+----------+----------+--------+-----------------+---------------------------------+
I only changed --device. Why are the results the same?