
KeyError in MLPerf Inference with ResNet-50 #1233

Closed
sahilavaran opened this issue Apr 27, 2024 · 5 comments

@sahilavaran

sahilavaran commented Apr 27, 2024

I encountered a KeyError: 'target_latency' while running the following command:

cm run script --tags=run-mlperf,inference,_submission,_all-scenarios --model=resnet50  \
--device=cpu --implementation=reference --backend=onnxruntime --execution-mode=valid \
--category=edge --division=open --quiet

The error occurred at the end of the benchmark run; it looks like the target_latency metric is missing from the performance summary. Here's the error message:

return {'return': 1, 'error': f'No {metric} found in performance summary. Pattern checked "{pattern[metric]}"'}
KeyError: 'target_latency'

Could you please help resolve this issue?
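Incidentally, judging from the traceback, the KeyError is a secondary failure in the error-reporting path itself: the code interpolates `pattern[metric]` into the error message, and when the pattern dict has no 'target_latency' entry, that lookup raises before the intended "No ... found in performance summary" error can be returned. A minimal reproduction with a hypothetical pattern dict (illustrative only, not the actual customize.py contents):

```python
# Hypothetical pattern dict: regexes for metrics expected in the summary.
# Note there is deliberately no 'target_latency' entry.
pattern = {"target_qps": r"target_qps\s*:\s*(\S+)"}
metric = "target_latency"

try:
    # Mirrors the failing line: formatting pattern[metric] raises KeyError
    # before the intended error dict is ever returned.
    result = {"return": 1,
              "error": f"No {metric} found in performance summary. "
                       f'Pattern checked "{pattern[metric]}"'}
except KeyError:
    # The message the caller was presumably meant to see.
    result = {"return": 1, "error": f"No {metric} found in performance summary."}

# A defensive lookup avoids masking the real error with a KeyError:
safe = pattern.get(metric, "<no pattern defined>")
print(result["error"])
print(safe)
```

With `pattern.get(metric, ...)` the script would report the real problem (no metric in the summary) instead of crashing while building the error message.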

[Screenshot attached: 2024-04-27 072859]

@arjunsuresh
Contributor

Hi @sahilavaran, can you please share the full console output?

@sahilavaran
Author

sahilavaran commented Apr 27, 2024

Hi @arjunsuresh, please find the full console output attached. Let me know if you need any additional information.



               ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/extract-file/customize.py
             ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/get-dataset-imagenet-val/customize.py

      * cm run script "get dataset-aux image-classification imagenet-aux"
           ! load /home/sahil/CM/repos/local/cache/0693d5c1b6c54082/cm-cached-state.json

      * cm run script "get generic-python-lib _package.opencv-python-headless"
           ! load /home/sahil/CM/repos/local/cache/f87b88b438b749d8/cm-cached-state.json

      * cm run script "get generic-python-lib _pillow"
           ! load /home/sahil/CM/repos/local/cache/d320b37ea2fb411d/cm-cached-state.json

      * cm run script "mlperf mlcommons inference source src"
           ! load /home/sahil/CM/repos/local/cache/b79d1bebe7c5426a/cm-cached-state.json

Path to the MLPerf inference benchmark configuration file: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/mlperf.conf
Path to MLPerf inference benchmark sources: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference

Using MLCommons Inference source from '/home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference'
           ! cd /home/sahil/CM/repos/local/cache/69aa3db6ab5b434f
           ! call /home/sahil/CM/repos/mlcommons@cm4mlops/script/get-preprocessed-dataset-imagenet/run.sh from tmp-run.sh
INFO:imagenet:Preprocessing 50000 images using 12 threads
INFO:imagenet:loaded 50000 images, cache=True, already_preprocessed=False, took=87.7sec
           ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/get-preprocessed-dataset-imagenet/customize.py

    * cm run script "get dataset-aux image-classification imagenet-aux"
         ! load /home/sahil/CM/repos/local/cache/0693d5c1b6c54082/cm-cached-state.json

    * cm run script "generate user-conf mlperf inference"

      * cm run script "detect os"
             ! cd /home/sahil/projects/ck/docs/mlperf/inference/resnet50
             ! call /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
             ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py

      * cm run script "detect cpu"

        * cm run script "detect os"
               ! cd /home/sahil/projects/ck/docs/mlperf/inference/resnet50
               ! call /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
               ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
             ! cd /home/sahil/projects/ck/docs/mlperf/inference/resnet50
             ! call /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
             ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py

      * cm run script "get python"
           ! load /home/sahil/CM/repos/local/cache/c425dfdde43e4604/cm-cached-state.json

Path to Python: /usr/bin/python3
Python version: 3.10.12


      * cm run script "get mlcommons inference src"
           ! load /home/sahil/CM/repos/local/cache/b79d1bebe7c5426a/cm-cached-state.json

Path to the MLPerf inference benchmark configuration file: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/mlperf.conf
Path to MLPerf inference benchmark sources: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference


      * cm run script "get sut configs"
           ! load /home/sahil/CM/repos/local/cache/0f56cb10ba8b49d3/cm-cached-state.json
Using MLCommons Inference source from '/home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference'
Original configuration value 0.1 target_latency
Adjusted configuration value 0.04000000000000001 target_latency
Output Dir: '/home/sahil/projects/ck/docs/mlperf/inference/resnet50/valid_results/Sahil-reference-cpu-onnxruntime-v1.17.3-default_config/resnet50/singlestream/performance/run_1'
resnet50.SingleStream.target_latency = 0.04000000000000001
resnet50.SingleStream.max_duration = 660000 


    * cm run script "get loadgen"
         ! load /home/sahil/CM/repos/local/cache/5734b7f0ec4c4f0a/cm-cached-state.json

Path to the tool: /home/sahil/CM/repos/local/cache/5734b7f0ec4c4f0a/install


    * cm run script "get mlcommons inference src"
         ! load /home/sahil/CM/repos/local/cache/b79d1bebe7c5426a/cm-cached-state.json

Path to the MLPerf inference benchmark configuration file: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/mlperf.conf
Path to MLPerf inference benchmark sources: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference


    * cm run script "get mlcommons inference src"
         ! load /home/sahil/CM/repos/local/cache/b79d1bebe7c5426a/cm-cached-state.json

Path to the MLPerf inference benchmark configuration file: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/mlperf.conf
Path to MLPerf inference benchmark sources: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference


    * cm run script "get generic-python-lib _package.psutil"
         ! load /home/sahil/CM/repos/local/cache/cacc94bd00e44e61/cm-cached-state.json

    * cm run script "get generic-python-lib _opencv-python"
         ! load /home/sahil/CM/repos/local/cache/cbe810cfdbc34943/cm-cached-state.json

    * cm run script "get generic-python-lib _numpy"
         ! load /home/sahil/CM/repos/local/cache/f420c6d5b8074433/cm-cached-state.json

    * cm run script "get generic-python-lib _pycocotools"
         ! load /home/sahil/CM/repos/local/cache/567c627067294fd8/cm-cached-state.json
Using MLCommons Inference source from '/home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference'
         ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference-mlcommons-python/customize.py

  * cm run script "benchmark-mlperf"
         ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/benchmark-program-mlperf/customize.py

  * cm run script "benchmark-program program"

    * cm run script "detect cpu"

      * cm run script "detect os"
             ! cd /home/sahil/projects/ck/docs/mlperf/inference/resnet50
             ! call /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
             ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
           ! cd /home/sahil/projects/ck/docs/mlperf/inference/resnet50
           ! call /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
           ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
***************************************************************************
CM script::benchmark-program/run.sh

Run Directory: /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/vision/classification_and_detection

CMD: ./run_local.sh onnxruntime resnet50 cpu --scenario SingleStream    --mlperf_conf '/home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/mlperf.conf' --threads 12 --user_conf '/home/sahil/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/8c13d355074a41459d020855f4aa6eb7.conf' --use_preprocessed_dataset --cache_dir /home/sahil/CM/repos/local/cache/69aa3db6ab5b434f --dataset-list /home/sahil/CM/repos/local/cache/0693d5c1b6c54082/data/val.txt 2>&1 | tee /home/sahil/projects/ck/docs/mlperf/inference/resnet50/valid_results/Sahil-reference-cpu-onnxruntime-v1.17.3-default_config/resnet50/singlestream/performance/run_1/console.out

         ! cd /home/sahil/projects/ck/docs/mlperf/inference/resnet50
         ! call /home/sahil/CM/repos/mlcommons@cm4mlops/script/benchmark-program/run-ubuntu.sh from tmp-run.sh
python3 python/main.py --profile resnet50-onnxruntime --mlperf_conf ../../mlperf.conf --model "/home/sahil/CM/repos/local/cache/9009944706304796/resnet50_v1.onnx" --dataset-path /home/sahil/CM/repos/local/cache/69aa3db6ab5b434f --output "/home/sahil/projects/ck/docs/mlperf/inference/resnet50/valid_results/Sahil-reference-cpu-onnxruntime-v1.17.3-default_config/resnet50/singlestream/performance/run_1" --scenario SingleStream --mlperf_conf /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/mlperf.conf --threads 12 --user_conf /home/sahil/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/8c13d355074a41459d020855f4aa6eb7.conf --use_preprocessed_dataset --cache_dir /home/sahil/CM/repos/local/cache/69aa3db6ab5b434f --dataset-list /home/sahil/CM/repos/local/cache/0693d5c1b6c54082/data/val.txt
INFO:main:Namespace(dataset='imagenet', dataset_path='/home/sahil/CM/repos/local/cache/69aa3db6ab5b434f', dataset_list='/home/sahil/CM/repos/local/cache/0693d5c1b6c54082/data/val.txt', data_format=None, profile='resnet50-onnxruntime', scenario='SingleStream', max_batchsize=32, model='/home/sahil/CM/repos/local/cache/9009944706304796/resnet50_v1.onnx', output='/home/sahil/projects/ck/docs/mlperf/inference/resnet50/valid_results/Sahil-reference-cpu-onnxruntime-v1.17.3-default_config/resnet50/singlestream/performance/run_1', inputs=None, outputs=['ArgMax:0'], backend='onnxruntime', model_name='resnet50', threads=12, qps=None, cache=0, cache_dir='/home/sahil/CM/repos/local/cache/69aa3db6ab5b434f', preprocessed_dir=None, use_preprocessed_dataset=True, accuracy=False, find_peak_performance=False, debug=False, mlperf_conf='/home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/mlperf.conf', user_conf='/home/sahil/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/8c13d355074a41459d020855f4aa6eb7.conf', audit_conf='audit.config', time=None, count=None, performance_sample_count=None, max_latency=None, samples_per_query=8)
INFO:imagenet:Loading 50000 preprocessed images using 12 threads
INFO:imagenet:loaded 50000 images, cache=0, already_preprocessed=True, took=0.8sec
INFO:main:starting TestScenario.SingleStream
./run_local.sh: line 25: 525479 Killed                  python3 python/main.py --profile resnet50-onnxruntime --mlperf_conf ../../mlperf.conf --model "/home/sahil/CM/repos/local/cache/9009944706304796/resnet50_v1.onnx" --dataset-path /home/sahil/CM/repos/local/cache/69aa3db6ab5b434f --output "/home/sahil/projects/ck/docs/mlperf/inference/resnet50/valid_results/Sahil-reference-cpu-onnxruntime-v1.17.3-default_config/resnet50/singlestream/performance/run_1" --scenario SingleStream --mlperf_conf /home/sahil/CM/repos/local/cache/8fb1e0ec1b3e43b0/inference/mlperf.conf --threads 12 --user_conf /home/sahil/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/8c13d355074a41459d020855f4aa6eb7.conf --use_preprocessed_dataset --cache_dir /home/sahil/CM/repos/local/cache/69aa3db6ab5b434f --dataset-list /home/sahil/CM/repos/local/cache/0693d5c1b6c54082/data/val.txt
         ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/benchmark-program/customize.py

  * cm run script "save mlperf inference state"
         ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/save-mlperf-inference-implementation-state/customize.py
       ! cd /home/sahil/projects/ck/docs/mlperf/inference/resnet50
       ! call /home/sahil/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference/run.sh from tmp-run.sh
       ! call "postprocess" from /home/sahil/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference/customize.py

* cm run script "get mlperf sut description"
     ! load /home/sahil/CM/repos/local/cache/1e49ffa391c4438f/cm-cached-state.json


Traceback (most recent call last):
  File "/home/sahil/.local/bin/cm", line 8, in <module>
    sys.exit(run())
  File "/home/sahil/.local/lib/python3.10/site-packages/cmind/cli.py", line 35, in run
    r = cm.access(argv, out='con')
  File "/home/sahil/.local/lib/python3.10/site-packages/cmind/core.py", line 600, in access
    r = action_addr(i)
  File "/home/sahil/CM/repos/mlcommons@cm4mlops/automation/script/module.py", line 211, in run
    r = self._run(i)
  File "/home/sahil/CM/repos/mlcommons@cm4mlops/automation/script/module.py", line 1466, in _run
    r = customize_code.preprocess(ii)
  File "/home/sahil/CM/repos/mlcommons@cm4mlops/script/run-mlperf-inference-app/customize.py", line 215, in preprocess
    r = cm.access(ii)
  File "/home/sahil/.local/lib/python3.10/site-packages/cmind/core.py", line 756, in access
    return cm.access(i)
  File "/home/sahil/.local/lib/python3.10/site-packages/cmind/core.py", line 600, in access
    r = action_addr(i)
  File "/home/sahil/CM/repos/mlcommons@cm4mlops/automation/script/module.py", line 211, in run
    r = self._run(i)
  File "/home/sahil/CM/repos/mlcommons@cm4mlops/automation/script/module.py", line 1544, in _run
    r = prepare_and_run_script_with_postprocessing(run_script_input)
  File "/home/sahil/CM/repos/mlcommons@cm4mlops/automation/script/module.py", line 4537, in prepare_and_run_script_with_postprocessing
    rr = run_postprocess(customize_code, customize_common_input, recursion_spaces, env, state, const,
  File "/home/sahil/CM/repos/mlcommons@cm4mlops/automation/script/module.py", line 4589, in run_postprocess
    r = customize_code.postprocess(ii)
  File "/home/sahil/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference/customize.py", line 142, in postprocess
    return {'return': 1, 'error': f'No {metric} found in performance summary. Pattern checked "{pattern[metric]}"'}
KeyError: 'target_latency'
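The line worth chasing in the log above is the `Killed` from run_local.sh: the benchmark process died before LoadGen could write a performance summary, and the missing summary is what later trips the postprocess step. A bare `Killed` from the shell usually points to the kernel OOM killer. A sketch of what to look for in the kernel log (the exact message wording varies by kernel version; the `printf` below just simulates such a line, so on the real machine you would feed `sudo dmesg` into the same grep instead):

```shell
# Simulated kernel-log line; replace the printf with `sudo dmesg` on the
# affected machine. Counts lines that look like an OOM kill.
printf 'Out of memory: Killed process 525479 (python3)\n' \
  | grep -icE 'out of memory|oom-killer|killed process'
# prints 1 when an OOM-kill line is found
```

If the OOM killer is confirmed, reducing memory pressure (fewer `--threads`, smaller `max_batchsize`, closing other processes, or adding swap) is the usual way out.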

@arjunsuresh
Contributor

Hi @sahilavaran, did you manage to resolve the killed-process error?

@sahilavaran
Author

Hi @arjunsuresh, yes, I managed to resolve the killed-process error. Thank you for checking in.

@arjunsuresh
Contributor

Cool.
