Missing whl nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl when running bert-99 or rnnt #1197

stbailey001 · 2024-04-12T15:22:06Z

Running bert-99 or rnnt inference always fails with:
/root/cm/bin/python3 -m pip install "/opt/nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl"
WARNING: Requirement '/opt/nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl' looks like a filename, but the file does not exist
ERROR: nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform.

Cmd line:
cm run script --tags=run-mlperf,inference,_r3.0,_performance-only,_short --division=closed --category=datacenter --device=cuda --model=bert-99 --precision=float32 --implementation=nvidia --backend=tensorrt --scenario=Offline --execution_mode=test --power=no --adr.python.version_min=3.8 --clean --compliance=yes --quiet --time

Here is the part of the output where it fails:
* cm run script "get sut configs"
! load /root/CM/repos/local/cache/b5cfc33b4e7546d7/cm-cached-state.json
Using MLCommons Inference source from '/root/CM/repos/local/cache/a867d6853aeb4402/inference'
No target_qps specified. Using 1 as target_qps
Output Dir: '/root/test_results/vm6-nvidia_original-gpu-tensorrt-vdefault-default_config/gptj-99/offline/performance/run_1'
gptj.Offline.target_qps = 1
gptj.Offline.max_query_count = 10
gptj.Offline.min_query_count = 10
gptj.Offline.min_duration = 0

       ! call "postprocess" from /root/CM/repos/mlcommons@ck/cm-mlops/script/generate-mlperf-inference-user-conf/customize.py

* cm run script "get generic-python-lib _package.nvmitten _path./opt/nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl"

  * cm run script "detect os"
         ! cd /root/CM/repos/local/cache/525559fdba5e465a
         ! call /root/CM/repos/mlcommons@ck/cm-mlops/script/detect-os/run.sh from tmp-run.sh
         ! call "postprocess" from /root/CM/repos/mlcommons@ck/cm-mlops/script/detect-os/customize.py

  * cm run script "detect cpu"

    * cm run script "detect os"
           ! cd /root/CM/repos/local/cache/525559fdba5e465a
           ! call /root/CM/repos/mlcommons@ck/cm-mlops/script/detect-os/run.sh from tmp-run.sh
           ! call "postprocess" from /root/CM/repos/mlcommons@ck/cm-mlops/script/detect-os/customize.py
         ! cd /root/CM/repos/local/cache/525559fdba5e465a
         ! call /root/CM/repos/mlcommons@ck/cm-mlops/script/detect-cpu/run.sh from tmp-run.sh
         ! call "postprocess" from /root/CM/repos/mlcommons@ck/cm-mlops/script/detect-cpu/customize.py

  * cm run script "get python3"
       ! load /root/CM/repos/local/cache/84bdcf96cbd64e32/cm-cached-state.json

  * cm run script "get generic-python-lib _pip"
       ! load /root/CM/repos/local/cache/2da17785e356495b/cm-cached-state.json
           ! cd /root/CM/repos/local/cache/525559fdba5e465a
           ! call /root/CM/repos/mlcommons@ck/cm-mlops/script/get-generic-python-lib/run.sh from tmp-run.sh
           ! call "detect_version" from /root/CM/repos/mlcommons@ck/cm-mlops/script/get-generic-python-lib/customize.py

      Extra PIP CMD: 

       ! cd /root/CM/repos/local/cache/525559fdba5e465a
       ! call /root/CM/repos/mlcommons@ck/cm-mlops/script/get-generic-python-lib/install.sh from tmp-run.sh

/root/cm/bin/python3 -m pip install "/opt/nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl"
WARNING: Requirement '/opt/nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl' looks like a filename, but the file does not exist
ERROR: nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform.

CM error: Portable CM script failed (name = get-generic-python-lib, return code = 256)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note that it may be a portability issue of a third-party tool or a native script
wrapped and unified by this automation recipe (CM script). In such case,
please report this issue with a full log at "https://github.com/mlcommons/ck".
The CM concept is to collaboratively fix such issues inside portable CM scripts
to make existing tools and native scripts more portable, interoperable
and deterministic. Thank you!


thehalfspace · 2024-04-12T19:55:50Z

I have the same issue (mlcommons/inference#1679).

I'm on an Arm architecture, so the x86_64 wheel fails to install. It seems mitten is open source now, so if you git clone https://github.com/NVIDIA/mitten into your directory $PYTHONUSERBASE/lib/python3.10/site-packages/ and run pip install . --user inside the cloned repository, it works.
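For anyone trying to tell which of the two pip messages applies on their machine (missing file versus unsupported platform), here is a rough sketch that checks both. It assumes the standard wheel filename layout (name-version-pythontag-abitag-platformtag.whl) and only handles a single CPython tag such as cp38; the helper names parse_wheel_tags and diagnose are made up for this example and are not part of pip or CM. It is a simplified check, not what pip actually runs internally.

```python
import platform
import sys
from pathlib import Path


def parse_wheel_tags(filename):
    """Split a wheel filename into (name, version, python_tag, abi_tag, platform_tag)."""
    base = Path(filename).name
    if not base.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = base[:-4].split("-")
    # An optional build tag may sit between the version and the python tag.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return name, version, python_tag, abi_tag, platform_tag


def diagnose(wheel_path, py_version=None, machine=None):
    """Return a list of likely reasons pip would reject this wheel (simplified)."""
    py_version = py_version or sys.version_info[:2]
    machine = machine or platform.machine()
    problems = []

    # Cause 1: the path handed to pip is simply not there.
    if not Path(wheel_path).is_file():
        problems.append("file does not exist")

    _, _, python_tag, _, platform_tag = parse_wheel_tags(wheel_path)

    # Cause 2: the wheel was built for a different CPython version.
    expected_py = f"cp{py_version[0]}{py_version[1]}"
    if python_tag != expected_py:
        problems.append(f"python tag {python_tag} does not match {expected_py}")

    # Cause 3: the wheel was built for a different CPU architecture.
    if platform_tag != "any" and not platform_tag.endswith(machine):
        problems.append(f"platform tag {platform_tag} does not match {machine}")

    return problems


if __name__ == "__main__":
    for reason in diagnose("/opt/nvmitten-0.1.3-cp38-cp38-linux_x86_64.whl"):
        print(reason)
```

Run it with the same interpreter that CM uses (here /root/cm/bin/python3), since the Python tag is compared against whichever interpreter executes the script.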

stbailey001 · 2024-04-12T20:26:41Z

I'll give that a try, thanks. I was wondering whether NVIDIA/mitten was the source; you have confirmed it.

arjunsuresh · 2024-04-12T20:38:15Z

Yes, nvmitten is currently supported only via cm docker and not cm run, because the whl file comes from the Nvidia MLPerf container. Since the nvmitten repo is now open source, we have added installation support in this PR
