
hypo.word file missing during MMS ASR inference #5117

Open
ahazeemi opened this issue May 22, 2023 · 90 comments

@ahazeemi

❓ Questions and Help

What is your question?

I'm facing the following issue while running the MMS ASR inference script examples/mms/asr/infer/mms_infer.py:

  File "/workspace/fairseq/examples/mms/asr/infer/mms_infer.py", line 52, in <module>
    process(args)
  File "/workspace/fairseq/examples/mms/asr/infer/mms_infer.py", line 44, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/tmpsjatjyxt/hypo.word'

Code

python examples/mms/asr/infer/mms_infer.py --model "/workspace/fairseq/mms1b_fl102.pt" --lang "urd-script_arabic" --audio "/workspace/audio.wav"

What have you tried?

Tried running the ASR on different audio files and languages.

What's your environment?

  • fairseq Version (e.g., 1.0 or main): main
  • PyTorch Version (e.g., 1.0): 2.0.0
  • OS (e.g., Linux): Linux
  • How you installed fairseq (pip, source): pip
  • Build command you used (if compiling from source): N/A
  • Python version: 3.10.10
  • CUDA/cuDNN version: 11.6
  • GPU models and configuration: NVIDIA A6000
  • Any other relevant information: N/A
@shsagnik

shsagnik commented May 22, 2023

Facing the exact same issue

@vineelpratap
Contributor

Hi, can you share the entire log? I just tested the code again and it works fine from my end.

@audiolion

You need to check what the actual error is. Change your mms_infer.py to

out = subprocess.run(cmd, check=True, shell=True, stdout=subprocess.DEVNULL,)
print(out)

to see the error. For me, the issue was that I needed to pass cpu=True because I don't have CUDA installed. I did this by modifying my infer_common.yaml file to add a new top-level key common with the cpu: true key/val in it:

common:
  cpu: true
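
For reference, here is a minimal sketch of that debugging change (the helper name run_inference is made up; cmd is the command string that process() in mms_infer.py already builds). It surfaces whatever the inference subprocess printed instead of silently discarding it and failing later on hypo.word:

import subprocess
import sys

def run_inference(cmd: str) -> None:
    # Capture stdout/stderr so the real failure is visible.
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        sys.stderr.write(result.stdout)
        sys.stderr.write(result.stderr)
        raise RuntimeError(f"inference subprocess failed with exit code {result.returncode}")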

@audiolion

I am hitting this though, and I am not sure what I am doing wrong. I'm not sure if I am using the right lang_code; it doesn't say what the lang codes are or what standard they reference. I have tried en and en-US so far.

[screenshot of the error]

@shsagnik

shsagnik commented May 22, 2023

Sure, here is my full log:

(base) hello_automate_ai@machinelearningnotebook:~/fairseqmmstest/fairseq$ python "examples/mms/asr/infer/mms_infer.py" --model "/home/hello_automate_ai/fairseqmmstest/mms1b_all.pt" --lang hin --audio "/home/hello_automate_ai/fairseqmmstest/audio.wav"
preparing tmp manifest dir ...
loading model & running inference ...
Traceback (most recent call last):
  File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/speech_recognition/new/infer.py", line 18, in <module>
    import editdistance
ModuleNotFoundError: No module named 'editdistance'
Traceback (most recent call last):
  File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/mms/asr/infer/mms_infer.py", line 52, in <module>
    process(args)
  File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/mms/asr/infer/mms_infer.py", line 44, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp6u8grbxl/hypo.word'

@shsagnik

This is after the fix suggested by audiolion

@vineelpratap
Contributor

@audiolion We expect a 3-letter language code. See the 'Supported languages' section in the README file for each model.
For example - use 'eng' for English.
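
So the command from the issue description would look something like this (paths are just illustrative):

python examples/mms/asr/infer/mms_infer.py --model /path/to/mms1b_fl102.pt --lang eng --audio /path/to/audio.wav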

@vineelpratap
Contributor

@shsagnik
No module named 'editdistance' - You should install the missing module.

@audiolion

@shsagnik

ModuleNotFoundError: No module named 'editdistance'

you need to install the modules that are used

@shsagnik

shsagnik commented May 22, 2023

Got these errors this time

preparing tmp manifest dir ...
loading model & running inference ...
/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/core/plugins.py:202: UserWarning:
Error importing 'hydra_plugins.hydra_colorlog'.
Plugin is incompatible with this Hydra version or buggy.
Recommended to uninstall or upgrade plugin.
ImportError: cannot import name 'SearchPathPlugin' from 'hydra.plugins' (/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/plugins/__init__.py)
  warnings.warn(
Traceback (most recent call last):
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/hello_automate_ai/INFER/None'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/hello_automate_ai/INFER'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/hello_automate_ai'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/speech_recognition/new/infer.py", line 499, in <module>
    cli_main()
  File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/speech_recognition/new/infer.py", line 495, in cli_main
    hydra_main()  # pylint: disable=no-value-for-parameter
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/_internal/utils.py", line 354, in _run_hydra
    run_and_report(
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/_internal/utils.py", line 355, in <lambda>
    lambda: hydra.multirun(
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 136, in multirun
    return sweeper.sweep(arguments=task_overrides)
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/_internal/core_plugins/basic_sweeper.py", line 140, in sweep
    sweep_dir.mkdir(parents=True, exist_ok=True)
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1179, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1179, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1179, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/checkpoint'
Traceback (most recent call last):
  File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/mms/asr/infer/mms_infer.py", line 52, in <module>
    process(args)
  File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/mms/asr/infer/mms_infer.py", line 44, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp0mcwde4n/hypo.word'

@altryne

altryne commented May 22, 2023

Getting pretty much the same thing. I used the right 3-letter language code (while waiting on #5119 to be answered) and it doesn't seem to have an effect; the hypo.word error is still showing up.

@dakouan18

dakouan18 commented May 22, 2023

I got this error when I tried to run ASR on Google Colab:

/content/fairseq
>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
  File "/content/fairseq/examples/speech_recognition/new/infer.py", line 21, in <module>
    from examples.speech_recognition.new.decoders.decoder_config import (
  File "/content/fairseq/examples/speech_recognition/__init__.py", line 1, in <module>
    from . import criterions, models, tasks  # noqa
  File "/content/fairseq/examples/speech_recognition/criterions/__init__.py", line 15, in <module>
    importlib.import_module(
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/content/fairseq/examples/speech_recognition/criterions/cross_entropy_acc.py", line 13, in <module>
    from fairseq import utils
  File "/content/fairseq/fairseq/__init__.py", line 20, in <module>
    from fairseq.distributed import utils as distributed_utils
  File "/content/fairseq/fairseq/distributed/__init__.py", line 7, in <module>
    from .fully_sharded_data_parallel import (
  File "/content/fairseq/fairseq/distributed/fully_sharded_data_parallel.py", line 10, in <module>
    from fairseq.dataclass.configs import DistributedTrainingConfig
  File "/content/fairseq/fairseq/dataclass/__init__.py", line 6, in <module>
    from .configs import FairseqDataclass
  File "/content/fairseq/fairseq/dataclass/configs.py", line 12, in <module>
    from omegaconf import II, MISSING
ModuleNotFoundError: No module named 'omegaconf'
CompletedProcess(args='\n        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path=\'/content/mms1b_fl102.pt\'" task.data=/tmp/tmp79w8mawp dataset.gen_subset="eng:dev" common_eval.post_process=letter decoding.results_path=/tmp/tmp79w8mawp\n        ', returncode=1)
Traceback (most recent call last):
  File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 53, in <module>
    process(args)
  File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 45, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp79w8mawp/hypo.word'

@audiolion

Please y'all read the error messages and try to debug yourself.

@dakouan18

ModuleNotFoundError: No module named 'omegaconf'

you need to install the missing modules, one of them being omegaconf

@altryne you need to print the error output to debug

@shsagnik your hydra install has some issues, and you need to specify a checkpoint directory. It was set up to run on Linux where you can make directories off the root (probably in a container), so change infer_common.yaml:

[screenshot of the infer_common.yaml change]

@altryne

altryne commented May 22, 2023

Thanks @audiolion
It wasn't immediately clear that mms_infer.py calls the whole hydra thing via a command, as it obscures the errors that pop up there.

Here's the full output I'm getting (added a print out of the cmd command as well)

$ python examples/mms/asr/infer/mms_infer.py --model mms1b_l1107.pt --audio output_audio.mp3 --lang tur
>>> preparing tmp manifest dir ...

        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='mms1b_l1107.pt'" task.data=C:\Users\micro\AppData\Local\Temp\tmpxzum3zve dataset.gen_subset="tur:dev" common_eval.post_process=letter decoding.results_path=C:\Users\micro\AppData\Local\Temp\tmpxzum3zve

>>> loading model & running inference ...
Traceback (most recent call last):
  File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 53, in <module>
    process(args)
  File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 45, in process
    with open(tmpdir/"hypo.word") as fr:
         ^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\micro\\AppData\\Local\\Temp\\tmpxzum3zve\\hypo.word'

@dakouan18

Hi @audiolion, after installing omegaconf & hydra a new error appeared:

>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
2023-05-22 22:22:29.307454: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-22 22:22:30.440434: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/content/fairseq/examples/speech_recognition/new/infer.py", line 21, in <module>
    from examples.speech_recognition.new.decoders.decoder_config import (
  File "/content/fairseq/examples/speech_recognition/__init__.py", line 1, in <module>
    from . import criterions, models, tasks  # noqa
  File "/content/fairseq/examples/speech_recognition/criterions/__init__.py", line 15, in <module>
    importlib.import_module(
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/content/fairseq/examples/speech_recognition/criterions/cross_entropy_acc.py", line 13, in <module>
    from fairseq import utils
  File "/content/fairseq/fairseq/__init__.py", line 33, in <module>
    import fairseq.criterions  # noqa
  File "/content/fairseq/fairseq/criterions/__init__.py", line 18, in <module>
    (
TypeError: cannot unpack non-iterable NoneType object
CompletedProcess(args='\n        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path=\'/content/mms1b_fl102.pt\'" task.data=/tmp/tmpk2ot70rk dataset.gen_subset="eng:dev" common_eval.post_process=letter decoding.results_path=/tmp/tmpk2ot70rk\n        ', returncode=1)
Traceback (most recent call last):
  File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 53, in <module>
    process(args)
  File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 45, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpk2ot70rk/hypo.word'

@audiolion

Thanks @audiolion It wasn't immediately clear that mms_infer.py calls the whole hydra thing via a command, as it obscures the errors that pop up there.

Here's the full output I'm getting (added a print out of the cmd command as well)

$ python examples/mms/asr/infer/mms_infer.py --model mms1b_l1107.pt --audio output_audio.mp3 --lang tur
>>> preparing tmp manifest dir ...

        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='mms1b_l1107.pt'" task.data=C:\Users\micro\AppData\Local\Temp\tmpxzum3zve dataset.gen_subset="tur:dev" common_eval.post_process=letter decoding.results_path=C:\Users\micro\AppData\Local\Temp\tmpxzum3zve

>>> loading model & running inference ...
Traceback (most recent call last):
  File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 53, in <module>
    process(args)
  File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 45, in process
    with open(tmpdir/"hypo.word") as fr:
         ^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\micro\\AppData\\Local\\Temp\\tmpxzum3zve\\hypo.word'

You need to do what I said in my first comment and output the process error message. The hypo.word file is not found because the actual ASR never ran and never produced output.

@altryne

altryne commented May 22, 2023

SIGH, I am, it prints the command and that's it.

>>> loading model & running inference ...
CompletedProcess(args='\nPYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path=\'mms1b_l1107.pt\'" task.data=C:\\Users\\micro\\AppData\\Local\\Temp\\tmp9t2lty3_ dataset.gen_subset="tur:dev" common_eval.post_process=letter decoding.results_path=C:\\Users\\micro\\AppData\\Local\\Temp\\tmp9t2lty3_\n', returncode=0)
Traceback (most recent call last):
  File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 55, in <module>
    process(args)
  File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 47, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\micro\\AppData\\Local\\Temp\\tmp9t2lty3_\\hypo.word'

However, when I go back and recreate that temp dir, and run the command manually myself I do seem to get errors.

Just for some reason not via the way you mentioned.

Had to install many packages on the way, here's a partial list (in case it helps anyone)

pip install torch==1.9.0+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html
pip install hydra-core
pip install editdistance
pip install soundfile
pip install omegaconf
pip install hydra-core
pip install fairseq
pip install scikit-learn
pip install tensorboardX

Still getting nowhere. Running the subprocess command even with check=True and printing the output returns status code 0 with no errors.

@altryne

altryne commented May 22, 2023

Got the model to finally load and run. Apparently Windows doesn't allow : in directory names, and the above code adds :dev to the directory name.

So if you pass --lang tur like I did, it will try to create a directory named /tur:dev inside /checkpoint, which per @audiolion I also had to change, as /checkpoint doesn't seem to do anything on Windows.

I think the full inference ran, as the process got stuck for a few minutes, the GPU went to 8GB (impressive), and after a while I had 2 errors again.

The hypo.word error seems to be a "catch-all" error that can mean many things went wrong; hopefully the authors will clean it up?

I'm currently staring at this error, and am pretty sure it's due to me removing the : from the dir name:

  File "C:\Users\micro\projects\mms\examples\speech_recognition\new\infer.py", line 407, in main
    with InferenceProcessor(cfg) as processor:
  File "C:\Users\micro\projects\mms\examples\speech_recognition\new\infer.py", line 132, in __init__
    self.task.load_dataset(
  File "C:\Users\micro\projects\mms\fairseq\tasks\audio_finetuning.py", line 140, in load_dataset
    super().load_dataset(split, task_cfg, **kwargs)
  File "C:\Users\micro\projects\mms\fairseq\tasks\audio_pretraining.py", line 175, in load_dataset
    for key, file_name in data_keys:
ValueError: not enough values to unpack (expected 2, got 1)

@bbz662

bbz662 commented May 22, 2023

I had the same error with Google Colab and have investigated.

my error

>>> preparing tmp manifest dir ...

        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='/content/mms1b_fl102.pt'" task.data=/content/tmp dataset.gen_subset="jpn:dev" common_eval.post_process=letter decoding.results_path=/content/tmp
        
>>> loading model & running inference ...
2023-05-22 22:02:52.055738: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2023-05-22 22:02:58,730][HYDRA] Launching 1 jobs locally
[2023-05-22 22:02:58,730][HYDRA] 	#0 : decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 common_eval.path='/content/mms1b_fl102.pt' task.data=/content/tmp dataset.gen_subset=jpn:dev common_eval.post_process=letter decoding.results_path=/content/tmp
[2023-05-22 22:02:59,254][__main__][INFO] - /content/mms1b_fl102.pt
Killed
Traceback (most recent call last):
  File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 54, in <module>
    process(args)
  File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 46, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/content/tmp/hypo.word'

As it turns out, it was crashing at the following location.

self.layers = nn.ModuleList(

Looking at the RAM status, I believe the crash was caused by lack of memory.
[screenshot of Colab RAM usage]

So I feel that perhaps increasing the memory will solve the problem.
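
If it helps anyone checking the same thing, here is a rough sketch for seeing how much RAM is free before launching inference (psutil is an extra dependency, not part of fairseq, and the threshold is a guess rather than a measured number):

import psutil

free_gb = psutil.virtual_memory().available / 1e9
print(f"Available RAM: {free_gb:.1f} GB")
# The 1B-parameter MMS checkpoints are several GB on disk and need noticeably
# more than that to load, so a small Colab instance can get OOM-killed here.
if free_gb < 16:
    print("Probably not enough memory to load an mms1b_* checkpoint on this instance.")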

I hope this helps you in your investigation.

@betimd

betimd commented May 22, 2023

Getting the same error. Also, the documentation to run the sample is horrible.

@audiolion

audiolion commented May 22, 2023

I would say it isn't a catch-all error, but rather that error handling for the subprocess call is not done, so if the call to run the inference fails for any reason, the hypo.word file will not have been created, and thus the open() call will fail and throw that error. So you have to dig backwards through the subprocess command to find out what happened. This just got open-sourced, so it makes sense there are some rough edges; contribute back to the repo!

edit: @altryne my bad, I thought by your message you were printing the command itself, not the output of running the command. Your error does look like it's failing because of the lack of :. Good news is it's open source, so you could change : to another character, or run it on Windows Subsystem for Linux, or run it in Docker.

@altryne

altryne commented May 23, 2023

I would say it isn't a catch-all error, but rather that error handling for the subprocess call is not done, so if the call to run the inference fails for any reason, the hypo.word file will not have been created, and thus the open() call will fail and throw that error. So you have to dig backwards through the subprocess command to find out what happened. This just got open-sourced, so it makes sense there are some rough edges; contribute back to the repo!

Yeah, that's what I mean, if anything happens within the subprocess for any reason, folks are going to get the above mentioned error. Then they will likely google their way into this issue, which covers many of the possible ways it can fail.
I was trying to be extra verbose for other folks to potentially help.

edit: @altryne my bad, I thought by your message you were printing the command itself, not the output of running the command. Your error does look like it's failing because of the lack of :. Good news is it's open source, so you could change : to another character, or run it on Windows Subsystem for Linux, or run it in Docker.

Thanks! You helped a lot, I eventually had to rewrite that whole block like so:

        import os
        os.environ["TMPDIR"] = str(tmpdir)
        os.environ["PYTHONPATH"] = "."
        os.environ["PREFIX"] = "INFER"
        os.environ["HYDRA_FULL_ERROR"] = "1"
        os.environ["USER"] = "micro"

        cmd = f"""python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='{args.model}'" task.data={tmpdir} dataset.gen_subset="{args.lang}" common_eval.post_process={args.format} decoding.results_path={tmpdir}
"""

To even have the command execute and do something and not fail outright.

@audiolion

glad you got it working!

@fcecagno

fcecagno commented May 23, 2023

Hi, thanks for this discussion - I've learned a lot. This is the Dockerfile I created after a few hours trying to make it work:

FROM python:3.8

WORKDIR /usr/src/app

COPY . .

RUN pip install --no-cache-dir . \
 && pip install --no-cache-dir soundfile \
 && pip install --no-cache-dir torch \
 && pip install --no-cache-dir hydra-core \
 && pip install --no-cache-dir editdistance \
 && pip install --no-cache-dir soundfile \
 && pip install --no-cache-dir omegaconf \
 && pip install --no-cache-dir scikit-learn \
 && pip install --no-cache-dir tensorboardX \
 && python setup.py build_ext --inplace \
 && apt update \
 && apt -y install libsndfile-dev \
 && rm -rf /var/lib/apt/lists/* \
 && wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq \
 && chmod +x /usr/bin/yq \
 && yq -i '.common.cpu = true' examples/mms/asr/config/infer_common.yaml

CMD [ "python", "examples/mms/asr/infer/mms_infer.py" ]

I built the image with:

docker build -t fairseq:dev .

And run it with:

docker run --rm -it -e USER=root -v $(pwd):/mms:ro fairseq:dev python examples/mms/asr/infer/mms_infer.py --model /mms/mms1b_fl102.pt --lang eng --audio /mms/audio.wav

@MohamedAliRashad

I kept tracing errors and solving them until I met this error:


  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 657, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 556, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1166, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Traceback (most recent call last):

Does anyone know a solution?

@didadida-r

didadida-r commented May 23, 2023

Hi, thanks for this discussion - I've learned a lot. This is the Dockerfile I created after a few hours trying to make it work:

FROM python:3.8

WORKDIR /usr/src/app

COPY . .

RUN pip install --no-cache-dir . \
 && pip install --no-cache-dir soundfile \
 && pip install --no-cache-dir torch \
 && pip install --no-cache-dir hydra-core \
 && pip install --no-cache-dir editdistance \
 && pip install --no-cache-dir soundfile \
 && pip install --no-cache-dir omegaconf \
 && pip install --no-cache-dir scikit-learn \
 && pip install --no-cache-dir tensorboardX \
 && python setup.py build_ext --inplace \
 && apt update \
 && apt -y install libsndfile-dev \
 && rm -rf /var/lib/apt/lists/* \
 && wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq \
 && chmod +x /usr/bin/yq \
 && yq -i '.common.cpu = true' examples/mms/asr/config/infer_common.yaml

CMD [ "python", "examples/mms/asr/infer/mms_infer.py" ]

I built the image with:

docker build -t fairseq:dev .

And run it with:

docker run --rm -it -e USER=root -v $(pwd):/mms:ro fairseq:dev python examples/mms/asr/infer/mms_infer.py --model /mms/mms1b_fl102.pt --lang eng --audio /mms/audio.wav

I ran the code using this Docker image, but it fails again:

>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
  File "examples/speech_recognition/new/infer.py", line 499, in <module>
    cli_main()
  File "examples/speech_recognition/new/infer.py", line 495, in cli_main
    hydra_main()  # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/usr/local/lib/python3.8/site-packages/hydra/_internal/utils.py", line 354, in _run_hydra
    run_and_report(
  File "/usr/local/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/usr/local/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/usr/local/lib/python3.8/site-packages/hydra/_internal/utils.py", line 355, in <lambda>
    lambda: hydra.multirun(
  File "/usr/local/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 136, in multirun
    return sweeper.sweep(arguments=task_overrides)
  File "/usr/local/lib/python3.8/site-packages/hydra/_internal/core_plugins/basic_sweeper.py", line 154, in sweep
    results = self.launcher.launch(batch, initial_job_idx=initial_job_idx)
  File "/usr/local/lib/python3.8/site-packages/hydra/_internal/core_plugins/basic_launcher.py", line 76, in launch
    ret = run_job(
  File "/usr/local/lib/python3.8/site-packages/hydra/core/utils.py", line 129, in run_job
    ret.return_value = task_function(task_cfg)
  File "examples/speech_recognition/new/infer.py", line 460, in hydra_main
    distributed_utils.call_main(cfg, main)
  File "/usr/src/app/fairseq/distributed/utils.py", line 404, in call_main
    main(cfg, **kwargs)
  File "examples/speech_recognition/new/infer.py", line 407, in main
    with InferenceProcessor(cfg) as processor:
  File "examples/speech_recognition/new/infer.py", line 132, in __init__
    self.task.load_dataset(
  File "/usr/src/app/fairseq/tasks/audio_finetuning.py", line 140, in load_dataset
    super().load_dataset(split, task_cfg, **kwargs)
  File "/usr/src/app/fairseq/tasks/audio_pretraining.py", line 150, in load_dataset
    if task_cfg.multi_corpus_keys is None:
  File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 305, in __getattr__
    self._format_and_raise(key=key, value=None, cause=e)
  File "/usr/local/lib/python3.8/site-packages/omegaconf/base.py", line 95, in _format_and_raise
    format_and_raise(
  File "/usr/local/lib/python3.8/site-packages/omegaconf/_utils.py", line 629, in format_and_raise
    _raise(ex, cause)
  File "/usr/local/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set end OC_CAUSE=1 for full backtrace
  File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 303, in __getattr__
    return self._get_impl(key=key, default_value=DEFAULT_VALUE_MARKER)
  File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 361, in _get_impl
    node = self._get_node(key=key)
  File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 383, in _get_node
    self._validate_get(key)
  File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 135, in _validate_get
    self._format_and_raise(
  File "/usr/local/lib/python3.8/site-packages/omegaconf/base.py", line 95, in _format_and_raise
    format_and_raise(
  File "/usr/local/lib/python3.8/site-packages/omegaconf/_utils.py", line 694, in format_and_raise
    _raise(ex, cause)
  File "/usr/local/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ConfigAttributeError: Key 'multi_corpus_keys' is not in struct
        full_key: task.multi_corpus_keys
        reference_type=Any
        object_type=dict
Traceback (most recent call last):
  File "examples/mms/asr/infer/mms_infer.py", line 52, in <module>
    process(args)
  File "examples/mms/asr/infer/mms_infer.py", line 44, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp4o9kxdyr/hypo.word'

@EklavyaFCB

EklavyaFCB commented May 23, 2023

Same error.

$ python examples/mms/asr/infer/mms_infer.py --model /idiap/temp/esarkar/cache/fairseq/mms1b_all.pt --lang shp --audio /idiap/temp/esarkar/Data/shipibo/downsampled_single_folder/short/shp-ROS-2022-03-14-2.1.wav

>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
  File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/speech_recognition/new/infer.py", line 21, in <module>
    from examples.speech_recognition.new.decoders.decoder_config import (
  File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/speech_recognition/__init__.py", line 1, in <module>
    from . import criterions, models, tasks  # noqa
  File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/speech_recognition/criterions/__init__.py", line 15, in <module>
    importlib.import_module(
  File "/idiap/temp/esarkar/miniconda/envs/fairseq/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/speech_recognition/criterions/cross_entropy_acc.py", line 13, in <module>
    from fairseq import utils
  File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/fairseq/__init__.py", line 33, in <module>
    import fairseq.criterions  # noqa
  File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/fairseq/criterions/__init__.py", line 18, in <module>
    (
TypeError: cannot unpack non-iterable NoneType object
Traceback (most recent call last):
  File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/mms/asr/infer/mms_infer.py", line 52, in <module>
    process(args)
  File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/mms/asr/infer/mms_infer.py", line 44, in process
    with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/idiap/temp/esarkar/tmp/tmpnhi5rrui/hypo.word'

@hrishioa

Same issue.

python examples/mms/asr/infer/mms_infer.py --model "models/mms1b_fl102.pt" --lang eng --audio "../testscripts/audio.wav"
>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
  File "~/fairseq/examples/speech_recognition/new/infer.py", line 21, in <module>
    from examples.speech_recognition.new.decoders.decoder_config import (
  File "~/fairseq/examples/__init__.py", line 7, in <module>
    from fairseq.version import __version__  # noqa
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/fairseq/fairseq/__init__.py", line 20, in <module>
    from fairseq.distributed import utils as distributed_utils
  File "~/fairseq/fairseq/distributed/__init__.py", line 7, in <module>
    from .fully_sharded_data_parallel import (
  File "~/fairseq/fairseq/distributed/fully_sharded_data_parallel.py", line 10, in <module>
    from fairseq.dataclass.configs import DistributedTrainingConfig
  File "~/fairseq/fairseq/dataclass/__init__.py", line 6, in <module>
    from .configs import FairseqDataclass
  File "~/fairseq/fairseq/dataclass/configs.py", line 1127, in <module>
    @dataclass
     ^^^^^^^^^
  File "<location>/opt/anaconda3/envs/mms/lib/python3.11/dataclasses.py", line 1223, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "<location>/opt/anaconda3/envs/mms/lib/python3.11/dataclasses.py", line 1213, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<location>/opt/anaconda3/envs/mms/lib/python3.11/dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<location>/opt/anaconda3/envs/mms/lib/python3.11/dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory
Traceback (most recent call last):
  File "~/fairseq/examples/mms/asr/infer/mms_infer.py", line 52, in <module>
    process(args)
  File "~/fairseq/examples/mms/asr/infer/mms_infer.py", line 44, in process
    with open(tmpdir/"hypo.word") as fr:
         ^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/7r/6k64fzpn6sx5ml6pb2h67kbw0000gn/T/tmp9ubxk363/hypo.word'

@MinSukJoshyOh

OK, for anyone who still has the FileNotFoundError: [Errno 2] No such file or directory error for hypo.word and just wants to test the inference:

It's really what the error says. :D
During inference the program accesses the tmp folder and needs to write some files, including hypo.word.
As the error says, in line 44 of mms_infer.py it tries to open and write hypo.word:
with open(tmpdir/"hypo.word") as fr:
As you can see, no rights are defined for the open method, so just give Python the right to write and read the file:
with open(tmpdir/"hypo.word", "w+") as fr:
This should be all.

You can see this in the code:

def process(args):    
    with tempfile.TemporaryDirectory() as tmpdir:
        print(">>> preparing tmp manifest dir ...", file=sys.stderr)
        tmpdir = Path("/home/divisio/projects/tmp/")
        with open(tmpdir / "dev.tsv", "w") as fw:
            fw.write("/\n")
            for audio in args.audio:
                nsample = sf.SoundFile(audio).frames
                fw.write(f"{audio}\t{nsample}\n")
        with open(tmpdir / "dev.uid", "w") as fw:
            fw.write(f"{audio}\n"*len(args.audio))
        with open(tmpdir / "dev.ltr", "w") as fw:
            fw.write("d u m m y | d u m m y\n"*len(args.audio))
        with open(tmpdir / "dev.wrd", "w") as fw:
            fw.write("dummy dummy\n"*len(args.audio))
        cmd = f"""
        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='{args.model}'" task.data={tmpdir} dataset.gen_subset="{args.lang}:dev" common_eval.post_process={args.format} decoding.results_path={tmpdir}
        """
        print(">>> loading model & running inference ...", file=sys.stderr)
        subprocess.run(cmd, shell=True, stdout=subprocess.DEVNULL,)
        with open(tmpdir/"hypo.word", "w+") as fr:
            for ii, hypo in enumerate(fr):
                hypo = re.sub("\(\S+\)$", "", hypo).strip()
                print(f'===============\nInput: {args.audio[ii]}\nOutput: {hypo}')

Python should already have created the files dev.tsv, dev.uid, dev.ltr and dev.wrd in the same tmp folder. If you want to check this, simply change

tmpdir = Path(tmpdir) to a static folder, for instance in your user directory, like
tmpdir = Path("/home/myuser/path/to/my/project/test")

and you will see that those files will be created, including hypo.word if you did the changes like I described before.

Now examples/speech_recognition/new/infer.py will be triggered in line 40,
and it might fail writing the inference log file, like @v-yunbin described:
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/.../INFER/None'

And it's again just a problem with permissions to write some files.
Next to the mms_infer.py file is a config folder including infer_common.yaml, and there is this property:

hydra:
  run:
    dir: ${common_eval.results_path}/${dataset.gen_subset}
  sweep:
    dir: /checkpoint/${env:USER}/${env:PREFIX}/${common_eval.results_path}
    subdir: ${dataset.gen_subset}

So it tries to write into the checkpoint folder at root level. If you cannot do that, simply change this folder to some folder in your user directory,
for instance:

hydra:
  run:
    dir: ${common_eval.results_path}/${dataset.gen_subset}
  sweep:
    dir: /home/myuser/my/project/folder/tmp/${env:USER}/${env:PREFIX}/${common_eval.results_path}
    subdir: ${dataset.gen_subset}

So now the script will have access to those folders and will write the inference log (infer.log) into that folder, which includes the result of the ASR.
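
Once the inference subprocess really has produced hypo.word, reading the transcriptions back out takes only a few lines (this mirrors the loop at the end of mms_infer.py; the results directory below is just an example path):

import re
from pathlib import Path

results_dir = Path("/home/myuser/path/to/my/project/test")  # same dir as decoding.results_path
with open(results_dir / "hypo.word") as fr:
    for line in fr:
        # each hypothesis line ends with the utterance id in parentheses; strip it
        hypo = re.sub(r"\(\S+\)$", "", line).strip()
        print(hypo)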

@KyattPL

KyattPL commented May 26, 2023

I would say it isn't a catch-all error, but rather that error handling for the subprocess call is not done, so if the call to run the inference fails for any reason, the hypo.word file will not have been created, and thus the open() call will fail and throw that error. So you have to dig backwards through the subprocess command to find out what happened. This just got open-sourced, so it makes sense there are some rough edges; contribute back to the repo!

Yeah, that's what I mean, if anything happens within the subprocess for any reason, folks are going to get the above mentioned error. Then they will likely google their way into this issue, which covers many of the possible ways it can fail. I was trying to be extra verbose for other folks to potentially help.

edit: @altryne my bad, I thought by your message you were printing the command itself, not the output of running the command. Your error does look like it's failing because of the lack of :. Good news is it's open source, so you could change : to another character, or run it on Windows Subsystem for Linux, or run it in Docker.

Thanks! You helped a lot, I eventually had to rewrite that whole block like so:

        import os
        os.environ["TMPDIR"] = str(tmpdir)
        os.environ["PYTHONPATH"] = "."
        os.environ["PREFIX"] = "INFER"
        os.environ["HYDRA_FULL_ERROR"] = "1"
        os.environ["USER"] = "micro"

        cmd = f"""python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='{args.model}'" task.data={tmpdir} dataset.gen_subset="{args.lang}" common_eval.post_process={args.format} decoding.results_path={tmpdir}
"""

To even have the command execute and do something and not fail outright.

I'm pretty sure I made the same changes and I still get the unpack error. I changed the ENV vars before the cmd string and copied your entire cmd string. Maybe I'm missing something in infer_common.yaml or in how I'm running it with args? (Windows paths do be scuffed)

@hebochang

There is a problem with the mms1b_fl102.pt model; the replacement model is mms1b_all.pt.
That is how I solved this problem.

@aberaud

aberaud commented May 28, 2023

Not sure what I missed, but running this I ran into this error. Maybe it's a quick permission issue? Apologies, I don't work with Docker regularly.

>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
 File "/usr/lib/python3.8/pathlib.py", line 1288, in mkdir
   self._accessor.mkdir(self, mode)
**FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/user/INFER/None'**

During handling of the above exception, another exception occurred:

I edited the script and it's now working for me, with an Ubuntu 22.04 image, tested with both CUDA 11.8 and 12.1.
Note that I added permissions for /checkpoint/${USERNAME}.

Dockerfile.mms:

# Also works with CUDA 12.1:
#FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04

ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /usr/src/app

RUN apt-get update \
    && apt-get install -y python-is-python3 git python3-pip sudo wget curl

RUN git clone https://github.com/facebookresearch/fairseq.git \
    && cd fairseq \
    && pip install pip -U \
    && pip install --no-cache-dir . \
    && pip install --no-cache-dir soundfile \
    && pip install --no-cache-dir torch \
    && pip install --no-cache-dir hydra-core \
    && pip install --no-cache-dir editdistance \
    && pip install --no-cache-dir soundfile \
    && pip install --no-cache-dir omegaconf \
    && pip install --no-cache-dir scikit-learn \
    && pip install --no-cache-dir tensorboardX \
    && python setup.py build_ext --inplace

ENV USERNAME=user
RUN echo "root:root" | chpasswd \
    && adduser --disabled-password --gecos "" "${USERNAME}" \
    && echo "${USERNAME}:${USERNAME}" | chpasswd \
    && echo "%${USERNAME}    ALL=(ALL)   NOPASSWD:    ALL" >> /etc/sudoers.d/${USERNAME} \
    && chmod 0440 /etc/sudoers.d/${USERNAME}

RUN mkdir -p /checkpoint/${USERNAME}/INFER \
    && chown -R ${USERNAME}:${USERNAME} /checkpoint/${USERNAME}

USER ${USERNAME}
WORKDIR /usr/src/app/fairseq
CMD [ "python", "examples/mms/asr/infer/mms_infer.py" ]

Building with:

docker build -t fairseq:dev -f Dockerfile.mms .

Running with:

docker run --rm -it --gpus all -e USER=user -v $(pwd):/mms:ro fairseq:dev python examples/mms/asr/infer/mms_infer.py --model /mms/examples/mms/mms1b_l1107.pt --lang fra --audio /mms/examples/mms/test16k.wav

@bekarys0504

bekarys0504 commented May 29, 2023

Not sure what I missed, but running this I ran into this error. Maybe it's a quick permission issue? Apologies, I don't work with Docker regularly.

>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
 File "/usr/lib/python3.8/pathlib.py", line 1288, in mkdir
   self._accessor.mkdir(self, mode)
**FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/user/INFER/None'**

During handling of the above exception, another exception occurred:

I edited the script and it's now working for me, with an Ubuntu 22.04 image, tested with both CUDA 11.8 and 12.1. Note that I added permissions for /checkpoint/${USERNAME}.

Dockerfile.mms:

# Also works with CUDA 12.1:
#FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04

ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /usr/src/app

RUN apt-get update \
    && apt-get install -y python-is-python3 git python3-pip sudo wget curl

RUN git clone https://github.com/facebookresearch/fairseq.git \
    && cd fairseq \
    && pip install pip -U \
    && pip install --no-cache-dir . \
    && pip install --no-cache-dir soundfile \
    && pip install --no-cache-dir torch \
    && pip install --no-cache-dir hydra-core \
    && pip install --no-cache-dir editdistance \
    && pip install --no-cache-dir soundfile \
    && pip install --no-cache-dir omegaconf \
    && pip install --no-cache-dir scikit-learn \
    && pip install --no-cache-dir tensorboardX \
    && python setup.py build_ext --inplace

ENV USERNAME=user
RUN echo "root:root" | chpasswd \
    && adduser --disabled-password --gecos "" "${USERNAME}" \
    && echo "${USERNAME}:${USERNAME}" | chpasswd \
    && echo "%${USERNAME}    ALL=(ALL)   NOPASSWD:    ALL" >> /etc/sudoers.d/${USERNAME} \
    && chmod 0440 /etc/sudoers.d/${USERNAME}

RUN mkdir -p /checkpoint/${USERNAME}/INFER \
    && chown -R ${USERNAME}:${USERNAME} /checkpoint/${USERNAME}

USER ${USERNAME}
WORKDIR /usr/src/app/fairseq
CMD [ "python", "examples/mms/asr/infer/mms_infer.py" ]

Building with:

docker build -t fairseq:dev -f Dockerfile.mms .

Running with:

docker run --rm -it --gpus all -e USER=user -v $(pwd):/mms:ro fairseq:dev python examples/mms/asr/infer/mms_infer.py --model /mms/examples/mms/mms1b_l1107.pt --lang fra --audio /mms/examples/mms/test16k.wav

Worked for me, thanks! For anyone not proficient with Docker: in the directory where your Dockerfile is located, just make sure to create a directory examples/mms and place your model and audio files in that directory. What $(pwd):/mms:ro does is mount the current directory (the present working directory) as a read-only volume inside the container at the path /mms.

@abdeladim-s

Hi all,
If someone is still struggling to run the code, I tried to create a Python package to easily use the MMS project, instead of calling subprocess and dealing with yaml files.
Hope it will be useful! :)

@bekarys0504

bekarys0504 commented May 30, 2023

Hi all,
If someone is still struggling to run the code, I tried to create a Python package to easily use the MMS project, instead of calling subprocess and dealing with yaml files.
Hope it will be useful! :)

I get the following error after following all the steps:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:16
     15 try:
---> 16     from fairseq.examples.mms.data_prep.align_and_segment import get_alignments
     17     from fairseq.examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans

ModuleNotFoundError: No module named 'fairseq.examples.mms'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:21
     20 try:
---> 21     from examples.mms.data_prep.align_and_segment import get_alignments
     22     from examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans

ModuleNotFoundError: No module named 'examples.mms'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[6], line 1
----> 1 from easymms.models.asr import ASRModel
      3 asr = ASRModel(model='/bekarys/fairseq/models/mms1b_fl102.pt')
      4 files = val_data_annotated.audio_path.to_list()[:2]

File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/asr.py:37
     35 from easymms import utils
     36 from easymms._logger import set_log_level
---> 37 from easymms.models.alignment import AlignmentModel
     38 from easymms.constants import CFG, HYPO_WORDS_FILE, MMS_LANGS_FILE
     40 logger = logging.getLogger(__name__)

File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:27
     25 import fairseq
     26 sys.path.append(str(Path(fairseq.__file__).parent))
---> 27 from fairseq.examples.mms.data_prep.align_and_segment import get_alignments
     28 from fairseq.examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans
     29 from fairseq.examples.mms.data_prep.text_normalization import text_normalize

ModuleNotFoundError: No module named 'fairseq.examples.mms'

@abdeladim-s

Hi all,
If someone is still struggling to run the code, I tried to create a Python package to easily use the MMS project, instead of calling subprocess and dealing with yaml files.
Hope it will be useful! :)

I get the following error after following all the steps:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:16
     15 try:
---> 16     from fairseq.examples.mms.data_prep.align_and_segment import get_alignments
     17     from fairseq.examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans

ModuleNotFoundError: No module named 'fairseq.examples.mms'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:21
     20 try:
---> 21     from examples.mms.data_prep.align_and_segment import get_alignments
     22     from examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans

ModuleNotFoundError: No module named 'examples.mms'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[6], line 1
----> 1 from easymms.models.asr import ASRModel
      3 asr = ASRModel(model='/bekarys/fairseq/models/mms1b_fl102.pt')
      4 files = val_data_annotated.audio_path.to_list()[:2]

File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/asr.py:37
     35 from easymms import utils
     36 from easymms._logger import set_log_level
---> 37 from easymms.models.alignment import AlignmentModel
     38 from easymms.constants import CFG, HYPO_WORDS_FILE, MMS_LANGS_FILE
     40 logger = logging.getLogger(__name__)

File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:27
     25 import fairseq
     26 sys.path.append(str(Path(fairseq.__file__).parent))
---> 27 from fairseq.examples.mms.data_prep.align_and_segment import get_alignments
     28 from fairseq.examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans
     29 from fairseq.examples.mms.data_prep.text_normalization import text_normalize

ModuleNotFoundError: No module named 'fairseq.examples.mms'

I just noticed that the MMS project is not included yet in the released version of fairseq, so you will need to install it from source until then:

pip uninstall fairseq && pip install git+https://github.com/facebookresearch/fairseq

The installation steps are updated accordingly.
Let me know @bekarys0504 if that solved the issue?

@bekarys0504

bekarys0504 commented May 30, 2023

Hi all,
If someone is still struggling to run the code, I tried to create a Python package to easily use the MMS project, instead of calling subprocess and dealing with yaml files.
Hope it will be useful! :)

I get the following error after following all the steps:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:16
     15 try:
---> 16     from fairseq.examples.mms.data_prep.align_and_segment import get_alignments
     17     from fairseq.examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans

ModuleNotFoundError: No module named 'fairseq.examples.mms'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:21
     20 try:
---> 21     from examples.mms.data_prep.align_and_segment import get_alignments
     22     from examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans

ModuleNotFoundError: No module named 'examples.mms'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[6], line 1
----> 1 from easymms.models.asr import ASRModel
      3 asr = ASRModel(model='/bekarys/fairseq/models/mms1b_fl102.pt')
      4 files = val_data_annotated.audio_path.to_list()[:2]

File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/asr.py:37
     35 from easymms import utils
     36 from easymms._logger import set_log_level
---> 37 from easymms.models.alignment import AlignmentModel
     38 from easymms.constants import CFG, HYPO_WORDS_FILE, MMS_LANGS_FILE
     40 logger = logging.getLogger(__name__)

File /scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/alignment.py:27
     25 import fairseq
     26 sys.path.append(str(Path(fairseq.__file__).parent))
---> 27 from fairseq.examples.mms.data_prep.align_and_segment import get_alignments
     28 from fairseq.examples.mms.data_prep.align_utils import get_uroman_tokens, get_spans
     29 from fairseq.examples.mms.data_prep.text_normalization import text_normalize

ModuleNotFoundError: No module named 'fairseq.examples.mms'

I just noticed that the MMS project is not included yet in the released version of fairseq, so you will need to install it from source until then:

pip uninstall fairseq && pip install git+https://github.com/facebookresearch/fairseq

The installation steps are updated accordingly. Let me know @bekarys0504 if that solved the issue?

I have the following error now :( @abdeladim-s

Traceback (most recent call last):
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3505, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_2570058/32768016.py", line 6, in <module>
    transcriptions = asr.transcribe(files, lang='kaz', align=False)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/easymms/models/asr.py", line 170, in transcribe
    self.wer = hydra_main(cfg)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/hydra/main.py", line 27, in decorated_main
    return task_function(cfg_passthrough)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/fairseq/examples/speech_recognition/new/infer.py", line 436, in hydra_main
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/fairseq/distributed/utils.py", line 369, in call_main
    if cfg.distributed_training.distributed_init_method is None:
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/fairseq/examples/speech_recognition/new/infer.py", line 383, in main
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/fairseq/examples/speech_recognition/new/infer.py", line 103, in __init__
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/fairseq/examples/speech_recognition/new/infer.py", line 205, in load_model_ensemble
    out_file.write(line)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/fairseq/checkpoint_utils.py", line 367, in load_model_ensemble
    arg_overrides (Dict[str,Any], optional): override model args that
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/fairseq/checkpoint_utils.py", line 482, in load_model_ensemble_and_task
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/fairseq/models/fairseq_model.py", line 128, in load_state_dict
    return super().load_state_dict(new_state_dict, strict)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2056, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Wav2VecCtc:
	Unexpected key(s) in state_dict: "w2v_encoder.w2v_model.encoder.layers.0.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.0.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.0.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.0.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.0.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.0.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.1.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.1.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.1.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.1.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.1.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.1.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.2.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.2.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.2.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.2.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.2.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.2.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.3.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.3.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.3.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.3.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.3.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.3.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.4.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.4.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.4.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.4.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.4.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.4.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.5.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.5.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.5.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.5.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.5.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.5.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.6.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.6.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.6.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.6.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.6.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.6.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.7.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.7.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.7.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.7.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.7.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.7.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.8.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.8.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.8.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.8.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.8.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.8.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.9.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.9.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.9.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.9.adapter_layer.b_b", 
"w2v_encoder.w2v_model.encoder.layers.9.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.9.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.10.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.10.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.10.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.10.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.10.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.10.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.11.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.11.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.11.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.11.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.11.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.11.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.12.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.12.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.12.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.12.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.12.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.12.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.13.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.13.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.13.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.13.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.13.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.13.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.14.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.14.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.14.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.14.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.14.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.14.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.15.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.15.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.15.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.15.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.15.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.15.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.16.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.16.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.16.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.16.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.16.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.16.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.17.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.17.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.17.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.17.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.17.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.17.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.18.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.18.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.18.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.18.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.18.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.18.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.19.adapter_layer.W_a", 
"w2v_encoder.w2v_model.encoder.layers.19.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.19.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.19.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.19.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.19.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.20.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.20.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.20.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.20.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.20.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.20.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.21.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.21.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.21.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.21.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.21.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.21.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.22.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.22.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.22.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.22.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.22.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.22.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.23.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.23.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.23.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.23.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.23.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.23.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.24.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.24.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.24.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.24.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.24.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.24.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.25.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.25.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.25.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.25.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.25.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.25.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.26.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.26.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.26.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.26.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.26.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.26.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.27.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.27.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.27.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.27.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.27.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.27.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.28.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.28.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.28.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.28.adapter_layer.b_b", 
"w2v_encoder.w2v_model.encoder.layers.28.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.28.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.29.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.29.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.29.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.29.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.29.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.29.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.30.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.30.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.30.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.30.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.30.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.30.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.31.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.31.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.31.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.31.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.31.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.31.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.32.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.32.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.32.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.32.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.32.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.32.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.33.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.33.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.33.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.33.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.33.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.33.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.34.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.34.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.34.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.34.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.34.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.34.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.35.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.35.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.35.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.35.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.35.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.35.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.36.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.36.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.36.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.36.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.36.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.36.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.37.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.37.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.37.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.37.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.37.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.37.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.38.adapter_layer.W_a", 
"w2v_encoder.w2v_model.encoder.layers.38.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.38.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.38.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.38.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.38.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.39.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.39.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.39.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.39.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.39.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.39.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.40.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.40.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.40.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.40.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.40.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.40.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.41.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.41.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.41.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.41.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.41.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.41.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.42.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.42.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.42.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.42.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.42.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.42.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.43.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.43.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.43.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.43.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.43.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.43.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.44.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.44.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.44.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.44.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.44.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.44.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.45.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.45.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.45.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.45.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.45.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.45.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.46.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.46.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.46.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.46.adapter_layer.b_b", "w2v_encoder.w2v_model.encoder.layers.46.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.46.adapter_layer.ln_b", "w2v_encoder.w2v_model.encoder.layers.47.adapter_layer.W_a", "w2v_encoder.w2v_model.encoder.layers.47.adapter_layer.W_b", "w2v_encoder.w2v_model.encoder.layers.47.adapter_layer.b_a", "w2v_encoder.w2v_model.encoder.layers.47.adapter_layer.b_b", 
"w2v_encoder.w2v_model.encoder.layers.47.adapter_layer.ln_W", "w2v_encoder.w2v_model.encoder.layers.47.adapter_layer.ln_b". 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2102, in showtraceback
    stb = self.InteractiveTB.structured_traceback(
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1310, in structured_traceback
    return FormattedTB.structured_traceback(
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1199, in structured_traceback
    return VerboseTB.structured_traceback(
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1052, in structured_traceback
    formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/IPython/core/ultratb.py", line 978, in format_exception_as_a_whole
    frames.append(self.format_record(record))
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/IPython/core/ultratb.py", line 878, in format_record
    frame_info.lines, Colors, self.has_colors, lvals
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/IPython/core/ultratb.py", line 712, in lines
    return self._sd.lines
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/stack_data/core.py", line 734, in lines
    pieces = self.included_pieces
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/stack_data/core.py", line 681, in included_pieces
    pos = scope_pieces.index(self.executing_piece)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/stack_data/core.py", line 660, in executing_piece
    return only(
  File "/scriptur/nemo_asr/env/lib/python3.8/site-packages/executing/executing.py", line 190, in only
    raise NotOneValueFound('Expected one value, found 0')
executing.executing.NotOneValueFound: Expected one value, found 0

@abdeladim-s
Copy link

@bekarys0504, what model are you using ? I think you are using a wrong model!

@bekarys0504
Copy link

bekarys0504 commented May 30, 2023

@bekarys0504, what model are you using ? I think you are using a wrong model!

this one mms1b_fl102.pt downloaded through this link https://dl.fbaipublicfiles.com/mms/asr/mms1b_fl102.pt

It should be the right one; it is for ASR. @abdeladim-s

@abdeladim-s
Copy link

@bekarys0504, what model are you using ? I think you are using a wrong model!

this one mms1b_fl102.pt downloaded through this link https://dl.fbaipublicfiles.com/mms/asr/mms1b_fl102.pt

It should be the right one; it is for ASR. @abdeladim-s

@bekarys0504, yes, it seems to be the right model.
Could you please submit an issue on the project repo so we can debug this further together?
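Before opening that issue, one quick sanity check is to peek at the checkpoint's top-level structure with torch. A minimal sketch, assuming the file was saved as ./mms1b_fl102.pt:

import torch

# Load on CPU purely to inspect the checkpoint layout; this does not run inference.
ckpt = torch.load("./mms1b_fl102.pt", map_location="cpu")
print(list(ckpt.keys()))  # fairseq checkpoints typically expose keys such as 'cfg' and 'model'

state = ckpt.get("model", {})
adapter_keys = [k for k in state if "adapter_layer" in k]
print(f"{len(state)} tensors in total, {len(adapter_keys)} adapter-related keys")

If the adapter keys show up, the download itself is probably fine and the state_dict mismatch above happens on the loading side.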

@andergisomon
Copy link

Ok, for anyone who still has the FileNotFoundError: [Errno 2] No such file or directory error for hypo.word and just wants to test the inference:

It's really what the error says. :D During inference the program accesses the tmp folder and needs to write some files, including hypo.word. As the error says, in line 44 of mms_infer.py it tries to open hypo.word via with open(tmpdir/"hypo.word") as fr:. As you can see, no mode is passed to the open call, so just give Python the right to write and read the file: with open(tmpdir/"hypo.word", "w+") as fr: should be all.

you can see in the code

def process(args):    
    with tempfile.TemporaryDirectory() as tmpdir:
        print(">>> preparing tmp manifest dir ...", file=sys.stderr)
        tmpdir = Path("/home/divisio/projects/tmp/")
        with open(tmpdir / "dev.tsv", "w") as fw:
            fw.write("/\n")
            for audio in args.audio:
                nsample = sf.SoundFile(audio).frames
                fw.write(f"{audio}\t{nsample}\n")
        with open(tmpdir / "dev.uid", "w") as fw:
            fw.write(f"{audio}\n"*len(args.audio))
        with open(tmpdir / "dev.ltr", "w") as fw:
            fw.write("d u m m y | d u m m y\n"*len(args.audio))
        with open(tmpdir / "dev.wrd", "w") as fw:
            fw.write("dummy dummy\n"*len(args.audio))
        cmd = f"""
        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='{args.model}'" task.data={tmpdir} dataset.gen_subset="{args.lang}:dev" common_eval.post_process={args.format} decoding.results_path={tmpdir}
        """
        print(">>> loading model & running inference ...", file=sys.stderr)
        subprocess.run(cmd, shell=True, stdout=subprocess.DEVNULL,)
        with open(tmpdir/"hypo.word", "w+") as fr:
            for ii, hypo in enumerate(fr):
                hypo = re.sub("\(\S+\)$", "", hypo).strip()
                print(f'===============\nInput: {args.audio[ii]}\nOutput: {hypo}')

Python should already have created the files dev.tsv, dev.uid, dev.ltr and dev.wrd in the same tmp folder. If you want to check this, simply change

tmpdir = Path(tmpdir) into a static folder, for instance in your user directory, like tmpdir = Path("/home/myuser/path/to/my/project/test")

and you will see that those files get created, including hypo.word if you made the changes described before.
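A small illustrative check of that, assuming the static folder from the example above is used as tmpdir:

from pathlib import Path

# Point this at the static tmpdir hard-coded above; adjust the path to your setup.
tmpdir = Path("/home/myuser/path/to/my/project/test")

for name in ("dev.tsv", "dev.uid", "dev.ltr", "dev.wrd", "hypo.word"):
    path = tmpdir / name
    status = f"{path.stat().st_size} bytes" if path.exists() else "missing"
    print(f"{name}: {status}")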

Now examples/speech_recognition/new/infer.py will be triggered in line 40, and it might fail writing the inference log file, like @v-yunbin described: FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/.../INFER/None'

And it's again just a problem with permissions to write some files. Next to the mms_infer.py file is a config folder containing an infer_common.yaml, and there is the property

hydra:
  run:
    dir: ${common_eval.results_path}/${dataset.gen_subset}
  sweep:
    dir: /checkpoint/${env:USER}/${env:PREFIX}/${common_eval.results_path}
    subdir: ${dataset.gen_subset}

So it tries to write into the checkpoint folder at root level. If you cannot do that, simply change this folder to some folder in your user directory, for instance:

hydra:
  run:
    dir: ${common_eval.results_path}/${dataset.gen_subset}
  sweep:
    dir: /home/myuser/my/project/folder/tmp/${env:USER}/${env:PREFIX}/${common_eval.results_path}
    subdir: ${dataset.gen_subset}

So now the script will have access to those folders and will write the inference log (infer.log) into that folder, which includes the result of the ASR.

I did what you described, and while it ran for 6 minutes, I got a "Killed" in the output with no other information. The RAM was basically maxed out throughout, and there was no hypo.word not found error. The model is probably just too big to run on free Colab.

@andergisomon
Copy link

How many resources does it really take to run the l1107 model anyway? Running it on Colab maxed out 12GB of system RAM, which feels like overkill for a 10 second audio input.

@patrickvonplaten
Copy link
Contributor

It takes less than 8GB with the code snippet of https://huggingface.co/facebook/mms-1b-all and can easily be run on CPU - give it a try ;-)
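For anyone who wants to try that route, here is a minimal sketch along the lines of that model card, assuming a mono audio file named audio.wav and the transformers, torch and librosa packages installed:

import librosa
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "facebook/mms-1b-all"
processor = AutoProcessor.from_pretrained(model_id)  # ships with the English adapter by default
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS expects 16 kHz mono input, so resample while loading.
speech, _ = librosa.load("audio.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)[0]
print(processor.decode(pred_ids))

Other languages can be selected with processor.tokenizer.set_target_lang(...) together with model.load_adapter(...), as described on the same model card.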

@andergisomon
Copy link

andergisomon commented Jun 2, 2023

It takes less than 8GB with the code snippet of https://huggingface.co/facebook/mms-1b-all and can easily be run on CPU - give it a try ;-)

That's good to know B). Even after tweaking the line where asr.py was supposed to write hypo.word, it ran on Colab but was killed after 6 minutes of maxing out the 12GB of RAM. The audio file wasn't even long; it was less than 10 seconds.

By the way, I have yet to try it using 🤗 transformers; I'm referring to the Colab notebook demoing ASR that's having trouble running.

@patrickvonplaten
Copy link
Contributor

Here we go: https://colab.research.google.com/drive/1jqREwuNUn0SrzcVjh90JSLleSVEcx1BY?usp=sharing simple 4 cell colab

@bagustris
Copy link

I also ran into this "hypo.word" error on one machine (Ubuntu 20.04) while there was no problem on another (Ubuntu 22.04). There is actually an error before No such file or directory: /tmp/hypo.word:

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

After updating NumPy (from 1.21.5 to 1.24.3) the error was gone and the ASR output is shown at the bottom.
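If you want to guard against that in a script, a tiny illustrative check like the one below makes the version mismatch obvious before fairseq is even imported (the 1.24 threshold just mirrors the versions reported above):

import numpy as np

# The binary-incompatibility error went away after moving from NumPy 1.21.5 to 1.24.3,
# so warn if the installed version is older than 1.24.
major, minor = (int(x) for x in np.__version__.split(".")[:2])
if (major, minor) < (1, 24):
    print(f"NumPy {np.__version__} is older than 1.24; consider pip install -U numpy")
else:
    print(f"NumPy {np.__version__} looks recent enough")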

@bmox
Copy link

bmox commented Jun 27, 2023

missing modules, one of them being omegaconf

@altryne you need to print the error output to debug

Yes you are right. Smaller model is working 😓

@HironTez
Copy link

HironTez commented Jul 4, 2023

To fix this issue, add open(tmpdir/"hypo.word", 'w').close() before the line 48 in "fairseq\examples\mms\asr\infer\mms_infer.py"

@Jackylee2032
Copy link

What files need to be changed on Windows?

@jackylee1
Copy link

BTW, it should now be very simple to use MMS with transformers:

See:

Your project is perfect, but I want to know how to use my own voice to translate.

@SalmaZakaria
Copy link

Please y'all read the error messages and try to debug yourself.

@dakouan18

ModuleNotFoundError: No module named 'omegaconf'

you need to install the missing modules, one of them being omegaconf

@altryne you need to print the error output to debug

@shsagnik your hydra install has some issues, and you need to specify a checkpoint directory. It was set up to run on Linux, where you can make directories off the root (probably in a container), so change infer_common.yaml:
(screenshot of the edited infer_common.yaml)

I have the same error as @shsagnik
What should I do? I ran it on Ubuntu.

@spanta28
Copy link

Sorry to bother you here. I am unable to run MMS ASR transcribe. I am using Python 3.11 and facing a range of issues, from hypo.word not found to AttributeError: 'PosixPath' object has no attribute 'find' and so on.

Going through the issues, there are no settled solutions, just a lot of comments:
#5284 (already tried that solution and lead to my error posted below)
#5117 (has no solution)

There are just way too many threads relating to MMS ASR transcribe issues but no working solutions posted. If there is one set of installation instructions that actually works and is documented somewhere, that would be great.

Here is my error:

os.environ["TMPDIR"] ='/Users/spanta/Downloads/fairseq-main/temp_dir'

os.environ["PYTHONPATH"] = "."

os.environ["PREFIX"] = "INFER"

os.environ["HYDRA_FULL_ERROR"] = "1"

os.environ["USER"] = "micro"

os.system('python3.11 examples/mms/asr/infer/mms_infer.py --model "/Users/spanta/Downloads/fairseq/models_new/mms1b_fl102.pt" --lang "tel" --audio "/Users/spanta/Documents/test_wav/1.wav"')

preparing tmp manifest dir ...

loading model & running inference ...

/Users/spanta/Downloads/fairseq-main/examples/speech_recognition/new/infer.py:440: UserWarning:

The version_base parameter is not specified.

Please specify a compatability version level, or None.

Will assume defaults for version 1.1

@hydra.main(config_path=config_path, config_name="infer")

Traceback (most recent call last):

File "/Users/spanta/Downloads/fairseq-main/examples/speech_recognition/new/infer.py", line 499, in

cli_main()

File "/Users/spanta/Downloads/fairseq-main/examples/speech_recognition/new/infer.py", line 495, in cli_main

hydra_main()  # pylint: disable=no-value-for-parameter

^^^^^^^^^^^^

File "/opt/homebrew/lib/python3.11/site-packages/hydra/main.py", line 94, in decorated_main

_run_hydra(

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/utils.py", line 355, in _run_hydra

hydra = run_and_report(

        ^^^^^^^^^^^^^^^

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/utils.py", line 223, in run_and_report

raise ex

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/utils.py", line 220, in run_and_report

return func()

       ^^^^^^

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/utils.py", line 356, in

lambda: Hydra.create_main_hydra2(

        ^^^^^^^^^^^^^^^^^^^^^^^^^

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/hydra.py", line 61, in create_main_hydra2

config_loader: ConfigLoader = ConfigLoaderImpl(

                              ^^^^^^^^^^^^^^^^^

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/config_loader_impl.py", line 48, in init

self.repository = ConfigRepository(config_search_path=config_search_path)

                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/config_repository.py", line 65, in init

self.initialize_sources(config_search_path)

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/config_repository.py", line 72, in initialize_sources

scheme = self._get_scheme(search_path.path)

         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "/opt/homebrew/lib/python3.11/site-packages/hydra/_internal/config_repository.py", line 143, in _get_scheme

idx = path.find("://")

      ^^^^^^^^^

AttributeError: 'PosixPath' object has no attribute 'find'


@didi222-lqq
Copy link

Thanks @audiolion. It wasn't immediately clear that mms_infer.py calls the whole hydra thing via a command, as that obscures the errors that pop up there.
Here's the full output I'm getting (added a printout of the cmd command as well):

$ python examples/mms/asr/infer/mms_infer.py --model mms1b_l1107.pt --audio output_audio.mp3 --lang tur
>>> preparing tmp manifest dir ...

        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='mms1b_l1107.pt'" task.data=C:\Users\micro\AppData\Local\Temp\tmpxzum3zve dataset.gen_subset="tur:dev" common_eval.post_process=letter decoding.results_path=C:\Users\micro\AppData\Local\Temp\tmpxzum3zve

>>> loading model & running inference ...
Traceback (most recent call last):
  File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 53, in <module>
    process(args)
  File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 45, in process
    with open(tmpdir/"hypo.word") as fr:
         ^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\micro\\AppData\\Local\\Temp\\tmpxzum3zve\\hypo.word'

You need to do what I said in my first comment and output the process error message. The hypo.word file is not found because the actual ASR never ran and never produced an output.

Hello, I output the error message according to your comment, and it printed the following error
“CompletedProcess(args='\n PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=1440000 distributed_training.distributed_world_size=1 "common_eval.path='./models_new/mms1b_all.pt'" task.data=/tmp/tmpepozridd dataset.gen_subset="adx:dev" common_eval.post_process=letter decoding.results_path=/tmp/tmpepozridd \n ', returncode=1)”
The complete log is as follows:
(screenshot of the complete log)

@didi222-lqq
Copy link

After a series of tries, I was able to get it to infer on Linux, but it could probably work on Windows also. The hypo.word file missing error is due to exceptions thrown during subprocess.run(cmd, shell=True, stdout=subprocess.DEVNULL,), so first I suggest you replace that line with the following:

out = subprocess.run(cmd, check=True, shell=True, stdout=subprocess.DEVNULL, )
print(out)

This will enable you to see what's causing the error. Also provide the full paths of your model and audio files, like this: python examples/mms/asr/infer/mms_infer.py --model "/home/hunter/Downloads/mms1b_all.pt" --lang eng --audio "/home/hunter/Downloads/audio.wav"

After I replaced the code, the error message output is as follows:

CompletedProcess(args='\n        PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=1440000 distributed_training.distributed_world_size=1 "common_eval.path=\'models_new/mms1b_all.pt\'" task.data=/tmp/tmpuarv2nsi dataset.gen_subset="adx:dev" common_eval.post_process=letter decoding.results_path=/tmp/tmpuarv2nsi \n        ', returncode=1)

The complete log is as follows:
(screenshot of the complete log)
