transcribe can't find files outside current script working directory #805

Open
NeoFahrenheit opened this issue Apr 24, 2024 · 4 comments
Comments

@NeoFahrenheit

Hi, I'm on a Mac and I am trying to transcribe an audio file extracted with yt_dlp. The problem is that WhisperModel can't find or correctly process audio files outside the script's working directory.

def process_audios(self) -> bool:
        exts = ['*.m4a', '*.mp3', '*.wav', '*.flac', '*.mp4', '*.wma', '*.aac', '*.ogg']

        print(os.listdir(self.audio_path))
        # ['Tutorial-Master Text Similarity Search with Python & FAISS Vector Database.m4a', 'g30 4.m4a']

        for filename in os.listdir(self.audio_path):
            if any(fnmatch.fnmatch(filename, extension) for extension in exts):
                cur_file = os.path.join(self.audio_path, filename)  # Absolute path
                filename_extensionless = os.path.splitext(filename)[0]
                print('cur_file is: ', cur_file) # /Users/lmonteir/.HandySpeechBot/projects/project_name/audios/Tutorial-Master Text Similarity Search with Python & FAISS Vector Database.m4a
                print('is valid: ', os.path.isfile(cur_file))   # It says True

                model = WhisperModel(model_size_or_path=self.app_data['user_config']['model'],
                                     cpu_threads=self.app_data['user_config']['cpu_threads'],
                                     download_root=self.models_path)
                segments, info = model.transcribe(cur_file) # Error happens here.

This is the error stack:

Traceback (most recent call last):
  File "/Users/lmonteir/Projects/handy_speech_bot/DataManager/project_manager.py", line 139, in <module>
    m.process_audios()
  File "/Users/lmonteir/Projects/handy_speech_bot/DataManager/project_manager.py", line 97, in process_audios
    segments, info = model.transcribe(cur_file)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lmonteir/Projects/handy_speech_bot/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 294, in transcribe
    audio = decode_audio(audio, sampling_rate=sampling_rate)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lmonteir/Projects/handy_speech_bot/lib/python3.12/site-packages/faster_whisper/audio.py", line 52, in decode_audio
    for frame in frames:
  File "/Users/lmonteir/Projects/handy_speech_bot/lib/python3.12/site-packages/faster_whisper/audio.py", line 103, in _resample_frames
    for frame in itertools.chain(frames, [None]):
  File "/Users/lmonteir/Projects/handy_speech_bot/lib/python3.12/site-packages/faster_whisper/audio.py", line 92, in _group_frames
    fifo.write(frame)
  File "av/audio/fifo.pyx", line 30, in av.audio.fifo.AudioFifo.write
  File "av/audio/fifo.pyx", line 74, in av.audio.fifo.AudioFifo.write
RuntimeError: Could not allocate AVAudioFifo.

Now, if I put the files in the current script folder, it runs fine.
I have tried wrapping the filename and the absolute path in double quotes, but it didn't work.
Anything that I might be missing?
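
One way to narrow this down, as a rough sketch only: copy the failing file to a temporary folder under a plain name (no spaces or `&`) and call `model.transcribe` on the copy. If the copy works, the filename or location is the likely suspect; if it still fails, the file contents are. The helper name `copy_with_safe_name` and the `whisper_debug_` prefix are made up for illustration and are not part of faster-whisper.

```python
import re
import shutil
import tempfile
from pathlib import Path


def copy_with_safe_name(src):
    """Copy src into a fresh temp directory under a simplified filename.

    Hypothetical debugging helper: runs of characters other than letters,
    digits, '_' and '-' in the stem are collapsed into a single underscore.
    """
    src = Path(src)
    safe_stem = re.sub(r"[^A-Za-z0-9_-]+", "_", src.stem).strip("_") or "audio"
    tmp_dir = Path(tempfile.mkdtemp(prefix="whisper_debug_"))
    dst = tmp_dir / f"{safe_stem}{src.suffix}"
    shutil.copy2(src, dst)  # copies the file contents and metadata
    return dst
```

Usage would be something like `segments, info = model.transcribe(str(copy_with_safe_name(cur_file)))`, with `model` and `cur_file` as in the snippet above.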

@Purfview
Contributor

Make sure you are using the latest faster-whisper version.
Also check which PyAV version is installed.
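
A minimal sketch for checking both versions from inside the virtual env. Note that PyAV is published on PyPI under the distribution name `av`, so a search for "PyAV" in `pip list` will come up empty. The helper name `installed_version` is just for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# PyAV is distributed on PyPI as "av".
for name in ("faster-whisper", "av"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run it with the same interpreter that runs the failing script, so the versions reported are the ones actually in use.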

@NeoFahrenheit
Author

faster-whisper is on 1.0.1.
I couldn't find a package named PyAV. I installed version 12.0.5.
Problem persists.

Let me know if you need more information. :)
Thanks for the help!

@Purfview
Contributor

Try to downgrade it, I don't have other ideas...

pip install --force-reinstall av==11.0.0

@NeoFahrenheit
Author

NeoFahrenheit commented Apr 25, 2024

> Try to downgrade it, I don't have other ideas...
>
> pip install --force-reinstall av==11.0.0

It didn't work.
What I tried instead was using one of those generic audio converter websites to convert my .m4a to .mp3, and it worked nicely!

Now, this is what I don't understand.
I can process local .m4a files with no problem, but not with an absolute path. But .mp3 works fine with an absolute path.

Maybe it is something related to my project?
I changed my hugging face cache to a folder in /Users/lmonteir/.HandySpeechBot/models.
It is a virtual env, created with python3 -m venv . The env is at /Users/lmonteir/Projects/.

I'm just confused, but now I have a workaround, which is nice.

Thank you for the support!
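
For anyone wanting to apply the same .m4a to .mp3 workaround locally instead of through a website, here is a hedged sketch that shells out to the `ffmpeg` command-line tool. It assumes `ffmpeg` is installed, on the PATH, and built with an MP3 encoder; the function names `build_ffmpeg_command` and `convert_to_mp3` are hypothetical, and this is not something the thread confirmed as a fix for the underlying error.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst):
    """Build the argument list for a plain ffmpeg conversion.

    Passing arguments as a list avoids shell quoting problems with
    spaces or '&' in filenames. '-y' overwrites dst if it exists.
    """
    return ["ffmpeg", "-y", "-i", str(src), str(dst)]


def convert_to_mp3(src):
    """Convert src to an .mp3 file next to it and return the new path."""
    src = Path(src)
    dst = src.with_suffix(".mp3")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```

The returned path can then be passed to `model.transcribe` in place of the original .m4a file.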
