I keep getting "Try again the sound of the audio was not clear" #20

Open
ClarabelleCheng-Yue opened this issue May 28, 2021 · 2 comments


ClarabelleCheng-Yue commented May 28, 2021

Am I missing something?
I am trying this on my Mac:

import myspsolution as mysp

p="test_audio.wav" # Audio File title
c=r"/Users/c.chengy/Desktop/Projects/my-voice-analysis/my-voice-analysis/audio_test"

mysp.myspgend(p,c)

When I look under Get Info of the audio file, I see:
Duration: 00:11
Sample rate: 44.1kHz
Bits per sample: 16

@tomtom1103

Hi, I'm also on a Mac and I've tinkered around for a while, and I think I've figured it out.

first, you want to import using
mysp = __import__("my-voice-analysis")

instead of import myspsolution as mysp.

make sure you have the file myspsolution.praat in the same directory as the python script. if you don't, download it from the repository.

I'm not sure if 44.1kHz works since the library states that the audio file has to be 44kHz, but just to be safe, try this:

y, s = librosa.load("/Users/c.chengy/Desktop/Projects/my-voice-analysis/my-voice-analysis/audio_test/test_audio.wav", sr=48000)

this uses the librosa library to load the audio data into the variable y as a numpy array, resampled to 48000 Hz, and saves the value 48000 into the variable s.

sf.write("test_audio_1.wav", y, s, "PCM_24")

this uses the soundfile library (imported as sf) to save a new .wav file called test_audio_1.wav with the data you loaded using the librosa function. the file name needs the .wav extension so soundfile can work out the format. the argument y is the audio data, and s is the sample rate of 48000. PCM_24 states the saved audio will have a bit depth of 24.
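
To double-check what a file actually contains before handing it to the library, a small sketch with Python's built-in wave module can report the sample rate and bit depth. describe_wav is a hypothetical helper name, not part of my-voice-analysis, and it only handles plain PCM .wav files.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, bits_per_sample, channels, duration_s) for a PCM .wav file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()          # samples per second, e.g. 44100
        bits = wf.getsampwidth() * 8      # sample width is reported in bytes
        channels = wf.getnchannels()      # 1 = mono, 2 = stereo
        duration = wf.getnframes() / rate # length in seconds
    return rate, bits, channels, duration
```

Calling describe_wav on both the original and the converted file shows whether the conversion changed what you expected it to change.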

Here's the important part:
p = "test_audio_1"
c = r"/Users/c.chengy/Desktop/Projects/my-voice-analysis/my-voice-analysis"

p is the name of the file the script looks for, without the .wav extension.

c is the folder the script looks for the audio file in. remove the /audio_test.

then try:
mysp.myspgend(p,c)

if I'm (hopefully) right, this should work.

I think a lot of people are confused by how the script resolves local paths. most python libraries look for files one level deeper, but not my-voice-analysis.
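
A quick pre-flight check can rule out path problems before calling myspgend. This is only a sketch: missing_inputs is a hypothetical helper, and it assumes (based on this thread, not on the library source) that the audio is expected at <c>/<p>.wav and that myspsolution.praat sits in the same folder c.

```python
import os


def missing_inputs(p, c):
    """Return the expected files that are not found on disk.

    Assumption: the audio lives at <c>/<p>.wav and the Praat script
    myspsolution.praat lives in the same folder <c>.
    """
    expected = [
        os.path.join(c, p + ".wav"),
        os.path.join(c, "myspsolution.praat"),
    ]
    return [path for path in expected if not os.path.isfile(path)]
```

An empty list means both files were found where this sketch expects them; anything else lists the paths to fix first.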

Hope this helps!

@GianniKoch

GianniKoch commented May 2, 2023

Hey there, I had the same problem but I got something working, including the file conversion needed to meet the my-voice-analysis requirements. Leaving it here so I might save future people some time. 😅
The only thing that needs to be changed is the path variable. To start analyzing, call the function analyze_audio_file.

import contextlib
import io
import os
import re

import librosa
import soundfile as sf

mysp = __import__("my-voice-analysis")
path = r"C:\calls"  # Folder that contains your audio files
temp_path = r"C:\temp"  # IMPORTANT! Put "myspsolution.praat" in this folder; the folder path must not contain spaces.
temp_name = "temp.wav"  # File name of the temporary converted file


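def analyze_audio_file(audio_file):
    # Convert the input to a temporary 44.1 kHz .wav file in temp_path.
    convert_audio_file(audio_file, path)
    # mysptotal prints its results, so capture stdout in order to parse them.
    with io.StringIO() as buf, contextlib.redirect_stdout(buf):
        mysp.mysptotal(temp_name[:-4], temp_path)
        captured_output = buf.getvalue()

    # Collect every number in the output, skipping bare "0" tokens.
    numbers = [float(num) for num in re.findall(r"\d+\.\d+|\d+", captured_output) if num != "0"]
    # Remove the temp file.
    os.remove(os.path.join(temp_path, temp_name))

    # 16 numbers are expected: 14 values plus the "25" and "75" that the regex
    # picks up from the labels f0_quantile25 and f0_quan75 (this assumes the
    # printed labels match the keys below).
    if len(numbers) != 16:
        return numbers
    return {
        "number_of_syllables": numbers[0],
        "number_of_pauses": numbers[1],
        "rate_of_speech": numbers[2],
        "articulation_rate": numbers[3],
        "speaking_duration": numbers[4],
        "original_duration": numbers[5],
        "balance": numbers[6],
        "f0_mean": numbers[7],
        "f0_std": numbers[8],
        "f0_median": numbers[9],
        "f0_min": numbers[10],
        "f0_max": numbers[11],
        "f0_quantile25": numbers[13],  # numbers[12] is the "25" from the label
        "f0_quan75": numbers[15],  # numbers[14] is the "75" from the label
    }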


def convert_audio_file(input_file, path):
    # Load the audio as mono float samples, resampled to 44.1 kHz.
    y, s = librosa.load(f"{path}/{input_file}", sr=44100)

    # Drop the last sample if the sample count is odd.
    if len(y) % 2 == 1:
        y = y[:-1]

    # Scale to the full 16-bit range and convert to integers.
    y = y * 32767 / max(abs(y))
    y = y.astype('int16')

    sf.write(f"{temp_path}/{temp_name}", y, s, "PCM_24")


# Example usage:
path = r"C:\my-audio-files-path"  # Folder that contains your audio file(s)

analyze_audio_file("my-audio-file.wav")  # Name of the audio file in the folder given by path
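
Pulling numbers out of the captured text by position is fragile, because digits inside labels are matched too. A possible alternative, sketched here under the assumption that the captured output has one "label value" pair per line, is to split each line and keep the label. parse_report is a hypothetical helper and the sample text is made up for illustration.

```python
def parse_report(text):
    """Parse 'label value' lines into a dict, skipping lines without a numeric value."""
    result = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated token
        if len(parts) != 2:
            continue  # blank lines or a lone column header
        label, value = parts
        try:
            # Normalise any extra whitespace inside the label.
            result[" ".join(label.split())] = float(value)
        except ValueError:
            continue  # last token was not a number
    return result


# Made-up sample, only to show the expected line layout.
sample = """\
                      0
number_of_pauses      3
rate_of_speech        4
f0_quantile25     110.5
"""
report = parse_report(sample)
```

Because values are looked up by label, a label that contains digits no longer shifts the positions of the values after it.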
