Why is the result blank when testing the model? #2435

Open
ccyniubi opened this issue Mar 22, 2024 · 3 comments

Comments

@ccyniubi

[Two screenshots attached; image content not available]

@MarStarck

I ran into this problem too.

@Mddct
Collaborator

Mddct commented Apr 7, 2024

Try converting the FLAC file to WAV.

@caiyuxi

caiyuxi commented May 7, 2024

I ran into a similar problem.

import torchaudio
import wenet
waveform, sample_rate = torchaudio.load("examples_1995-1836-0002.flac")
torchaudio.save(
    "sound2.wav", waveform, sample_rate,
    encoding="PCM_S")
model = wenet.load_model("english")
result = model.transcribe("sound2.wav")
print(result)

stdout is {'text': '▁UM', 'confidence': 0.6211525614053677}
The audio file comes from https://huggingface.co/spaces/wenet/wenet_demo/tree/main/examples
Besides wenet.load_model("english"), the other pretrained models show a similar problem.

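# Imports added for completeness; the wenet.cli paths are assumed from the WeNet source layout.
import os

import pytest
import requests
from wenet.cli.hub import download
from wenet.cli.model import Model


@pytest.mark.parametrize("model", [
    "gigaspeech_u2pp_conformer_libtorch.tar.gz",
    "librispeech_u2pp_conformer_libtorch.tar.gz",
])
def test_model(model):
    dest = model.split('.')[0]  # e.g. gigaspeech_u2pp_conformer_libtorch
    dataset = model.split('_')[0]  # e.g. gigaspeech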
    if not os.path.exists(dest):
        os.makedirs(dest)
    response = requests.get(
        "https://modelscope.cn/api/v1/datasets/wenet/wenet_pretrained_models/oss/tree"  # noqa
    )
    model_info = next(data for data in response.json()["Data"]
                      if data["Key"] == model)
    model_url = model_info['Url']
    download(model_url, dest=dest, only_child=True)
    model = Model(dest, gpu=-1, beam=5, resample_rate=16000)
    assert dataset in ['gigaspeech', 'librispeech']
    audio_file = "sound2.wav"
    result = model.transcribe(audio_file)
    print(result)

"gigaspeech_u2pp_conformer_libtorch.tar.gz": {'confidence': 0.6211525614053677, 'text': '▁UM'}
"librispeech_u2pp_conformer_libtorch.tar.gz": {'confidence': 0.5829262466384643, 'text': ''}
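
Both runs above read sound2.wav, so one thing that may be worth ruling out is the format of the converted file itself. The following is a minimal sketch, not part of WeNet, that uses only the Python standard library `wave` module to report the sample rate, channel count, and bit depth of a PCM WAV file. The helper names `describe_wav` and `format_mismatches` are hypothetical. The expected 16000 Hz comes from `resample_rate=16000` in the test above; mono and 16-bit are assumptions about what the models expect, not something confirmed in this thread.

```python
import wave


def describe_wav(path):
    """Return basic format facts about an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_seconds": frames / rate if rate else 0.0,
        }


def format_mismatches(info, sample_rate=16000, channels=1, bits_per_sample=16):
    """Map each differing field to (actual, expected).

    The default expected values are assumptions (see the text above),
    not documented WeNet requirements.
    """
    expected = {
        "sample_rate": sample_rate,
        "channels": channels,
        "bits_per_sample": bits_per_sample,
    }
    return {
        key: (info[key], want)
        for key, want in expected.items()
        if info[key] != want
    }
```

Calling `format_mismatches(describe_wav("sound2.wav"))` returns an empty dict when the file matches the assumed format. A non-empty result only narrows the search; it does not by itself explain the near-empty transcription.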
