
Extremely slow when evaluating #540

Open · nobel861017 opened this issue Aug 8, 2021 · 8 comments

Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@nobel861017 (Contributor)

Hi, I am running stage 3 of egs/librimix/ConvTasNet/run.sh.
I used --compute_wer 1 --eval_mode max to evaluate WER.
However, it is running extremely slowly.

2%|█▉                                                                                                  | 58/3000 [46:02<29:01:05, 35.51s/it]

At this rate it will take more than a day to complete.
I checked with nvidia-smi, and a GPU is in use, but I believe only the separation model runs on it. Looking through eval.py, I found that numpy arrays are fed to the wer_tracker, so the ASR part appears to run in CPU mode. Is there a reason it can't be computed on GPU?

By the way, I see that eval.py evaluates with the "Shinji Watanabe/librispeech_asr_train_asr_transformer_e18_raw_bpe_sp_valid.acc.best" ASR model. Is it possible to switch to other ASR models by modifying line 52?

Thanks

@nobel861017 added the bug (Something isn't working) and help wanted (Extra attention is needed) labels on Aug 8, 2021
@JusperLee
I don't think it's a bug. The mir_eval library is used to calculate the separation metrics such as SNR, and its calculations are complex and therefore time-consuming. If you only need the WER metric, you can skip computing the separation metrics.
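For context, a minimal sketch of the mir_eval call behind those metrics; the shapes and random signals here are made up purely for illustration:

```python
# mir_eval's BSS-eval is the costly part: it solves least-squares
# projections per source, in numpy on the CPU, which is why it gets
# expensive over a long test set.
import numpy as np
import mir_eval

rng = np.random.default_rng(0)
ref = rng.standard_normal((2, 16000))  # 2 reference sources, 1 s at 16 kHz
est = rng.standard_normal((2, 16000))  # 2 estimated sources

sdr, sir, sar, perm = mir_eval.separation.bss_eval_sources(ref, est)
```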

@nobel861017 (Contributor · Author)

Thanks for your reply, but I would still like the ASR model to run its forward pass on GPU.

@nobel861017 (Contributor · Author)

I left COMPUTE_METRICS as an empty list, but it is still running very slowly.

0%|                                                                                                     | 2/3000 [03:17<69:49:42, 83.85s/it]
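For reference, this is the eval.py setting in question; a minimal sketch, and the default list shown in the comment is an assumption about the recipe, not quoted from it:

```python
# In eval.py: with an empty list, only separation + ASR decoding run,
# and the mir_eval-style separation metrics are skipped entirely.
COMPUTE_METRICS = []  # default is something like ["si_sdr", "sdr", "sir", "sar", "stoi"]
```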

@nobel861017 (Contributor · Author) commented Aug 9, 2021

I changed line 231 in metrics.py to `self.asr_model = Speech2Text(**d.download_and_unpack(model_name), device='cuda')` and added `wav = torch.from_numpy(wav).cuda()` to the predict_hypothesis function between lines 344 and 345.
It can now compute on GPU.
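A self-contained sketch of those two edits, assuming metrics.py wraps ESPnet's Speech2Text roughly as below; the surrounding class and helper structure in Asteroid may differ:

```python
import torch
from espnet2.bin.asr_inference import Speech2Text
from espnet_model_zoo.downloader import ModelDownloader

model_name = (
    "Shinji Watanabe/librispeech_asr_train_asr_transformer_e18_raw_bpe_sp"
    "_valid.acc.best"
)
d = ModelDownloader()

# metrics.py line 231: ask ESPnet to place the model on the GPU.
asr_model = Speech2Text(**d.download_and_unpack(model_name), device="cuda")

def predict_hypothesis(wav):
    # Added between lines 344 and 345: move the waveform from numpy to a
    # CUDA tensor so decoding happens on the same device as the model.
    wav = torch.from_numpy(wav).cuda()
    return asr_model(wav)
```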

@mpariente (Collaborator)

Yes, ASR takes very long on CPU. I don't remember why we didn't run it on GPU in the first place; maybe memory issues? Perhaps @popcornell or @JorisCos would remember.

@popcornell (Collaborator)

At the time, full decoding on GPU was not implemented for batched inputs; I recall we had problems running on GPU for a reason like that. It looks like it now runs smoothly on GPU, thanks to the ESPnet gang, and thank you for trying this.
If you have time to add an argument like use_gpu and submit a PR, that would be great.
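One way the suggested use_gpu argument could be wired in, sketched against the same assumed structure as above; the class name here is illustrative, not the real one in metrics.py:

```python
import torch
from espnet2.bin.asr_inference import Speech2Text
from espnet_model_zoo.downloader import ModelDownloader

class WERTracker:  # illustrative name; the actual class may differ
    def __init__(self, model_name, use_gpu=False):
        # Fall back to CPU when no GPU is visible, so the flag is safe
        # to enable by default in the recipe.
        self.device = "cuda" if use_gpu and torch.cuda.is_available() else "cpu"
        d = ModelDownloader()
        self.asr_model = Speech2Text(
            **d.download_and_unpack(model_name), device=self.device
        )
```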

@nobel861017 (Contributor · Author) commented Aug 10, 2021

@popcornell @mpariente Thanks for your replies.
I think the most urgent thing we need now is batch processing for ASR, which will require substantial modifications to eval.py. I'm working on this; if you have any ideas or guidance on how to do it, please let me know.
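In case it helps, one possible shape for that change. Here `speech2text_batch` is a hypothetical stand-in for a batched ESPnet decoding interface (none existed at the time); only the data-side batching is sketched:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def decode_in_batches(wavs, speech2text_batch, batch_size=8):
    """Pad variable-length utterances into batches before decoding.

    `speech2text_batch` is hypothetical: it should take a padded
    (batch, time) tensor plus true lengths and return one hypothesis
    per utterance.
    """
    # Sort by length so each batch carries as little padding as possible.
    order = sorted(range(len(wavs)), key=lambda i: len(wavs[i]))
    hyps = [None] * len(wavs)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        batch = [torch.from_numpy(wavs[i]).float() for i in idx]
        lengths = torch.tensor([len(w) for w in batch])
        padded = pad_sequence(batch, batch_first=True).cuda()
        for i, hyp in zip(idx, speech2text_batch(padded, lengths)):
            hyps[i] = hyp
    return hyps
```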

@nobel861017 (Contributor · Author)

@popcornell I have sent a PR allowing ASR to run on GPU.
