torchvggish-gpu

Re-implementation of Google Research's VGGish model used for extracting audio features, using PyTorch with GPU support.

A re-implementation of VGGish [1], a feature-embedding frontend for audio classification models, using PyTorch with GPU support. This code is fully based on torchvggish [2].

Usage

import torch
from torchvggish_gpu import vggish
import vggish_input

# Run on the GPU when one is available; for GPU support the device must be cuda.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load the pretrained VGGish embedding model and switch it to inference mode.
embedding_model = vggish()
embedding_model.to(device)
embedding_model.eval()

# Convert the waveform into log-mel spectrogram examples and move them to the same device.
example = vggish_input.wavfile_to_examples("bus_chatter.wav")
example = example.to(device)

# Compute the audio embeddings (no gradients are needed for feature extraction).
with torch.no_grad():
    audio_embeddings = embedding_model(example)
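
The snippet above processes a single file. If you need features for several files, e.g. as input to a downstream classifier, the same pipeline can be wrapped in a small helper. The sketch below is a minimal example built only on the calls shown above; the helper name extract_embeddings and the file names are placeholders, and the 128-dimensional embedding per ~0.96 s frame follows the original VGGish design [1].

# Minimal sketch (not part of the upstream API): extract embeddings for several files.
import torch
from torchvggish_gpu import vggish
import vggish_input

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

embedding_model = vggish()
embedding_model.to(device)
embedding_model.eval()

def extract_embeddings(wav_path):
    """Return VGGish embeddings for one file as a CPU tensor of shape (num_frames, 128)."""
    examples = vggish_input.wavfile_to_examples(wav_path).to(device)
    with torch.no_grad():
        embeddings = embedding_model(examples)
    return embeddings.cpu()

# Placeholder file names -- replace with your own audio files.
features = {path: extract_embeddings(path) for path in ["bus_chatter.wav", "street_music.wav"]}
for path, emb in features.items():
    print(path, tuple(emb.shape))

The resulting tensors can then be fed to any downstream classifier, which is the role VGGish plays as a feature-embedding frontend [1].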

[1] S. Hershey et al., ‘CNN Architectures for Large-Scale Audio Classification’, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017. Available: https://arxiv.org/abs/1609.09430, https://ai.google/research/pubs/pub45611

[2] Harri Taylor et al., ‘Pytorch port of Google Research's VGGish model used for extracting audio features’, v0.1, Sep 27, 2019. Available: https://github.com/harritaylor/torchvggish/releases/tag/v0.1
