SincNet in Tensorflow

An Implementation of SincNet using Tenorflow 2.x.

Models are converted from original torch networks.
The main implementation of the sinc_conv layer is non-optimal. Instead of using loops in the call section, we used matrix multiplication and a few programming tricks that allow the hardware to run more efficiently (25 times faster).

SincNet

SincNet is a neural architecture for processing raw audio samples. It is a novel Convolutional Neural Network (CNN) that encourages the first convolutional layer to discover more meaningful filters. SincNet is based on parametrized sinc functions, which implement band-pass filters. Arxiv

Install

$ pip install sincnet-tensorflow

Usage

Demo

Training on a dummy database to check for error-free execution

A layer for Keras Functional

import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv1D
from tensorflow.keras.layers import LeakyReLU, BatchNormalization, Flatten, MaxPooling1D, Input

from sincnet_tensorflow import SincConv1D, LayerNorm


out_dim = 50 #number of classes

sinc_layer = SincConv1D(N_filt=64,
                        Filt_dim=129,
                        fs=16000,
                        stride=16,
                        padding="SAME")


inputs = Input((32000, 1)) 

x = sinc_layer(inputs)
x = LayerNorm()(x)

x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)


x = Conv1D(64, 3, strides=1, padding='valid')(x)
x = BatchNormalization(momentum=0.05)(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)

x = Conv1D(64, 3, strides=1, padding='valid')(x)
x = BatchNormalization(momentum=0.05)(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)

x = Conv1D(128, 3, strides=1, padding='valid')(x)
x = BatchNormalization(momentum=0.05)(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)

x = Conv1D(128, 3, strides=1, padding='valid')(x)
x = BatchNormalization(momentum=0.05)(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)

x = Flatten()(x)

x = Dense(256)(x)
x = BatchNormalization(momentum=0.05, epsilon=1e-5)(x)
x = LeakyReLU(alpha=0.2)(x)

x = Dense(256)(x)
x = BatchNormalization(momentum=0.05, epsilon=1e-5)(x)
x = LeakyReLU(alpha=0.2)(x)

prediction = Dense(out_dim, activation='softmax')(x)
model = tf.keras.models.Model(inputs=inputs, outputs=prediction)

model.summary()

References

@inproceedings{ravanelli2018speaker,
  title={Speaker recognition from raw waveform with sincnet},
  author={Ravanelli, Mirco and Bengio, Yoshua},
  booktitle={2018 IEEE Spoken Language Technology Workshop (SLT)},
  pages={1021--1028},
  year={2018},
  organization={IEEE}
}

@misc{SincNet,
    title   = {SincNet}, 
    author  = {Mirco Ravanelli (mravanelli)},
    year    = {2018},
    url  = {https://github.com/mravanelli/SincNet},
    publisher = {Github},
}

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
demo		demo
sincnet_tensorflow		sincnet_tensorflow
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
setup.cfg		setup.cfg
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

demo

demo

sincnet_tensorflow

sincnet_tensorflow

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

setup.cfg

setup.cfg

setup.py

setup.py

Repository files navigation

SincNet in Tensorflow

SincNet

Install

Usage

Demo

A layer for Keras Functional

References

About

Releases

Packages

Languages

License

AryaAftab/sincnet-tensorflow

Folders and files

Latest commit

History

Repository files navigation

SincNet in Tensorflow

SincNet

Install

Usage

Demo

A layer for Keras Functional

References

About

Topics

Resources

License

Stars

Watchers

Forks

Languages