Building wav2letter++

Dependencies

1. flashlight

wav2letter++ uses flashlight as its core ML backend.

Please follow flashlight's own install procedures.
wav2letter++ requires flashlight built with distributed training enabled (the default).

2. KenLM

wav2letter++ uses KenLM to allow beam-search decoding with an n-gram language model.

At least one of LZMA, BZip2, or Z is required for LM compression with KenLM.
It is highly recommended to build KenLM with position-independent code (-fPIC) so that it is compatible with the Python bindings.
After installing, run export KENLM_ROOT_DIR=... so that wav2letter++ can find it. This is needed because KenLM doesn't support a make install step.

Example build commands on Ubuntu:

sudo apt-get install liblzma-dev libbz2-dev zlib1g-dev
git clone https://github.com/kpu/kenlm.git
cd kenlm
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DKENLM_MAX_ORDER=20 -DCMAKE_POSITION_INDEPENDENT_CODE=ON
make -j16
# don't forget to export KENLM_ROOT_DIR
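
One way to set the variable is to derive it from the build directory you are already in. This is a minimal sketch, and it assumes that `KENLM_ROOT_DIR` should point at the top of the KenLM checkout (the directory that contains `build`) rather than at `build` itself:

```shell
# Run from inside kenlm/build once `make` has finished.
# Assumption: KENLM_ROOT_DIR is the checkout root, i.e. the parent of build/.
KENLM_ROOT_DIR="$(cd .. && pwd)"
export KENLM_ROOT_DIR
echo "KENLM_ROOT_DIR=${KENLM_ROOT_DIR}"
```

Adding the resulting `export` line to your shell profile keeps the setting across sessions.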

3. Additional Dependencies

The following additional packages are required:

- Any CBLAS library, i.e. at least one of these:
  - ATLAS
  - OpenBLAS
  - Accelerate
  - Intel MKL (used preferentially if present)
- FFTW3
- libsndfile
  - Should be built with Ogg, Vorbis, and FLAC libraries.
- gflags
- glog

Example (Ubuntu). The following command will install all the above packages:

sudo apt-get install libsndfile1-dev libopenblas-dev libfftw3-dev libgflags-dev libgoogle-glog-dev

4. Optional Notes

The following dependencies should be already installed for flashlight:

A C++ compiler with good C++11 support (e.g. g++ >= 4.8)
cmake >= 3.5.1, and make
CUDA >= 9.2, only if using CUDA backend
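
To check an installed tool against these minimums, a small version-comparison helper can be sketched as follows. The helper name `version_ge` is hypothetical, and the sketch assumes a `sort` that supports `-V` (as in GNU coreutils):

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare the installed cmake, if any, against the 3.5.1 minimum.
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` reads "cmake version X.Y.Z".
    cmake_version="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$cmake_version" 3.5.1; then
        echo "cmake ${cmake_version} is new enough"
    else
        echo "cmake ${cmake_version} is older than 3.5.1"
    fi
else
    echo "cmake not found"
fi
```

The same helper can be reused for the g++ and CUDA minimums by feeding it the version string each tool reports.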

The following dependencies are automatically downloaded and built by cmake:

gtest and gmock 1.8.1, only if building tests
CUB 1.8.0, only if using CUDA backend

The following dependencies are optional:

OpenMP, if present, will be used for better performance.

Build Options

| Option | Configuration | Default Value |
|---|---|---|
| W2L_BUILD_LIBRARIES_ONLY | ON, OFF | OFF |
| W2L_LIBRARIES_USE_CUDA | ON, OFF | ON |
| W2L_LIBRARIES_USE_KENLM | ON, OFF | ON |
| W2L_LIBRARIES_USE_MKL | ON, OFF | ON |
| W2L_BUILD_FOR_PYTHON | ON, OFF | OFF |
| W2L_BUILD_TESTS | ON, OFF | ON |
| W2L_BUILD_EXAMPLES | ON, OFF | ON |
| W2L_BUILD_EXPERIMENTAL | ON, OFF | OFF |
| W2L_BUILD_RECIPES | ON, OFF | ON |
| W2L_BUILD_SCRIPTS | ON, OFF | OFF |
| CMAKE_BUILD_TYPE | | Debug |
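
As an illustration of how these options are passed, the sketch below configures a Release build with CUDA turned off in the libraries and tests skipped. The particular combination is only an example; the option names come from the table above and the `-D<OPTION>=<VALUE>` form is standard CMake. The `CMAKE_FLAGS` variable is a name introduced here for illustration:

```shell
# Illustrative choice of options: Release build, no CUDA in the libraries, no tests.
CMAKE_FLAGS="-DCMAKE_BUILD_TYPE=Release -DW2L_LIBRARIES_USE_CUDA=OFF -DW2L_BUILD_TESTS=OFF"

# Run from the build directory of a wav2letter++ checkout;
# the guard skips the call anywhere else.
if [ -f ../CMakeLists.txt ]; then
    # CMAKE_FLAGS is left unquoted on purpose so it splits into separate arguments.
    cmake .. $CMAKE_FLAGS
else
    echo "not in a build directory; would run: cmake .. ${CMAKE_FLAGS}"
fi
```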

General Build Instructions

First, clone the repository:

git clone --recursive https://github.com/facebookresearch/wav2letter.git

and follow the build instructions for your specific OS.

There is no install procedure currently supported for wav2letter++. Building produces three binaries in the build directory:

Train: given a dataset of input audio and corresponding transcriptions in sub-word units (graphemes, phonemes, etc.), trains the acoustic model.
Test: performs inference on a given dataset with an acoustic model.
Decode: given an acoustic model/pre-computed network emissions and a language model, computes the most likely sequence of words for a given dataset.
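
After a build, a quick check that all three binaries are present might look like the sketch below. The function name `check_binaries` is hypothetical, and the sketch assumes the binaries sit directly in the build directory:

```shell
# check_binaries DIR: report which of the three binaries exist in DIR.
# Returns non-zero if any of them is missing or not executable.
check_binaries() {
    build_dir="$1"
    missing=0
    for bin in Train Test Decode; do
        if [ -x "${build_dir}/${bin}" ]; then
            echo "found:   ${build_dir}/${bin}"
        else
            echo "missing: ${build_dir}/${bin}"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check ./build (prints "missing" lines if nothing has been built yet).
check_binaries build || echo "build is incomplete"
```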

Building on Linux

wav2letter++ has been tested on many Linux distributions including Ubuntu, Debian, CentOS, Amazon Linux, and RHEL.

Assuming you have ArrayFire, flashlight, libsndfile, and KenLM built/installed, install the dependencies below with apt (or your distribution's package manager):

sudo apt-get update
# Package groups, in order:
#   audio encoding libs for libsndfile
#   FFTW for Fourier transforms
#   compression libraries for KenLM
#   gflags
#   glog
sudo apt-get install \
    libasound2-dev \
    libflac-dev \
    libogg-dev \
    libtool \
    libvorbis-dev \
    libfftw3-dev \
    zlib1g-dev \
    libbz2-dev \
    liblzma-dev \
    libboost-all-dev \
    libgflags-dev \
    libgflags2v5 \
    libgoogle-glog-dev \
    libgoogle-glog0v5

CMake does not reliably find MKL and KenLM on its own; export environment variables to make sure they're found. On most Linux-based systems, MKL is installed in /opt/intel/mkl. Since KenLM doesn't support an install step, after building KenLM, point CMake to wherever you downloaded and built KenLM:

export MKLROOT=/opt/intel/mkl # or path to MKL
export KENLM_ROOT_DIR=[path to KenLM]
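
Before running CMake, it can help to confirm that both variables are set and point at real directories. The following is a minimal sketch; the function name `check_dir_var` is hypothetical:

```shell
# check_dir_var NAME: succeeds when the variable NAME is set
# and names an existing directory.
check_dir_var() {
    # Indirect lookup of the variable named by $1 (POSIX-compatible).
    eval "value=\${$1:-}"
    if [ -z "$value" ]; then
        echo "$1 is not set"
        return 1
    elif [ ! -d "$value" ]; then
        echo "$1=$value is not a directory"
        return 1
    fi
    echo "$1=$value"
}

# Report on both variables without aborting the shell on failure.
check_dir_var MKLROOT || true
check_dir_var KENLM_ROOT_DIR || true
```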

Once you've downloaded wav2letter++ and built and installed the required dependencies:

# in your wav2letter++ directory
mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j4 # (or any number of threads)
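
Rather than hard-coding the thread count, it can be taken from the number of available CPUs. This is a sketch that assumes `nproc` (GNU coreutils) or `getconf _NPROCESSORS_ONLN` is available, and falls back to 4 otherwise; `JOBS` is a name introduced here:

```shell
# Pick a job count from the number of online CPUs, falling back to 4.
JOBS="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4)"
echo "using ${JOBS} parallel jobs"

# Only invoke make when a Makefile has been generated in the current directory.
if [ -f Makefile ]; then
    make -j"${JOBS}"
fi
```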

Building/Running with Docker

wav2letter++ and its dependencies can also be built with the provided Dockerfile. Both CUDA and CPU backends are supported with Docker.

To build wav2letter++ with Docker:

Install Docker and, if using the CUDA backend, nvidia-docker.

Run the docker image with CUDA/CPU backend in a new container:

# with CUDA backend
sudo docker run --runtime=nvidia --rm -itd --ipc=host --name w2l wav2letter/wav2letter:cuda-latest
# or with CPU backend
sudo docker run --rm -itd --ipc=host --name w2l wav2letter/wav2letter:cpu-latest
sudo docker exec -it w2l bash
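
If a script has to choose between the two images, one option is to probe for a GPU first. The sketch below assumes that a working `nvidia-smi` indicates a usable NVIDIA GPU and driver; the variable names `W2L_TAG` and `W2L_RUNTIME` are hypothetical. It only prints the command it would run:

```shell
# Assumption: a working nvidia-smi indicates a usable NVIDIA GPU and driver.
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    W2L_TAG="cuda-latest"
    W2L_RUNTIME="--runtime=nvidia "
else
    W2L_TAG="cpu-latest"
    W2L_RUNTIME=""
fi

# Print the matching command from the examples above instead of executing it.
echo "sudo docker run ${W2L_RUNTIME}--rm -itd --ipc=host --name w2l wav2letter/wav2letter:${W2L_TAG}"
```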

To run tests inside a container:
```
cd /root/wav2letter/build && make test
```
To build the Docker image from source (using --no-cache ensures the image picks up the latest version of flashlight if you previously built the image for an earlier version of wav2letter):
```
git clone --recursive https://github.com/facebookresearch/wav2letter.git
cd wav2letter
# for CUDA backend
sudo docker build --no-cache -f ./Dockerfile-CUDA -t wav2letter .
# for CPU backend
sudo docker build --no-cache -f ./Dockerfile-CPU -t wav2letter .
```
For logging during training/testing/decoding inside a container, use the --logtostderr=1 --minloglevel=0 flags.

Building Python bindings

Dependencies

We require Python >= 3.6 with the following packages installed:

packaging
torch

Anaconda makes this easy. There are plenty of tutorials on how to set this up.
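
To see whether the current environment already meets these requirements, a check along the following lines can be used. It is a sketch that assumes the interpreter is available as `python3` on the PATH; the function name `check_python_deps` is hypothetical:

```shell
# Report whether python3 meets the 3.6 minimum and whether each package imports.
check_python_deps() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3: missing"
        return 1
    fi
    if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 6) else 1)'; then
        echo "python3: ok"
    else
        echo "python3: older than 3.6"
    fi
    for pkg in packaging torch; do
        if python3 -c "import ${pkg}" >/dev/null 2>&1; then
            echo "${pkg}: ok"
        else
            echo "${pkg}: missing"
        fi
    done
}

check_python_deps || true
```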

Aside from the above, the dependencies for Python bindings are a strict subset of the dependencies for the main wav2letter++ build. So if you already have the dependencies to build wav2letter++, you're all set to build Python bindings as well.

The following dependencies are required to build Python bindings:

KenLM
ATLAS or OpenBLAS
FFTW3
cmake >= 3.5.1, and make
CUDA >= 9.2

Please refer to the previous sections for details on how to install the above dependencies.

The following dependencies are not required to build Python bindings:

flashlight
libsndfile
gflags
glog

Build Instructions

Once the dependencies are satisfied, simply run from wav2letter root:

cd bindings/python
pip install -e .

Note that if you encounter errors, you'll probably have to rm -rf build before retrying the install.

Advanced Options

The following environment variables can be used to control various options:

USE_CUDA=0 removes the CUDA dependency, but you won't be able to use ASG criterion with CUDA tensors.
USE_KENLM=0 removes the KenLM dependency, but you won't be able to use the decoder unless you write C++ pybind11 bindings for your own LM.
USE_MKL=1 will use Intel MKL for featurization but this may cause dynamic loading conflicts.
If you do not have torch, you'll only have a raw pointer interface to ASG criterion instead of class ASGLoss(torch.nn.Module).
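
As an example of how these variables are used, they can be set as prefixes on the install command. The sketch below builds the bindings without CUDA and without KenLM; the function name `build_bindings_minimal` is hypothetical, and the guard assumes that `bindings/python` contains a `setup.py`:

```shell
# Build the bindings without CUDA and without KenLM.
# The VAR=value prefixes are passed to this pip invocation only.
build_bindings_minimal() {
    USE_CUDA=0 USE_KENLM=0 pip install -e .
}

# Call it from bindings/python in a wav2letter checkout; the guard skips it elsewhere.
if [ -f setup.py ] && [ "$(basename "$(pwd)")" = "python" ]; then
    build_bindings_minimal
fi
```

With these settings the limitations listed above apply: no ASG criterion on CUDA tensors, and no decoder unless you provide bindings for your own LM.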
