Eric Riebling edited this page Feb 4, 2019 · 15 revisions

This is the wiki that accompanies the Speech Recognition Virtual Kitchen's GitHub repository. If you are looking for ready-to-run VMs (or if this site looks too messy), go to the main website. This repository contains the corresponding source code (if available). Typically, we provision with Vagrant, but some machines are also (or only) available as Docker containers. You can also find ready-to-run VMs on AWS.

Documentation is available for Vagrant, Docker, and AWS.

Available Machines

Using the code in these repositories, you can build a VM (or a container) locally. You can then easily make changes, or build the VM so that it runs optimally on your own cluster or various cloud providers.

  • Prix Fixe: A very basic VM for first-time speech recognition learning.
  • ivw: Interaction in Virtual Worlds.
  • ivw3: Builds on IVW; a dialog system for Harry Potter-like potion making using a stack.
  • Mario XFCE: A very simple demo of launching an XFCE panel in a window to bring up other X11 windowed applications from within a Vagrant virtual machine.
  • Mario Base Box: The Mario Vagrant base box, a vanilla Ubuntu 14.04 install configured with audio and X11 forwarding, and nothing else.
  • Kaldi Base Box: Starting with the Mario Vagrant base box, this code installs Kaldi 'from scratch'.
  • tedlium: A fully open-source Kaldi training setup that also has a graphical user interface with error analysis for speech developers.
  • eesen-transcriber: A system that transcribes audio from most audio and video formats into subtitles and/or raw text. Based on kaldi-offline-transcriber and eesen.

Add-On Experiments

Some repositories do not contain the code to rebuild a VM. Instead, they contain add-on components that you can install into an existing VM to add their functionality.