AASHISHAG/DeepSpeech-API

DeepSpeech-API

Project DeepSpeech is an open-source speech-to-text engine. It uses a model trained with machine learning techniques, based on Baidu's Deep Speech research paper, and relies on Google's TensorFlow to simplify the implementation.

DeepSpeech-API aims to let users access DeepSpeech from a web browser. You can quickly install the dependencies on any platform (Windows/IOS/Linux) and start using it over the web from a computer or a mobile device.

Installing DeepSpeech Python bindings

$ pip3 install deepspeech
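
To confirm that the install step worked, a minimal check like the following can help. It is only a sketch: it uses the standard library to ask whether a top-level package named `deepspeech` is importable in the current environment, and the helper name `is_installed` is made up for illustration.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate the named top-level package."""
    # find_spec returns None when no such top-level package is found.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    name = "deepspeech"
    if is_installed(name):
        print(f"{name} is installed")
    else:
        print(f"{name} was not found; try: pip3 install {name}")
```

If the package is reported as missing, make sure `pip3` and the `python3` interpreter you run belong to the same environment.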

Getting the pre-trained model

If you want to use the pre-trained English model for performing speech-to-text, you can download it (along with other important inference material) from the DeepSpeech releases page. Alternatively, you can run the following command to download and unzip the files in your current directory:

wget -O - https://github.com/mozilla/DeepSpeech/releases/download/v0.3.0/deepspeech-0.3.0-models.tar.gz | tar xvfz -
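
On platforms where `wget` and `tar` are not available (for example, a stock Windows setup), roughly the same download-and-unpack step can be sketched in Python with the standard library alone. The function names below are hypothetical and not part of DeepSpeech or this project; the URL is the one from the command above.

```python
import tarfile
import urllib.request

# Same release archive as in the wget command above.
MODEL_URL = (
    "https://github.com/mozilla/DeepSpeech/releases/download/"
    "v0.3.0/deepspeech-0.3.0-models.tar.gz"
)


def extract_tar_gz_stream(fileobj, dest_dir="."):
    """Unpack a gzip-compressed tar archive read sequentially from fileobj."""
    # Mode "r|gz" reads the archive as a forward-only stream, which is
    # comparable to piping the download straight into `tar xvfz -`.
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        archive.extractall(path=dest_dir)


def download_and_extract(url=MODEL_URL, dest_dir="."):
    """Download the archive at url and unpack it into dest_dir."""
    with urllib.request.urlopen(url) as response:
        extract_tar_gz_stream(response, dest_dir)


# Usage (downloads a large archive into the current directory):
# download_and_extract()
```

Because the archive is streamed, nothing is written to disk except the extracted files. As with any archive, only extract files from a source you trust.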

Running DeepSpeech-API

[Frontend](https://github.com/AASHISHAG/DeepSpeech-API/tree/master/frontend)
[Backend](https://github.com/AASHISHAG/DeepSpeech-API/tree/master/backend)
