An AI web application that uses the Google API to convert text to speech in 6 different languages and to transcribe speech to text, extracting meaningful features from audio speech such as speech sentiment and named entities (NER), and visualizing audio signals.

chemicoPy/Speech-Text-Analytic-app

Project Overview

This is an AI web application that converts text to speech and speech to text using a Google pretrained model. The goal is to extract insights from audio speech in the form of text.

App Demo

app.demo.mp4

Inspiration

Written text is the most common input to Natural Language Processing: it is widely available and comes in the form of documents, data scraped from websites, and so on. Many firms and organizations rely on processing this collected data to derive insights and better serve their customers. Speech, on the other hand, is another basic form of human language, but it is harder to process with state-of-the-art performance owing to its dependency on several factors. Many organizations, for instance those in the telecommunications industry, generate audio files from their customers in the form of complaints or opinions about a particular product or service. The major goal of this project is to leverage the Google API to transform audio speech into text and then apply the same processing steps used for any other text document to extract insights, such as specific keywords from the speech and its sentiment. Another part of this project explored using Google Translate to recognize the three major Nigerian native languages. Google does not support this feature yet, but it does recognize the Nigerian accent, which was included in the app.
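
The text-processing step described above can be illustrated with a minimal sketch. It assumes the transcript has already been produced by the speech-to-text step and uses only the Python standard library. The word lists and the function names (`tokenize`, `top_keywords`, `sentiment`) are hypothetical and chosen for illustration; they are not taken from the app, which may use different libraries and models.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists (illustration only, not the app's).
STOPWORDS = {"the", "a", "an", "is", "was", "and", "or", "to", "of", "in",
             "my", "i", "it", "for", "with", "has", "been", "this", "very"}
POSITIVE = {"good", "great", "happy", "excellent", "fast", "helpful"}
NEGATIVE = {"bad", "slow", "poor", "terrible", "unhappy", "dropping"}


def tokenize(transcript):
    """Lowercase the transcript and split it into word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def top_keywords(transcript, n=3):
    """Return the n most frequent non-stopword tokens."""
    words = [w for w in tokenize(transcript) if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(n)]


def sentiment(transcript):
    """Label the transcript by counting positive and negative lexicon hits."""
    words = tokenize(transcript)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Assumed example transcript, standing in for the speech-to-text output.
transcript = "The network has been very slow and the network keeps dropping my calls"
print(top_keywords(transcript))
print(sentiment(transcript))
```

A lexicon count like this is only a stand-in for a trained sentiment model, but it shows how a transcribed complaint can be handled like any other text document once the audio has been converted.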

Further Improvement

  • Developing a hate speech detection algorithm to classify hate speech
  • Training a neural network model to classify raw audio files into the emotions Sad, Happy, Disgust, and Fearful
