Brownie

Brownie is a post-processing project for FST-based Automatic Speech Recognition.

Goal

Many speech-to-text APIs are available today, but they are not always easy to use in practice. Recognition results are never 100% correct, especially when the API is deployed in applications such as calling a contact from a personal phone book or a domain-specific chatbot.
The goal of this project is to improve ASR accuracy on the domain-specific Named Entities (NE) that matter to the user's application.

Reference

Contextual Recovery of Out-of-Lattice Named Entities in Automatic Speech Recognition

Usage

The main data formats follow Kaldi's conventions.

Starting from FST

As with Kaldi or other FST-based decoders, the hypothesis can be treated as a lattice or an FSA/FST.
See run.py for an example of this usage.

Starting from text string

If you can only get the hypothesis as a string, for example from a public speech-to-text API or another end-to-end (E2E) decoder, you can convert it to an FST as follows:

from src.common import read_string_as_fst
from src.utils import sym2int, DataIO

# Load the hypotheses and the input word table.
hyp_list = DataIO().read_file_to_list("test.txt")
input_word_table = DataIO().read_word_table("words.txt")
for hyp in hyp_list:
    # Map the words to integer ids, then build an FST from the id sequence.
    hyp_int = sym2int(" ".join(hyp), input_word_table)
    fst = read_string_as_fst(hyp_int)
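
For readers unfamiliar with what such a conversion produces, here is a minimal sketch. It is not Brownie's implementation of sym2int or read_string_as_fst. It assumes words.txt uses the Kaldi symbol-table layout (one "word id" pair per line) and emits a linear acceptor in OpenFst text format. The helper names load_word_table and linear_fst_text, and the fallback to <unk> for unknown words, are hypothetical.

```python
def load_word_table(lines):
    """Parse Kaldi-style symbol table lines ("word id") into a dict."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            table[parts[0]] = int(parts[1])
    return table


def linear_fst_text(words, table, unk="<unk>"):
    """Return a linear acceptor over `words` in OpenFst text format.

    Each arc line is "src dst ilabel olabel"; the last line is the final state.
    Unknown words fall back to `unk` (an assumption of this sketch).
    """
    ids = [table[w] if w in table else table[unk] for w in words]
    lines = [f"{i} {i + 1} {label} {label}" for i, label in enumerate(ids)]
    lines.append(str(len(ids)))  # final state
    return "\n".join(lines)


word_table = load_word_table(["<eps> 0", "<unk> 1", "CALL 2", "EMMA 3", "ROSE 4"])
print(linear_fst_text("CALL EMMA ROSE".split(), word_table))
```

Each word becomes one arc between consecutive states, so a hypothesis of n words yields n arcs and n + 1 states.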

How to generate a personal grammar?

Define custom vocabularies

If you have used Amazon Transcribe, this format will look familiar.

Note that the header line (Phrase, SoundsLike, IPA, DisplayAs) and the Phrase column are required, and columns must be separated by commas.
The custom vocabulary table format is:

Phrase,SoundsLike,IPA,DisplayAs
世界博覽會,shi4-jie4-bo4-lan3-hui4,,世博會
一個巨星的誕生,,,一個巨星的誕生
EMMA ROSE,,EH1 M AH0 R OW1 Z,emmarose

The columns are defined as follows:

Phrase: The hypothesis string to match; it can be a single word or a sequence of words that forms an error pattern
SoundsLike: The Mandarin phrase spelled as Mandarin syllables. English phrases will be handled later
IPA: The IPA phoneme representation
DisplayAs: Defines how the word or phrase is displayed
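
The rules above (required header, required Phrase column, comma-separated fields) can be checked with a short parser. This is a sketch under stated assumptions, not the loader Brownie uses: the function name read_custom_vocab is hypothetical, and padding short rows and falling back to Phrase when DisplayAs is empty are choices made for this example only.

```python
import csv
import io

REQUIRED_HEADER = ["Phrase", "SoundsLike", "IPA", "DisplayAs"]


def read_custom_vocab(text):
    """Parse a custom vocabulary table and return a list of row dicts."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header != REQUIRED_HEADER:
        raise ValueError(f"first line must be {','.join(REQUIRED_HEADER)}")
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        # Pad short rows so trailing optional columns may be omitted (assumption).
        fields = fields + [""] * (len(REQUIRED_HEADER) - len(fields))
        row = dict(zip(REQUIRED_HEADER, fields))
        if not row["Phrase"].strip():
            raise ValueError(f"line {reader.line_num}: Phrase must be given")
        # Fall back to the phrase itself when DisplayAs is empty (assumption).
        row["DisplayAs"] = row["DisplayAs"] or row["Phrase"]
        rows.append(row)
    return rows


sample = """Phrase,SoundsLike,IPA,DisplayAs
世界博覽會,shi4-jie4-bo4-lan3-hui4,,世博會
EMMA ROSE,,EH1 M AH0 R OW1 Z,emmarose
"""
for entry in read_custom_vocab(sample):
    print(entry["Phrase"], "->", entry["DisplayAs"])
```

Validating the header up front catches a missing or reordered column before any row is read.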

Define user grammar

The grammar specifies where an entity can appear, which avoids unnecessary replacements.
One example is improving a voice assistant's accuracy when making a phone call. In this case the anchor word is "CALL", and the defined NE appears between the anchor and the end of the sentence. The user grammar format is:

<SIGMA_STAR> CALL <CONTACT> <SIGMA_STAR> </CONTACT> </s>

The tags are defined as follows:

<SIGMA_STAR>:   Special tag for filler words
<TAG> and </TAG>:    User-defined tag pair, such as <CONTACT> and </CONTACT>
<s>:    Special tag for sentence start
</s>:   Special tag for sentence end
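
To make the tag roles concrete, the following sketch labels each token of a grammar line as a special tag, an opening or closing user tag, or a plain word. It is an illustration only and not Brownie's grammar compiler: the function name classify_grammar_tokens is hypothetical, and treating every plain word as an anchor is an assumption of this example.

```python
import re

SPECIAL_TAGS = {"<SIGMA_STAR>", "<s>", "</s>"}


def classify_grammar_tokens(grammar):
    """Label each whitespace-separated token of a user grammar line."""
    labelled = []
    open_tags = []  # stack of user tags waiting for their closing tag
    for token in grammar.split():
        if token in SPECIAL_TAGS:
            kind = "special"
        elif re.fullmatch(r"</[^<>/]+>", token):
            name = token[2:-1]
            if not open_tags or open_tags.pop() != name:
                raise ValueError(f"unmatched closing tag {token}")
            kind = "close"
        elif re.fullmatch(r"<[^<>/]+>", token):
            open_tags.append(token[1:-1])
            kind = "open"
        else:
            kind = "anchor"  # plain word such as CALL
        labelled.append((token, kind))
    if open_tags:
        raise ValueError(f"unclosed tag <{open_tags[-1]}>")
    return labelled


grammar = "<SIGMA_STAR> CALL <CONTACT> <SIGMA_STAR> </CONTACT> </s>"
for token, kind in classify_grammar_tokens(grammar):
    print(f"{kind:8s}{token}")
```

The stack check rejects a grammar whose user tag pair is not closed, which is a common mistake when editing grammar lines by hand.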

Installation requirements

This project uses the OpenFst and Pynini toolkits.

OpenFst

wget http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-1.7.9.tar.gz
tar zxvf openfst-1.7.9.tar.gz && cd openfst-1.7.9
sudo ./configure --enable-grm
sudo make
sudo make install

Pynini

conda install -c conda-forge pynini=2.1.0

Others

Handling Mandarin requires a Chinese tokenizer. The project provides a high-level interface named Tokenizer() that can use either of two backends, Jieba or HanLP, to segment OOV words.

pip install -r requirements.txt
