Disfluency Detection using Auto-Correlational Neural Networks (ACNN)

This repository implements the Auto-Correlational Neural Network (ACNN) for disfluency detection in speech transcripts, as proposed in the EMNLP 2018 paper listed under Citation.

Disfluency refers to any interruption in the normal flow of speech, including false starts, corrections, repetitions and filled pauses. A disfluency typically has three parts: the reparandum, the interregnum and the repair. In the example "to Boston uh I mean to Denver", the reparandum "to Boston" is the part of the utterance that is replaced, the interregnum "uh I mean" is an optional part of the disfluent structure, and the repair "to Denver" replaces the reparandum. The fluent version is obtained by removing the reparandum and interregnum words, although disfluency detection models mainly deal with identifying and removing reparanda. The repair (e.g. "to Denver") is frequently a "rough copy" of the reparandum (e.g. "to Boston"), i.e. the two contain the same or very similar words in roughly the same order. This similarity is strong evidence of a disfluency and can help the model detect reparanda.
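The three-part structure can be made concrete with a small sketch. It is illustrative only: the token list, the hand-annotated span offsets and the `remove_disfluency` helper are assumptions made for this example and are not part of the repository.

```python
# Illustrative only: the spans below are hand-annotated for this example.
words = "to boston uh i mean to denver".split()

# (start, end) token offsets, end exclusive; hypothetical annotation.
reparandum = (0, 2)   # "to boston"
interregnum = (2, 5)  # "uh i mean"
repair = (5, 7)       # "to denver"


def remove_disfluency(tokens, spans):
    """Return the tokens that fall outside every (start, end) span."""
    drop = set()
    for start, end in spans:
        drop.update(range(start, end))
    return [tok for i, tok in enumerate(tokens) if i not in drop]


# The fluent version drops the reparandum and the interregnum.
fluent = remove_disfluency(words, [reparandum, interregnum])
print(" ".join(fluent))  # to denver
```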

ACNN Model

CNNs and RNNs are surprisingly poor at capturing "rough copy" dependencies; as a result, their performance depends heavily on hand-crafted pattern-match features. The Auto-Correlational Neural Network (ACNN) is a neural network that generalises the CNN and can learn "rough copies" without any manual feature engineering. The ACNN model uses only whole-word inputs, yet it is competitive with many complex models in the literature that rely on hand-crafted features, on additional information sources such as partial-word features (which would not be available in a realistic ASR application), or on external resources such as dependency parsers and language models.
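To see what a "rough copy" looks like as a signal, the toy sketch below builds a binary word-identity matrix between the positions of an utterance. This is not the ACNN auto-correlation operator; it is only an assumed, hand-written analogue of the kind of pattern-match feature that the model is meant to make unnecessary. The function names and the `window` parameter are hypothetical.

```python
def identity_matrix(tokens):
    """m[i][j] is 1 when tokens i and j are the same word, else 0."""
    return [[int(a == b) for b in tokens] for a in tokens]


def repeated_later(tokens, window=5):
    """Flag each token that reappears within the next `window` tokens."""
    m = identity_matrix(tokens)
    n = len(tokens)
    return [
        any(m[i][j] for j in range(i + 1, min(n, i + 1 + window)))
        for i in range(n)
    ]


tokens = "to boston uh i mean to denver".split()
flags = repeated_later(tokens)
print(list(zip(tokens, flags)))
```

In this example only the first "to" is flagged, because it is repeated in the repair, while "boston" and "denver" do not match exactly. Exact matching therefore covers only part of a rough copy, which is one reason hand-written features of this kind are brittle.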

Requirements

Python 3
TensorFlow > 0.12
NumPy

$ git clone https://github.com/pariajm/deep-disfluency-detector
$ cd deep-disfluency-detector

Data

We split the Switchboard corpus into training, dev and test sets as follows: the training data consists of all `sw[23]*.dff` files, the dev data consists of all `sw4[5-9]*.dff` files and the test data consists of all `sw4[0-1]*.dff` files. We lower-case all text and remove all partial words (e.g. "neu-") and punctuation from the data. Input and output files contain one sentence per line, and each word in the input sentence has a corresponding label in the output file (labels are "F" for fluent words and "E" for disfluent words). Since the Switchboard corpus is not open-source, we cannot release the data split that we used to train the ACNN model. Instead, we provide some sample data in `./sample_data`.
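The split and the clean-up steps described above could be sketched as follows. This is a minimal sketch and not the repository's preprocessing code: `split_of` and `clean` are hypothetical helpers, the file names are made up to fit the patterns, and treating a trailing hyphen as the mark of a partial word and dropping only punctuation-only tokens are assumptions.

```python
from fnmatch import fnmatchcase

# Glob patterns copied from the split described above.
SPLITS = [
    ("train", "sw[23]*.dff"),
    ("dev", "sw4[5-9]*.dff"),
    ("test", "sw4[0-1]*.dff"),
]


def split_of(filename):
    """Return 'train', 'dev' or 'test' for a file name, or None if unused."""
    for name, pattern in SPLITS:
        if fnmatchcase(filename, pattern):
            return name
    return None


def clean(words, labels):
    """Lower-case words and drop partial words and punctuation-only tokens,
    removing the matching labels so that both sequences stay aligned."""
    if len(words) != len(labels):
        raise ValueError("each word needs exactly one label")
    kept = [
        (word.lower(), label)
        for word, label in zip(words, labels)
        if not word.endswith("-") and any(ch.isalnum() for ch in word)
    ]
    return [w for w, _ in kept], [l for _, l in kept]


# Hypothetical file names and a hand-labelled toy sentence.
print(split_of("sw2005.dff"), split_of("sw4519.dff"), split_of("sw4019.dff"))
print(clean(["We", "neu-", ",", "need", "it"], ["F", "E", "F", "F", "F"]))
```

Filtering words and labels together keeps the one-label-per-word alignment that the input and output files rely on.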

Training

To train a new ACNN model from scratch:

$ python3 train.py --data_path=/path/to/train_and_test_files --checkpoint_dir=/dir/to/save/checkpoints_and_summaries

Prediction

To use the trained ACNN model to predict disfluency labels for your own data:

$ cd model/checkpoints
$ wget https://github.com/pariajm/deep-disfluency-detection/releases/download/v1/model-84893.data-00000-of-00001
$ wget https://github.com/pariajm/deep-disfluency-detection/releases/download/v1/model-84893.index
$ wget https://github.com/pariajm/deep-disfluency-detection/releases/download/v1/model-84893.meta
$ cd ../..
$ python3 prediction.py --input_path=/path/to/input/file --checkpoint_dir=./model --output_path=/path/to/output/file
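Once labels are available, a cleaned transcript can be recovered by dropping every word labelled "E". The sketch below assumes that the output file uses the one-sentence-per-line format described under Data, with space-separated labels. The `fluent_lines` helper is hypothetical, and the labels in the example are written by hand rather than produced by the model.

```python
def fluent_lines(word_lines, label_lines):
    """Yield each sentence with the words labelled 'E' (disfluent) removed."""
    for words, labels in zip(word_lines, label_lines):
        w, l = words.split(), labels.split()
        if len(w) != len(l):
            raise ValueError("word/label count mismatch: %r" % words)
        yield " ".join(tok for tok, lab in zip(w, l) if lab == "F")


# In practice the two sequences would be read from the input and output
# files; literal lists keep the example self-contained.
sentences = ["to boston uh i mean to denver"]
hand_labels = ["E E F F F F F"]  # only the reparandum is marked disfluent
cleaned = list(fluent_lines(sentences, hand_labels))
print(cleaned)
```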

Citation

@InProceedings{jamshidlou2018,
  author = 	{Jamshid Lou, Paria and Anderson, Peter and Johnson, Mark},
  title = 	{Disfluency Detection using Auto-Correlational Neural Networks},
  booktitle = 	{Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP2018)},
  year = 	{2018},
  pages = 	{4610--4619},
  address = 	{Brussels, Belgium},
  publisher =   {Association for Computational Linguistics},
  url       =   {https://www.aclweb.org/anthology/D18-1490.pdf}
}

Credits

The baseline CNN code is a modified version of Denny's code.

Contact

Paria Jamshid Lou paria.jamshid-lou@hdr.mq.edu.au
