Skip to content

dkaushik96/falkon

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

69 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

falkon

Towards an ecosystem of tasks related to Language Technologies

This repo combines design principles from Kaldi(https://github.com/kaldi-asr/kaldi) and festvox(https://github.com/festvox/festvox) but has quirks of its own.

The goal is to make it easier to build and compare against baselines across tasks.

Tasks:

  • Self Assessed Affect Detection from Speech
  • Image Captioning
  • Automatic Speech Recognition
  • Atypical Emotion Recognition from Speech
  • Cry Classification
  • Speech Synthesis
  • Machine Translation
  • Visual Question Answering
  • Voice Conversion
  • Spoofing Detection from Speech
  • Sentiment Analysis
  • Toxicity Detection from Text
  • Multi Target Speaker Detection and Identification

Layers -> Modules -> Models

For example,

Conv1d++ class is a layer that enables temporal convolutions during eval.
ResidualDilatedCausalConv1d is a module built on top of Conv1d++
Wavenet is a model built on top of ResidualDilatedCausalConv1d

LSTM++ class is a layer that enables learning initial hidden states based on condition.
VariationalEncoderDecoder is a module built on top of LSTM++
ImageCaptioning is a model built on top of VariationalEncoderDecoder

src.nn hosts all of these.

The directoy 'tasks' contains the individual tasks. Updated a sample speech task. The timeline on this repo looks like end of Summer 2018.

About

Towards an ecosystem of tasks related to Language Technologies

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 83.6%
  • Shell 16.4%