End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

upskyy/Automatic-Speech-Recognition-Models

Automatic Speech Recognition Models

This repository implements end-to-end (E2E) automatic speech recognition (ASR) models in PyTorch.
The models are trained on the KsponSpeech dataset, and Hydra manages all training configurations.

Installation

pip install -e .   

Preparation

You can download the KsponSpeech dataset from AI-Hub; it is available to anyone who applies for access. After downloading, the dataset was preprocessed through here.

Usage

- Training

You can choose from several models and training options.

  • Deep Speech2 Training
$ python main.py \
    model=deepspeech2 \
    train=deepspeech2_train \
    train.dataset_path=$DATASET_PATH \
    train.audio_path=$AUDIO_PATH \
    train.label_path=$LABEL_PATH
  • Listen, Attend and Spell Training
$ python main.py \
    model=las \
    train=las_train \
    train.dataset_path=$DATASET_PATH \
    train.audio_path=$AUDIO_PATH \
    train.label_path=$LABEL_PATH
  • Joint CTC-Attention Listen, Attend and Spell Training
$ python main.py \
    model=joint_ctc_attention_las \
    train=las_train \
    train.dataset_path=$DATASET_PATH \
    train.audio_path=$AUDIO_PATH \
    train.label_path=$LABEL_PATH
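
The dotted arguments in the commands above (for example `train.dataset_path=$DATASET_PATH`) are Hydra-style overrides: each one replaces a single nested value in the configuration. The sketch below illustrates that idea using only the standard library. It is not Hydra's implementation, and the `apply_overrides` helper, the default values, and the example paths are all hypothetical. It also does not model config-group selection such as `model=las` or `train=las_train`, where Hydra loads a whole config file rather than assigning a string.

```python
from copy import deepcopy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted `key=value` overrides applied.

    Hypothetical helper for illustration; Hydra does this (and much more)
    internally when it parses the command line.
    """
    result = deepcopy(config)
    for item in overrides:
        key, _, value = item.partition("=")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            # Walk down the nested dict, creating levels that do not exist.
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


# Assumed defaults; the real values live in the repository's config files.
defaults = {
    "train": {"dataset_path": "", "audio_path": "", "label_path": ""},
}

# Placeholder paths standing in for $DATASET_PATH and $LABEL_PATH.
config = apply_overrides(
    defaults,
    [
        "train.dataset_path=/data/kspon",
        "train.label_path=/data/kspon/labels.csv",
    ],
)
```

Keys that are not overridden keep their defaults, which is why each training command only needs to pass the paths that differ per machine.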

- Evaluation

$ python eval.py \
    eval.dataset_path=$DATASET_PATH \
    eval.audio_path=$AUDIO_PATH \
    eval.label_path=$LABEL_PATH \
    eval.model_path=$MODEL_PATH

License

MIT License

Copyright (c) 2021 Sangchun Ha

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
