This repository has been archived by the owner on Jun 20, 2022. It is now read-only.

Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks

snakers4/russian_stt_text_normalization



Russian STT Text Normalization

Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks.

Requirements

  • Python >= 3.6
  • PyTorch >= 1.4 for s2s pipeline
  • tqdm for progress bar

pip install torch
pip install tqdm
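Before running the pipeline, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch, not part of the repository: `check_requirements` is a hypothetical helper name, and it only verifies the Python version and that the modules are importable. It does not check that the installed PyTorch is at least 1.4.

```python
import importlib.util
import sys


def check_requirements(min_python=(3, 6), modules=("torch", "tqdm")):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration only; it is not provided by the repository.
    """
    problems = []
    # Compare the running interpreter against the minimum version from the README.
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        needed = ".".join(str(part) for part in min_python)
        problems.append(f"Python {needed}+ required, found {found}")
    # find_spec returns None when a top-level module cannot be located.
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

An empty result only means the packages can be found, so a version mismatch in PyTorch would still need to be checked separately.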

Usage

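from normalizer import Normalizer

# Sample input: "Since 12.01.1943, the area of the village council has been 1785.5 ha."
text = 'С 12.01.1943 г. площадь сельсовета — 1785,5 га.'

norm = Normalizer()
result = norm.norm_text(text)
print(result)
# Output (wrapped here for readability; the date, number, and unit are spelled out in words):
# С двенадцатого января тысяча девятьсот сорок третьего года площадь сельсовета
# — тысяча семьсот восемьдесят пять целых и пять десятых гектара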
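To normalize many strings, for example the lines of a transcript, one option is to create a single `Normalizer` and reuse its `norm_text` method for every line. The sketch below is an illustration, not part of the repository: `normalize_lines` is a hypothetical helper, and it assumes only that the function passed in takes one string and returns one string, as `norm_text` does in the Usage example above. The demo uses `str.upper` as a stand-in so the block runs without the model.

```python
from typing import Callable, Iterable, List


def normalize_lines(lines: Iterable[str], norm_fn: Callable[[str], str]) -> List[str]:
    """Apply norm_fn to each non-blank line; blank lines become empty strings.

    Hypothetical helper for illustration; norm_fn is any str -> str callable,
    for example the norm_text method of a Normalizer instance.
    """
    results = []
    for line in lines:
        stripped = line.strip()
        # Skip the normalizer call for blank lines so line positions stay aligned.
        results.append(norm_fn(stripped) if stripped else "")
    return results


if __name__ == "__main__":
    # Stand-in normalizer (NOT the real one) so the example runs anywhere.
    demo = normalize_lines(["first line", "", "  second line  "], str.upper)
    print(demo)  # ['FIRST LINE', '', 'SECOND LINE']

    # With the repository on the import path, the real call would look like:
    #   from normalizer import Normalizer
    #   norm = Normalizer()
    #   normalized = normalize_lines(open("input.txt", encoding="utf-8"), norm.norm_text)
```

Because the output list has the same length as the input, the normalized lines can be matched back to the originals by index.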
