Kathabhidhana: Audio recording for Odia Wiktionary

What comes to mind when you think of a dictionary? A huge, boring book that you never wanted to open? Or maybe a mobile app that you open when struggling to understand a few new words in a write-up? Now think of a dictionary that also pronounces the words rather than just showing them in the [International Phonetic Alphabet](https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) (IPA).

Wikipedia has a multilingual sister project called Wiktionary. Although it is easy to hear a word's pronunciation by searching for it on Google with the suffix "meaning", there are not many [open-licensed](https://en.wikipedia.org/wiki/Open_Content_License) audio recordings that you can hear, download, and even use in your own work. Kathabhidhana is a community project led by [Subhashish Panigrahi](http://meta.wikimedia.org/wiki/User:Psubhashish/) to create an open source solution for recording large batches of words and then uploading them under open licenses so that they can be useful for projects like Wiktionary. The project draws its inspiration largely from another open source [software](https://github.com/tshrinivasan/voice-recorder-for-tawictionary) created by [Shrinivasan T](https://github.com/tshrinivasan).

Currently, several Odia-language words are being [recorded](https://commons.wikimedia.org/wiki/Category:Odia_pronunciation), uploaded to [Wikimedia Commons](https://commons.wikimedia.org), and used in [Odia Wiktionary](https://or.wiktionary.org), the Odia-language version of Wiktionary. The audio library serves more than one purpose: apart from using the recordings on Wiktionary, we also aim to use them in [Natural Language Processing](http://en.wikipedia.org/wiki/Natural_Language_Processing) (NLP) projects. You are free to use any resource available on this page [with attribution](https://github.com/OdiaWikimedia/Kathabhidhana/blob/master/README.md#attribution).

[Tutorial to learn and use it](http://www.youtube.com/watch?v=zd4KNbNX4_Y) (in English)
[Audio recordings of words that were made using this tool](https://commons.wikimedia.org/wiki/Category:Audio_files_created_using_Kathabhidhana)
An ideal format needed for uploading multiple files

An Odia version of the resources and tutorial is available here. We are currently working on more tutorials covering how to improve your home studio setup (if you have access to a proper recording studio, please do use it), tips and tricks for batch renaming files, cleaning up recordings with open source tools like [Audacity](http://www.audacityteam.org/download/), setting up files for batch upload to Wikimedia Commons, and more. So stay tuned.

Prerequisites

Linux or macOS
Linux running in a virtual machine

or

For Kathabhidhana for iOS

iOS
An app called Workflow

How to execute?

(You need to run the commands on Linux or macOS, or on Linux in a [virtual machine](https://en.wikipedia.org/wiki/Virtual_machine) if you're on Windows.) [Read in Odia](https://goo.gl/hqXeG3)

Fill in the words you want to record in a text file named "file".
Run the command below:

python voice-record.py 2> err

This will record the sounds in OGG and WAV formats.
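
Preparing the word list can be sketched in a few lines of Python. This is only an illustration under assumptions, not part of the project: it assumes the recorder expects one word per line in "file", and the helper names `clean_word_list` and `write_word_file` are hypothetical. It trims whitespace, drops blank lines and duplicates, and normalizes the text to Unicode NFC so that visually identical Odia strings compare equal.

```python
import unicodedata
from pathlib import Path


def clean_word_list(lines):
    """Return unique, non-empty words in their original order."""
    seen = set()
    words = []
    for line in lines:
        # Normalize to NFC so equivalent character sequences compare equal.
        word = unicodedata.normalize("NFC", line.strip())
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words


def write_word_file(words, path="file"):
    """Write one word per line (assumed format) to the word list."""
    Path(path).write_text("\n".join(words) + "\n", encoding="utf-8")
```

Calling `write_word_file(clean_word_list(raw_lines))` would then overwrite "file" in the current directory with the cleaned list, so keep a copy of the original if you need it.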

To upload all the OGG files to Wikimedia Commons:
a) Edit the file mediawiki-uploader.py and fill in the Commons API URL, username, and password.
b) Run the command below:

python mediawiki-uploader.py
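
For readers curious about what an upload to Commons involves, here is a rough sketch of the form fields that a MediaWiki `action=upload` request carries. It does not describe what mediawiki-uploader.py actually does, and it sends nothing over the network. The function names, the "Or-<word>.ogg" file naming, and the description layout are assumptions made for illustration; the category and license are the ones mentioned on this page. A real upload also needs a logged-in session, a CSRF token from the API, and the audio file attached as multipart form data.

```python
def build_description(word, author, date):
    """Build wikitext for the file description page (layout is an assumption)."""
    lines = [
        "=={{int:filedesc}}==",
        "{{Information",
        "|description={{en|1=Pronunciation of the Odia word " + word + "}}",
        "|date=" + date,
        "|source={{own}}",
        "|author=" + author,
        "}}",
        "=={{int:license-header}}==",
        "{{cc-by-sa-4.0}}",
        "[[Category:Odia pronunciation]]",
    ]
    return "\n".join(lines)


def build_upload_fields(word, author, date, token):
    """Return the non-file form fields for an action=upload request."""
    return {
        "action": "upload",
        "format": "json",
        # Hypothetical naming scheme; follow whatever Commons convention applies.
        "filename": "Or-" + word + ".ogg",
        "comment": "Pronunciation recording of " + word,
        "text": build_description(word, author, date),
        # CSRF token obtained from the API after logging in.
        "token": token,
    }
```

Keeping the field construction separate from the network call makes it easy to inspect the generated description before anything is uploaded.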

Attribution

Project led by Subhashish Panigrahi. All media and text content is available under a [CC-BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) license.
All software components are licensed under the [GNU General Public License (GPL) version 3](https://www.gnu.org/licenses/gpl.html).
This project and part of the documentation are based on the [Voice recorder for Tawiktionary](https://github.com/tshrinivasan/voice-recorder-for-tawictionary) project created by [Shrinivasan T](https://github.com/tshrinivasan). (Please attribute Shrinivasan T if you're making a derivative of the software.)

Other resources

[Pronuncify by Asaf Bartov](https://github.com/abartov/pronuncify), a similar command line tool for Linux, Mac, and Windows.
[Sample Wikimedia Commons description](https://docs.google.com/spreadsheets/d/1Vh08Dd6V743Q58ceCnNLc9BASaZQMAsGu1BOaa_dMQQ/pub?output=ods) that you're free to download, fill in with your details, and use.

Shoutouts

Rezwan. "A New Audio Uploading Tool for Crowdsourced Wiktionary Project in Odia Language". Global Voices (February 13, 2017)
