pho-diff

Visually compare the phonetic inventories of two languages.

Motivation

When learning a new language, you need to know the new sounds: the ones that aren't a part of your native sound system. I wanted a tool to quickly compare the different sounds of two languages. pho-diff presents two languages in the form of a diff of their IPA charts.

For two languages a and b, if a letter (which represents a distinctive sound) is absent in a's chart but present in b's, it is colored green. Likewise, if a letter is absent in b but present in a, the letter is colored red (see Examples). pho-diff also outputs a Clojure map with additional information (other sounds not on the chart, links to the source URLs, etc.).
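The coloring rule can be pictured as a pair of set differences, treating sounds found only in b as additions (green) and sounds found only in a as removals (red), as in a textual diff. The sketch below is only a conceptual illustration: `classify-sounds`, `toy-a`, and `toy-b` are hypothetical names, the inventories are made-up toy sets rather than data from the archive, and pho-diff itself works on the chart images, so its real implementation may look quite different.

```clojure
(require '[clojure.set :as set])

;; Hypothetical helper for illustration only; not part of pho-diff.
;; a and b are sets of IPA symbols (strings).
(defn classify-sounds
  [a b]
  {:green  (set/difference b a)     ; only in b: new sounds for a speaker of a
   :red    (set/difference a b)     ; only in a: sounds that b lacks
   :shared (set/intersection a b)}) ; in both: nothing new to learn

;; Toy inventories, assumed for the example (not real language data).
(def toy-a #{"p" "t" "k" "θ"})
(def toy-b #{"p" "t" "k" "ɲ"})

(classify-sounds toy-a toy-b)
;; :green is #{"ɲ"}, :red is #{"θ"}, :shared holds "p", "t", and "k"
```

Swapping the arguments swaps the green and red sets, which is why the order of a and b on the command line matters.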

pho-diff can be useful for second-language learners working on pronunciation. Although IPA resources are biased toward English speakers, pho-diff can also help with developing resources for ESL: it can identify sounds in someone's first language that do not exist in English, or sounds in English that do not exist in someone's first language.

pho-diff can diff any two languages as long as both are in the Speech Accent Archive; paired with other tools, this can be helpful for languages that have fewer common resources or speakers.

Example Use Cases

Use pho-diff to find the new sounds in the target language (colored green) and use the Interactive IPA Chart to hear roughly the sounds that the symbols represent.

An English speaker learning Spanish

lein run "english" "spanish"

A Spanish speaker learning English

lein run "spanish" "english"

A Korean speaker learning Esperanto

lein run "korean" "esperanto"

A German speaker learning Tagalog

lein run "german" "tagalog"

Resources

Installation

Download from https://github.com/sgepigon/pho-diff.

git clone https://github.com/sgepigon/pho-diff.git

Usage

lein run "a" "b"

a and b should be languages from the Speech Accent Archive. See Bugs for caveats.

Examples

lein run "english" "tagalog"
{:keys [:a "english" :b "tagalog"],
 :charts
 {:cons "resources/output/english-tagalog-cons.gif",
  :vowels "resources/output/english-tagalog-vowels.gif"},
 :other-sounds
 {:a #{"labio-velar voiced central approximant [w]" "5 diphthongs"},
  :b #{"labio-velar central approximant [w]"}},
 :sources
 {:a
  "http://accent.gmu.edu/browse_native.php?function=detail&languageid=18",
  :b
  "http://accent.gmu.edu/browse_native.php?function=detail&languageid=64"}}
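
The returned value is an ordinary Clojure map, so it can be explored with core functions. The sketch below copies the output above into a var named `result` (a name chosen for this example) and pulls out a few fields; how you obtain the map from pho-diff in your own code is left open here.

```clojure
;; The map printed above, bound to a var for the example.
(def result
  {:keys [:a "english" :b "tagalog"]
   :charts {:cons   "resources/output/english-tagalog-cons.gif"
            :vowels "resources/output/english-tagalog-vowels.gif"}
   :other-sounds {:a #{"labio-velar voiced central approximant [w]" "5 diphthongs"}
                  :b #{"labio-velar central approximant [w]"}}
   :sources {:a "http://accent.gmu.edu/browse_native.php?function=detail&languageid=18"
             :b "http://accent.gmu.edu/browse_native.php?function=detail&languageid=64"}})

;; :keys is a flat vector of key/value pairs; turn it into a map.
(def languages (apply hash-map (:keys result)))

(:b languages)                             ; => "tagalog"
(get-in result [:charts :vowels])          ; => "resources/output/english-tagalog-vowels.gif"
(count (get-in result [:other-sounds :a])) ; => 2
```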

(Images: consonant diff chart and vowel diff chart.)

Bugs

If either language a or b lacks an IPA chart, pho-diff will return nil.
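
Code that consumes the result should therefore check for nil before reading chart paths. A minimal sketch, assuming the result is either nil or a map shaped like the one in Examples; `summarize-result` is a hypothetical helper, not a pho-diff function:

```clojure
;; Hypothetical helper for illustration only; not part of pho-diff.
(defn summarize-result
  [result]
  (if (nil? result)
    "No diff: at least one language lacks an IPA chart."
    (str "Consonants: " (get-in result [:charts :cons])
         ", vowels: "   (get-in result [:charts :vowels]))))

(summarize-result nil)
;; => "No diff: at least one language lacks an IPA chart."

(summarize-result {:charts {:cons "a-b-cons.gif" :vowels "a-b-vowels.gif"}})
;; => "Consonants: a-b-cons.gif, vowels: a-b-vowels.gif"
```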

Not all languages listed on the Speech Accent Archive have an inventory chart. For some, the page says "Coming soon…" instead, e.g. "malagasy", "yapese", "sotho", "hmong daw", and "tamajeq".

"newari" does not say "Coming soon…", but it is still missing the charts. "newari" points to "newar", which does say "Coming soon…".

Some languages do have IPA charts, but the charts are slightly misaligned, e.g. "yupik", "mandinka", and "swiss german". This misalignment results in ugly diffs:

(Images: English-Mandinka consonant chart and English-Mandinka vowel chart.)

The diff is usable, but still an eyesore.

Built With

License

Copyright © 2017 Santiago Gepigon III

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.
