tdaneyko/morphgen

MorphGen

MorphGen is a language-independent morphological generator, i.e. it can generate all possible fully inflected forms of a lemma. It is driven by user-provided rule sets and is therefore not specialized for a single language, but can serve as a morphological generator for any language. Though its rules look similar to regular expressions and are interpreted by an automaton-like structure, MorphGen is not a finite state transducer. Its ability to store matches in memory and freely reinsert them later allows it to easily model morphological processes that are inherently difficult to express with a finite state machine, such as reduplication and metathesis.
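To give a rough idea of what "storing matches and reinserting them" means, here is a minimal sketch using plain java.util.regex capture groups. It does not use MorphGen's rule format or API; the class name CaptureReinsertSketch, its methods, the simplified five-vowel patterns and the example stems are all assumptions made for illustration only.

```java
import java.util.regex.Pattern;

// Illustration only: plain java.util.regex, NOT MorphGen's rule syntax or API.
// All names and patterns below are hypothetical.
final class CaptureReinsertSketch {

    // Capture the whole stem so it can be emitted twice.
    private static final Pattern WHOLE = Pattern.compile("^(.+)$");

    // Capture the initial consonants plus the first vowel, then the rest.
    private static final Pattern FIRST_CV = Pattern.compile("^([^aeiou]*[aeiou])(.*)$");

    // Capture a stem-final vowel + consonant pair so the two can be swapped.
    private static final Pattern FINAL_VC = Pattern.compile("^(.*)([aeiou])([^aeiou])$");

    // Full reduplication: the stored stem is reinserted after a hyphen.
    static String fullReduplication(String stem) {
        return WHOLE.matcher(stem).replaceFirst("$1-$1");
    }

    // Partial reduplication: the stored first syllable is reinserted before the stem.
    static String partialReduplication(String stem) {
        return FIRST_CV.matcher(stem).replaceFirst("$1$1$2");
    }

    // Metathesis: the stored final vowel and consonant are reinserted in reverse order.
    static String metathesis(String stem) {
        return FINAL_VC.matcher(stem).replaceFirst("$1$3$2");
    }

    public static void main(String[] args) {
        System.out.println(fullReduplication("buku"));     // prints "buku-buku"
        System.out.println(partialReduplication("sulat")); // prints "susulat"
        System.out.println(metathesis("arat"));            // prints "arta" (made-up stem)
    }
}
```

Each method captures part of the input and writes it back in a different position or more than once. A plain finite state machine has no such memory, which is why copying a stem of unbounded length is awkward to express with one.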

For a description of the rule format, have a look at section 4.2 of my term paper on the Malayalam Glosser, where MorphGen was first put to use. The files for Malayalam are also included in this repository under src/main/resources as an example.

A .jar library version of MorphGen 1.0 is downloadable here.

About

A language-independent morphological generator based on user-defined rule files parsed into an automaton-like structure.
