foreverbell/parakeet

parakeet

Disclaimer

I know nothing about parsing, Haskell, or Japanese. This repository is purely for fun and serves as a test bed for some Haskell experiments.

Introduction

Build the most convenient tool for Japanese beginners!

Input:

Output:

Full output:

Romaji should follow Hepburn romanization.

An experimental online demo powered by GHCJS is also available; see here for more details.

Installation

Building requires GHC 7.10.2 or later.

$ cabal install parakeet.cabal

For stack users,

$ stack init
$ stack install

or

$ stack install --stack-yaml=stack-ghcjs.yaml

if you want to compile to JavaScript.

Development

$ cabal sandbox init
$ cabal install --only-dependencies
$ cabal build

Usage

  • XeLaTeX package dependencies: xeCJK, ruby
  • Font dependencies: MS Mincho, MS Gothic
$ parakeet -j Butter-fly.j -r Butter-fly.r -o Butter-fly.tex
$ xelatex Butter-fly.tex

or directly,

$ parakeet -j Butter-fly.j -r Butter-fly.r -o Butter-fly.pdf

Make sure that both input files are encoded in UTF-8.

Limitations

  • The parsing algorithm is essentially LL(infinity), so its worst-case running time is exponential. The program may become extremely slow when a long line of romaji contains a mistake. Proper use of the separator $ avoids this trap.
  • The long vowel ō is ambiguous in Hepburn romanization: it can stand for either ou or oo. To resolve this, we always pick the former. For example, 東京 (Tōkyō) is correctly translated to とうきょう, while 大阪 (Ōsaka) is wrongly translated to おうさか.
  • In romanization, zu and ji each correspond to two hiragana: ず or づ, and じ or ぢ, respectively. We always pick ず and じ when translating zu and ji into furigana. If you want づ or ぢ, write du (dzu) or di (dji) instead.
  • Parse error messages are unfriendly.
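
The ō and zu/ji rules above can be illustrated with a small Haskell sketch. This is not parakeet's actual code: the names expandLongO, allReadings and ambiguousKana are hypothetical and exist only for this example, and the long-vowel functions produce spelled-out romaji rather than hiragana.

```haskell
-- Illustrative sketch only; every name here is hypothetical,
-- not taken from the parakeet sources.

-- | Apply the "always pick ou" rule to every long ō in a romaji string.
expandLongO :: String -> String
expandLongO = concatMap step
  where
    step 'ō' = "ou"
    step 'Ō' = "Ou"
    step c   = [c]

-- | Enumerate every spelling a romaji string could stand for when each
-- ō is read as either ou or oo. The first result is the "ou" choice.
allReadings :: String -> [String]
allReadings = fmap concat . traverse step
  where
    step 'ō' = ["ou", "oo"]
    step 'Ō' = ["Ou", "Oo"]
    step c   = [[c]]

-- | Hiragana chosen for the ambiguous syllables: zu and ji map to
-- ず and じ, while づ and ぢ need the explicit spellings.
ambiguousKana :: String -> Maybe String
ambiguousKana s = lookup s table
  where
    table =
      [ ("zu", "ず"), ("ji", "じ")
      , ("du", "づ"), ("dzu", "づ")
      , ("di", "ぢ"), ("dji", "ぢ")
      ]
```

Loaded in GHCi, expandLongO "Ōsaka" gives "Ousaka", which is why 大阪 ends up with the wrong furigana, whereas allReadings "Ōsaka" also contains the intended "Oosaka". A warning for ambiguous ō, as listed in the TODO section, could be raised whenever allReadings returns more than one candidate.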

Documentation

Since I haven't found any potential users, there is no documentation available. Please create an issue if you have trouble using it.

TODO List

  • Ambiguous ō warning.
  • Extended katakana support.
  • Wiki for Japanese lexical rules.
