Skip to content

An alternative grammar for parsing mathematical expressions for LaTeXML, implemented in Marpa::R2

License

Notifications You must be signed in to change notification settings

dginev/LaTeXML-Plugin-MathSyntax

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

85 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Note April 2019: This plugin has been frozen in the last 5 years and has fallen out of alignment with the latexml master. Gradual work on repairing and modernizing it has begun, but third-party use is not advised until a new release is made.

An alternative grammar for parsing mathematical expressions for LaTeXML, implemented in Marpa::R2.

The grammar is loosely motivated by The Structure of Mathematical Expressions, and explores designing for ambiguous parses with follow-up semantic pruning.

perl Makefile.PL ; make
make test # Tests LaTeXML::MathSyntax
make test_classic # Tests LaTeXML::MathGrammar

Aims to also provide:

  • A comprehensive test suite for parsing mathematical expressions into content MathML
  • An automated test harness for developing math grammars for LaTeXML
  • Convenient domain-specific languages for writing down tests (e.g. a write-efficient serialization for Content MathML)

Dataset

  • Gradually formalising all of DLMF's ~10,000 equations.
  • Intending to formalize ~1,000 or more formulas from arXiv.org

Annotations

  • Each item in the test suite consists of a <TeX formula, Content MathML tree> tuple.
  • The TeX can be any input written in a real-world mathematical document.
  • The Content MathML tree must be written in the strict CMML syntax, complying with existing content dictionaries.
  • The CMML tree must avoid large semantic rewrites of the input expression.

Example: In the case of an ellipssis (...) the annotation should ideally preserve the ellipsis explicitly via a special underspecified symbol, such as :underspecified:ellipsis. Full formalization is not only difficult, but also not always possible, for ellided content. The rule of thumb should always be to consider whether a math grammar should realistically be expected to perform the analysis necessary for the rewrite.

About

An alternative grammar for parsing mathematical expressions for LaTeXML, implemented in Marpa::R2

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages