Skip to content
/ lex Public

Linear expressions - simple and fast pattern matching

Notifications You must be signed in to change notification settings

jbee/lex

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

38 Commits
 
 
 
 
 
 
 
 

Repository files navigation

▓   ▓▓▓ ▓ ▓
▓   ▓   ▓ ▓ 
▓   ▓▓▓  ▓
▓   ▓   ▓ ▓
▓▓▓ ▓▓▓ ▓ ▓ Linear expressions

First match pattern matching with a tiny bytecode instructed matching machine.

Spec      http://jbee.github.io/lex/
Feedback  http://github.com/jbee/lex/issues

Copyright (c) 2017 Jan Bernitt


________________________________________________________________________________

IMPLEMENTAIONS

Java: basic ~100 LOC, optimized ~150 LOC


________________________________________________________________________________

SETS         {...}

{abc}        a set of bytes "a","b" and "c"
{^abc}       a set of any byte but "a","b" and "c"
{a-c}        a set of "a", "b" and "c" given as a range
{?}          a set of *all* non ASCII bytes


SPECIAL SETS

# = {0-9}    any ASCII digit
@ = {a-zA-Z} any ASCII letter
$            any ASCII new line (\n or \r)
_            any ASCII whitespace character
^            any byte that is not an ASCII whitespace character
?            any single byte


REPETITION   x+

+            try previous set, group or literal again


GROUPS       (...), [...], `...`

(abc)        a group with sequence "abc" that *must* occur
[abc]        a group with sequence "abc" that *can* occur
`            exit group, unless first in group (used to embed)


SCANNING     ~x

~            skip until following set, group or literal matches


ESCAPING

\            escape following byte to a literal (also in set)

Sets can also be used to match most of the instruction symbols literally.
For example {~} is similar to \~ or {\~}.

Any other byte (not {}()[]#@^_$+~?`\) is matched literally. 
Escaping can be applied to any byte even if it is not needed.

________________________________________________________________________________

EXAMPLES

####/##/##        a date of format yyyy/mm/dd
##:##[:##]        a time of format hh:mm or hh:mm:ss
#+[.#+]           a simple floating point number with optional decimals
"{^"}+"           a quoted string using a set to find the end
"~"               a quoted string using a scan to find the end
"~({^\\}")        a quoted string using a scan with escaping support
\$@}a-zA-Z0-9_{+  a php style identifier
[\+]#+[{ -}#+]+   international phone numbers
~(Foo)            searching for "Foo" (e.g. in a file)
~(<h#>)           searching for "<h0>" to "<h9>"
~(<h{1-6}>)       searching for "<h1>" to "<h6>"
~(Foo~(Bar))      searching for "Foo"s followed by "Bar"s

About

Linear expressions - simple and fast pattern matching

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages