doc updates
mhhollomon committed Oct 1, 2020
1 parent 64652c4 commit 40aba58
Showing 2 changed files with 76 additions and 46 deletions.
111 changes: 65 additions & 46 deletions README.md
@@ -1,22 +1,13 @@
# Yet Another LR Parser Generator
## Yalr Release 0.2.0
## Yalr Release 0.2.1
[![Github Releases](https://img.shields.io/github/release/mhhollomon/yalr.svg)](https://github.com/mhhollomon/yalr/releases)
[![Build Status](https://api.cirrus-ci.com/github/mhhollomon/yalr.svg)](https://cirrus-ci.com/github/mhhollomon/yalr)
[![Github Issues](https://img.shields.io/github/issues/mhhollomon/yalr.svg)](http://github.com/mhhollomon/yalr)
[![GitHub License](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/mhhollomon/yalr/master/LICENSE)

## Release Highlights

- Case insensitive lexer. Turn on case folding for the entire lexer or only
select terminals.

- Precedence and associativity markers - you can now give rules and terminals
precedence in order to help resolve grammar ambiguities.

- Better error messages - The error message system has been completely
revamped. The message should now be cleaner and easier to understand.

- Autogenerated `main()`. Yalr will create a main for you, if you want.
- Small bug fixes and documentation clean ups.

For more details, see below and the [Release Notes](RELEASE_NOTES.md)

@@ -114,7 +105,7 @@ This statement may only appear once in the file.

### Option statements

A number settings can be changed via an option statement. The general syntax
A number of settings can be changed via the option statement. The general syntax
is:

```
@@ -126,8 +117,7 @@ The available options are:
option-id | setting
----------|---------
lexer.case| default case matching. Setting is `cfold` or `cmatch`
code.main | When set to true, will case the generator to include a simple
main() function (See below).
code.main | When set to true, will cause the generator to include a simple main() function (See below).

### Terminals

@@ -136,7 +126,7 @@ There are two types of terminals - "parser" terminals and "lexer" terminals.
#### Parser Terminals

Parser Terminals are those terminals that are used to create the rules in
grammar. These are the terminals that are return by the lexer.
the grammar. These are the terminals that are returned by the lexer.

Parser Terminals are defined by the `term` keyword.

@@ -188,7 +178,7 @@ term PRINT_KEYWORD 'print' @cfold ;
```

The computation is given as an action encased in `<%{ ... }%>` . If an action
is given, then the normal terminating semi-colon is not required.
is given, then the normal terminating semi-colon is not allowed.

```
term <int> INTEGER r:[-+]?[0-9]+ <%{ return std::stoi(lexeme); }%>
@@ -216,28 +206,28 @@ how they treat case.

prefix | behavior
-------|---------
`r:` | Default case behavior (currently case sensitive).
`r:` | Global "default" case behavior as potentially set using `option lexer.case` statement.
`rm:` | Match case - i.e. case sensitive.
`rf:` | Fold case - i.e. case insensitive.

##### @lexeme special type

The special type `@lexeme` can be used to give a short cut for the common
The special type `@lexeme` can be used as a short cut for the common
pattern of returning the parsed text as the semantic value.

When a terminal is given the type `@lexeme`, this is transformed internally
into `std::string`. Additionally, the action set to return the lexeme. If the
into `std::string`. Additionally, the action is set to return the lexeme. If the
terminal is given an action, this is an error.

```yalr
// This
term <@lexeme> IDENT r:_*[a-zA-Z]+ ;
// becomes this:
// acts like:
term <std::string> IDENT r:_*[a-zA-Z]+ <%{ return std::move(lexeme); }%>
// THIS is an ERROR
term <@lexeme IDENT r:_*[a-zA-Z]+ <%{ /* blah, blah */ }%>
term <@lexeme> IDENT r:_*[a-zA-Z]+ <%{ /* blah, blah */ }%>
```

##### Terminal Precedence and Associativity
@@ -258,7 +248,7 @@ term Mult '*' @assoc=left ;
The `left` or `right` keyword must come directly after the flag. There can be
no spaces between the equal sign and the value.

Precedence is assgined to the terminal using the `@prec=` flag. It can be
Precedence is assigned to the terminal using the `@prec=` flag. It can be
assigned as a positive integer value, or as the name or pattern of another
terminal. The referenced terminal must have a precedence assigned.

@@ -296,7 +286,7 @@ rule Foo { => WS ; }

### Associativity statement

Terms can be given an assoviativity setting using the `associativity`
Terms can be given an associativity setting using the `associativity`
statement. This statement will also create single-quote style terminals "inline"

```yalr
@@ -364,7 +354,7 @@ value. The semantic values of the items in the production are available to the
actions in variables of the form `_v{n}` where `{n}` is the position of the
item from the left numbered from 1. An item may also be given an alias. This
alias will be used to create a reference variable that points to the
corresponding semantic value variable. If an represents a rule or terminal
corresponding semantic value variable. If an item represents a rule or terminal
without a type, the corresponding semantic variable will not be defined.
Giving an alias to such an item will result in an error.

@@ -466,21 +456,21 @@ rule E {

and the input `1 + 2 * 3`.

Would like it to parse it as `1 + ( 2 * 3)` - that is - use the second
We would like `yalr` to parse this as `1 + ( 2 * 3)` - that is - use the second
production first and then use the first production to create the parse tree :
```
E(+ E(1) E(* E(2) E(3)))
```

The important point is after it has seen (and shifted) '1' '+' '2' and is
deciding what to do with the `*`. it has a choice, it can shift it and delay
The critical point in the parse is after it has seen (and shifted) '1' '+' '2' and is
deciding what to do with the `*`. The system has a choice: it can shift the `*` and delay
reducing until later (this is what we want it to do), or it can go ahead and
reduce by production 1.
reduce by production 1.

When there is a shift/reduce conflict like this thegenerator will compare the
precedence of the production (1) and the terminal(`*`). If the production is
greater, then the reduce will be done. If the terminal is higher precedence,
then the shift will done.
When there is a shift/reduce conflict like this, the generator will compare the
precedence of the production (1) and the terminal (`*`). If the precedence of
the production is greater, then the reduce will be done. If the terminal has higher precedence,
then the shift will be done.

If the two have equal precedence, the associativity of the terminal will be
consulted. If it is 'left' then reduce will be done. If it is 'right', then the
@@ -492,12 +482,41 @@ It is also possible to have two rules come in conflict (reduce/reduce). The
same rules apply.
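
The resolution rules above can be sketched in C++ as follows. This is only an illustration, not code taken from the yalr sources: the names `Assoc`, `Resolution`, and `resolve_shift_reduce` are hypothetical. The sketch also makes two assumptions: that a 'right' associative terminal leads to a shift (the usual LR convention), and that a tie with no associativity set is left unresolved.

```cpp
#include <optional>

// Hypothetical types for illustration; yalr's internals may differ.
enum class Assoc { Left, Right };
enum class Resolution { Shift, Reduce, Unresolved };

// Decide a shift/reduce conflict between a production and a terminal.
Resolution resolve_shift_reduce(int production_prec,
                                int terminal_prec,
                                std::optional<Assoc> terminal_assoc) {
    // The higher precedence wins outright.
    if (production_prec > terminal_prec) { return Resolution::Reduce; }
    if (terminal_prec > production_prec) { return Resolution::Shift; }

    // Equal precedence: consult the associativity of the terminal.
    if (!terminal_assoc) {
        // Assumption: without associativity the conflict stays unresolved.
        return Resolution::Unresolved;
    }
    // 'left' reduces; 'right' is assumed to shift.
    return *terminal_assoc == Assoc::Left ? Resolution::Reduce
                                          : Resolution::Shift;
}
```

For the `1 + 2 * 3` example, a call such as `resolve_shift_reduce(1, 200, std::nullopt)` models production 1 at precedence 1 meeting `*` at precedence 200, and yields a shift.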

So, to make our example act as we want, we need to make `*` have a higher
precedence than production 1. By default it will have the precedence of the `+`
terminal.
precedence than production 1.

There are several ways to do this. We could directly assign precedence to the rule
and the terminal.
```
term P '+' ;
term M '*' @prec=200 ;
rule E {
=> E '+' E @prec=1; // production 1
=> E '*' E ; // production 2
=> number ; // production 3
}
```

By default production 1 will have the precedence of the `+` terminal.
So, we could also set the precedence of '+'.

```
term P '+' @prec=1 ;
term M '*' @prec=200 ;
rule E {
=> E '+' E ; // production 1
=> E '*' E ; // production 2
=> number ; // production 3
}
```

We could also set the associativity in order to invoke the second part of the
conflict resolution rules.

```
term P '+' @prec=1 ;
term M '*' @prec=200 ;
associativity right '+'
associativity left '*'
rule E {
=> E '+' E ; // production 1
@@ -581,14 +600,14 @@ Each state will have a block of descriptive information such as this sample
--------- State 1
Items:
[ 3] statement => PRINT * expression
[ 5] expression => * expression '+' expression
[ 6] expression => * expression '-' expression
[ 7] expression => * expression '*' expression
[ 8] expression => * expression '/' expression
[ 9] expression => * NUMBER
[ 10] expression => * VARIABLE
[ 11] expression => * '(' expression ')'
[ 3] statement => PRINT |*| expression
[ 5] expression => |*| expression '+' expression
[ 6] expression => |*| expression '-' expression
[ 7] expression => |*| expression '*' expression
[ 8] expression => |*| expression '/' expression
[ 9] expression => |*| NUMBER
[ 10] expression => |*| VARIABLE
[ 11] expression => |*| '(' expression ')'
Actions:
VARIABLE => shift and move to state 8
@@ -602,15 +621,15 @@ Gotos:
#### Items

This section lists the current partial parses this state represents. The
unquoted star is the pointer to where the parse currently is in this state.
`|*|` symbol is the pointer to where the parse currently is in this state.
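
As a rough illustration of how such an item line is put together, the sketch below renders a production with the `|*|` marker placed before the symbol at a given position. The function `format_item` and its parameters are hypothetical and are not part of yalr.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: render an item, placing the |*| marker
// before the symbol at index `position` (or at the end of the
// production if `position` equals the number of symbols).
std::string format_item(const std::string& rule,
                        const std::vector<std::string>& symbols,
                        std::size_t position) {
    std::string out = rule + " =>";
    for (std::size_t i = 0; i <= symbols.size(); ++i) {
        if (i == position) { out += " |*|"; }
        if (i < symbols.size()) { out += " " + symbols[i]; }
    }
    return out;
}
```

Under these assumptions, `format_item("statement", {"PRINT", "expression"}, 1)` produces `statement => PRINT |*| expression`, matching the first item in the sample state.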

#### Actions

What to do for each possible token that could be received next. Any tokens not
listed are considered errors and the parse will terminate.

Possible actions are:
- shift - add the token and its value to the stack and move the the designated
- shift - Add the token and its value to the stack and move to the designated
new state.
- reduce - For the listed production, pull the correct number of items off the
stack and run the action code associated with the production. Shift the token
11 changes: 11 additions & 0 deletions RELEASE_NOTES.md
@@ -1,3 +1,14 @@
## Release v0.2.1

### Functional Changes

- bug #14 - make sure to fail if a void terminal is given an alias.

### Non-functional Changes

- doc clean up
- testing improvements

## Release v0.2.0

### Functional Changes
