New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify the parser #1596
base: master
Are you sure you want to change the base?
Simplify the parser #1596
Conversation
Here is an alternative formatting: st_expr
: expr { $$ = $1; p.res = $$; }
;
expr
: expr '+' expr { $$ = add($1, $3); }
| expr '-' expr { $$ = sub($1, $3); }
| expr '*' expr { $$ = mul($1, $3); }
| expr '/' expr { $$ = div($1, $3); }
| expr POW expr { $$ = pow($1, $3); }
| expr '<' expr { $$ = Lt($1, $3); }
| expr '>' expr { $$ = Gt($1, $3); }
| expr LE expr { $$ = Le($1, $3); }
| expr GE expr { $$ = Ge($1, $3); }
| expr EQ expr { $$ = Eq($1, $3); }
| expr '|' expr {
set_boolean s;
s.insert(rcp_static_cast<const Boolean>($1));
s.insert(rcp_static_cast<const Boolean>($3));
$$ = logical_or(s); }
| expr '&' expr {
set_boolean s;
s.insert(rcp_static_cast<const Boolean>($1));
s.insert(rcp_static_cast<const Boolean>($3));
$$ = logical_and(s); }
| expr '^' expr {
vec_boolean s;
s.push_back(rcp_static_cast<const Boolean>($1));
s.push_back(rcp_static_cast<const Boolean>($3));
$$ = logical_xor(s); }
| '(' expr ')' { $$ = $2; }
| '-' expr %prec UMINUS { $$ = neg($2); }
| '~' expr %prec NOT {
$$ = logical_not(rcp_static_cast<const Boolean>($2)); }
| IDENTIFIER { $$ = p.parse_identifier($1); }
| NUMERIC { $$ = p.parse_numeric($1); }
| IDENTIFIER '(' expr_list ')' { $$ = p.functionify($1, $3); }
| IMPLICIT_MUL {
auto tup = p.parse_implicit_mul($1);
$$ = mul(std::get<0>(tup), std::get<1>(tup)); }
| IMPLICIT_MUL POW expr {
auto tup = p.parse_implicit_mul($1);
$$ = mul(std::get<0>(tup), pow(std::get<1>(tup), $3)); }
;
expr_list
: expr_list ',' expr { $$ = $1; $$.push_back($3); }
| expr { $$ = vec_basic(1, $1); }
; @isuruf compared to the one in the PR, which one do you think looks better? I can't quite decide where to put the |
Here is another alternative, I probably like this one the most so far: st_expr
: expr { $$ = $1; p.res = $$; }
;
expr
: expr '+' expr { $$ = add($1, $3); }
| expr '-' expr { $$ = sub($1, $3); }
| expr '*' expr { $$ = mul($1, $3); }
| expr '/' expr { $$ = div($1, $3); }
| expr POW expr { $$ = pow($1, $3); }
| expr '<' expr { $$ = Lt($1, $3); }
| expr '>' expr { $$ = Gt($1, $3); }
| expr LE expr { $$ = Le($1, $3); }
| expr GE expr { $$ = Ge($1, $3); }
| expr EQ expr { $$ = Eq($1, $3); }
| expr '|' expr {
set_boolean s;
s.insert(rcp_static_cast<const Boolean>($1));
s.insert(rcp_static_cast<const Boolean>($3));
$$ = logical_or(s); }
| expr '&' expr {
set_boolean s;
s.insert(rcp_static_cast<const Boolean>($1));
s.insert(rcp_static_cast<const Boolean>($3));
$$ = logical_and(s); }
| expr '^' expr {
vec_boolean s;
s.push_back(rcp_static_cast<const Boolean>($1));
s.push_back(rcp_static_cast<const Boolean>($3));
$$ = logical_xor(s); }
| '(' expr ')' { $$ = $2; }
| '-' expr %prec UMINUS { $$ = neg($2); }
| '~' expr %prec NOT {
$$ = logical_not(rcp_static_cast<const Boolean>($2)); }
| IDENTIFIER { $$ = p.parse_identifier($1); }
| NUMERIC { $$ = p.parse_numeric($1); }
| IDENTIFIER '(' expr_list ')' { $$ = p.functionify($1, $3); }
| IMPLICIT_MUL {
auto tup = p.parse_implicit_mul($1);
$$ = mul(std::get<0>(tup), std::get<1>(tup)); }
| IMPLICIT_MUL POW expr {
auto tup = p.parse_implicit_mul($1);
$$ = mul(std::get<0>(tup), pow(std::get<1>(tup), $3)); }
;
expr_list
: expr_list ',' expr { $$ = $1; $$.push_back($3); }
| expr { $$ = vec_basic(1, $1); }
; |
I like the last one the most, so I pushed it in. The This approach also encourages to keep the semantic action short: ideally code like | expr '|' expr {
set_boolean s;
s.insert(rcp_static_cast<const Boolean>($1));
s.insert(rcp_static_cast<const Boolean>($3));
$$ = logical_or(s); } should probably be replaced with something like: | expr '|' expr { $$ = make_or($1, $3); } |
} else { | ||
sym = parse_identifier(lexpr); | ||
} | ||
SYMENGINE_ASSERT(lexpr.length() > 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather this throw an exception even in Release mode instead of silently ignoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give an example when this can happen? I couldn't figure out any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if we cannot figure out when this could happen for some user input, then the assert statement is appropriate. Asserts are for ensuring that our assumptions in the code are consistent/valid, but those do not run in release mode, because we assume our code is correct in release mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, there was some ambiguity in the grammar where 2e10 was parsed as an implicit mul.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll see if I can trigger it. Too bad there wasn't a test for this. All parsing tests pass. We need to test this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite frankly, I don't like this implicit mul parsing, due to such ambiguities. Do you think people want implicit mul to be parsed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think people want implicit mul to be parsed?
Yes, julia people use this feature heavily in SymEngine.jl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I'm no fan of implicit mul.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are at it, we should also fix #1462. We can even have a few different parsers if needed.
I don't have time to finish this before a release, let's do this PR after the release. |
Several simplifications were done:
parse_implicit_mul()
cannot return an error, added an assert and simplified theparser.yy
parser.yy
: merged all the rules that could be merged; changed syntax that makes it more readable;rcp_static_cast
casts were removed as they are not needed