New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update grammar to accept user defined literal suffixes #14054
Update grammar to accept user defined literal suffixes #14054
Conversation
AFAIU, whitespaces are ignored in the lexer and then values like the one below (from a failing test) are parsed by solidity/test/libsolidity/syntaxTests/literalSuffixes/application/invalid_suffix_no_whitespace.sol Line 4 in ce63256
The same already happened before for contract C {
uint x = 1000gwei;
} |
I see we have a Also, please add a syntax test for each of the whitespace characters that is allowed there - mine only has the space.
Good find. We should also do the same for subdenominations: fix the grammar and add syntax tests. |
7ef6987
to
4af093a
Compare
I tried it but doesn't work. I also found a problem with the function suffix1(uint) pure suffix suffix returns (uint) {}
// ----
// ParserError 2878: (82-88): Suffix already specified. |
Hmm... Indeed looks like a problem. Actually, why But anyway, one way out would be to split Another one would be to allow only either |
ok, so here's another idea. Looks like our scanner actually properly parses solidity/liblangutil/Scanner.cpp Lines 1004 to 1009 in 2ca349c
We could simulate this by recognizing a number followed by literal as its own thing. Try to define Also, to make sure it works, please add test cases for hex and fractional numbers (I suspect you'll need separate rules to handle them): |
Also, these grammar changes (i.e. split into free/contract function, number+identifier, renaming of |
test/libsolidity/syntaxTests/literalSuffixes/application/invalid_suffix_denomination.sol
Show resolved
Hide resolved
We do have tests for that, but they generate solidity/scripts/test_antlr_grammar.sh Lines 83 to 88 in 2ca349c
Ok, agreed. |
Interesting. Well, then maybe working around it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, looks good now apart from some minor tweaks.
Please now extract the generic part into a PR on develop as I suggested in an earlier comment (note that this will require changing some tests to use denominations rather than suffixes).
ce63256
to
4adfb76
Compare
By the way, please remember to mark PRs as reviewable once you're done implementing them. |
d3b8067
to
d133142
Compare
11e308c
to
c4aee73
Compare
I rebased the suffix PR on #14066, so if you rebase this one on the suffix PR, we can now continue review here. |
d133142
to
f54ddfc
Compare
docs/grammar/SolidityParser.g4
Outdated
( | ||
{!$mutabilitySet}? stateMutability {$mutabilitySet = true;} | ||
| {!$suffixSpecifierSet}? Suffix {$suffixSpecifierSet = true;} | ||
)* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With just mutability and suffix we could still do it without variables and keep it simple:
( | |
{!$mutabilitySet}? stateMutability {$mutabilitySet = true;} | |
| {!$suffixSpecifierSet}? Suffix {$suffixSpecifierSet = true;} | |
)* | |
stateMutability Suffix? | |
| Suffix stateMutability? |
docs/grammar/SolidityParser.g4
Outdated
|
||
literal: stringLiteral | numberLiteral | booleanLiteral | hexStringLiteral | unicodeStringLiteral; | ||
|
||
literalWithSubDenomination: numberLiteral SubDenomination; | ||
|
||
suffixedLiteral: literal identifier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, just realized that this is not entirely correct. It does not have to be a single literal. Can be member access as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I follow here. Should we be doing something like this?
suffixedLiteral: literal identifier; | |
memberAcess: identifier (Period identifier)+; | |
suffixedLiteral: (literal | memberAcess) identifier; |
What would be a member access that is equivalent to a literal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that something like 123 M.suffix
is allowed because you can do import "A.sol" as M
and A.sol
can have suffix
defined in it. identifier
alone does not allow it.
Not sure if we want to define memberAccess
explicitly though. I see that inside expression
that's defined in place like this:
expression Period (identifier | Address) # MemberAccess
So I guess the solution is either to repeat that for suffixedLiteral
or indeed define it as memberAccess
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok, understood it now. I was thinking about the literal
, not the identifier
.
We have a test with suffix function imported from other file:
solidity/test/libsolidity/syntaxTests/literalSuffixes/usableAsSuffix/imported_function_as_suffix.sol
Lines 6 to 14 in 65188ca
import "A.sol" as A; | |
import "B.sol" as B; | |
contract C { | |
function f() pure public { | |
1 s; | |
2 z; | |
3 A.s; | |
4 B.B.B.A.s; |
I have removed the failing test in chk_antlr_grammar
before running it locally and then got no errors, so I think that the current grammar already covers that somehow, but not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the current grammar already covers that somehow, but not sure.
But how? It can't be. expression
does account for that explicitly so the literals should have to as well.
docs/grammar/SolidityLexer.g4
Outdated
Ufixed: 'ufixed' | ('ufixed' [1-9][0-9]+ 'x' [1-9][0-9]+); | ||
Unchecked: 'unchecked'; | ||
Unicode: 'unicode'; | ||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, so a keyword was missing here. This is a general change that should go straight to develop.
With a test case preferably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Created #14078 .
docs/grammar/SolidityParser.g4
Outdated
|
||
literal: stringLiteral | numberLiteral | booleanLiteral | hexStringLiteral | unicodeStringLiteral; | ||
|
||
literalWithSubDenomination: numberLiteral SubDenomination; | ||
|
||
suffixedLiteral: literal identifier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the current grammar already covers that somehow, but not sure.
But how? It can't be. expression
does account for that explicitly so the literals should have to as well.
Yeah, I am not sure why it works. I will investigate it. |
So, did you figure it out? |
Not yet. I will prioritize it though. |
I checked the solidity/test/libsolidity/syntaxTests/literalSuffixes/usableAsSuffix/imported_function_as_suffix.sol Lines 1 to 5 in 65188ca
Still, seems like the parser accepts member access as literal suffixes... |
You mean ANTLR or solc parser? Because the latter obviously does, but ANTLR should too. In any case, this explains why it did not fail. This means that the grammar is really incomplete, we just can't properly test it. It needs to be updated to cover this case. |
I meant
Are we going to do this in the context of the literal suffix PR or is it a task we leave for later? |
In this PR. Without it the grammar is incomplete.
This sounds weird. Is it only for suffixes or does it let you use dotted paths in other places that grammar does not officially include? |
I am trying to find if there's other cases. |
Oh, that's very helpful. I see what's happening there. Basically, ANTLR sees it as if it was |
c4aee73
to
559d62f
Compare
I pushed a new version of the base PR. Please rebase. |
559d62f
to
f229c36
Compare
b21d2f9
to
1dc56e6
Compare
1dc56e6
to
a1059cf
Compare
@@ -129,7 +129,8 @@ done < <( | |||
grep -v -E 'license/license_hidden_unicode.sol' | | |||
grep -v -E 'license/license_unicode.sol' | | |||
# Skipping tests with 'something.address' as 'address' as the grammar fails on those | |||
grep -v -E 'inlineAssembly/external_function_pointer_address.*.sol' | |||
grep -v -E 'inlineAssembly/external_function_pointer_address.*.sol' | | |||
grep -v -E 'literalSuffixes/application/invalid_address_member_on_suffix.sol' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because I noticed we already had a workaround for the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, looks good now!
Task of #12656.