[WIP] experimental Spirit X3 hacks #3976
I'll try to explain how this works on a simple grammar. Let's parse a subset of JSON that only knows booleans, numbers, and arrays.

Rules that are not involved in recursive cycles, and don't need to be used from other sources, are defined with `MAPNIK_SPIRIT_LOCAL_RULE`:

// grammar_def.hpp
MAPNIK_SPIRIT_LOCAL_RULE(r_boolean, bool)
= "false" >> x3::attr(false)
| "true" >> x3::attr(true)
;

When there is a recursive cycle, some rule(s) must be introduced first with `MAPNIK_SPIRIT_RULE`, then defined with `MAPNIK_SPIRIT_RULE_DEF`:

// grammar_def.hpp
MAPNIK_SPIRIT_RULE(r_value, my_value_type);
MAPNIK_SPIRIT_LOCAL_RULE(r_array, my_array_type)
= '[' > -(r_value % ',') > ']'
;
MAPNIK_SPIRIT_RULE_DEF(r_value)
= x3::double_
| r_boolean
| r_array
;

When a rule is to be used from other grammars/sources, it goes like this:

// grammar.hpp
MAPNIK_SPIRIT_EXTERN_RULE(r_start_rule, boost::optional<my_value_type>);
// grammar_def.hpp
MAPNIK_SPIRIT_EXTERN_RULE_DEF(r_start_rule)
= r_value
| "null"
;
// grammar.cpp
MAPNIK_SPIRIT_INSTANTIATE(r_start_rule, iterator_type, context_type);

And finally, all rules must have their fancy names (those that appear in [...]):

// grammar.cpp
MAPNIK_SPIRIT_RULE_NAME(r_boolean) = "Simpleson boolean";
MAPNIK_SPIRIT_RULE_NAME(r_value) = "Simpleson value";
MAPNIK_SPIRIT_RULE_NAME(r_array) = "Simpleson array";
MAPNIK_SPIRIT_RULE_NAME(r_start_rule) = "Simpleson message";
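For reference, the language this toy grammar accepts can be sketched as a hand-written recursive-descent recognizer. This is a swapped-in technique, not how X3 or the `MAPNIK_SPIRIT_*` macros work; all function names are hypothetical. It assumes no whitespace skipping and that the whole input must be consumed, approximates `x3::double_` with `std::strtod` (which is somewhat more permissive), and returns `false` where the `>` expectation operator would throw. Note how `parse_value` has to be declared before `parse_array` because of the cycle, which is the role `MAPNIK_SPIRIT_RULE` plays above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Consumes `lit` at position i if present; leaves i untouched otherwise.
bool match(const std::string& s, std::size_t& i, const std::string& lit)
{
    if (s.compare(i, lit.size(), lit) != 0) return false;
    i += lit.size();
    return true;
}

// Introduced first because of the cycle value -> array -> value.
bool parse_value(const std::string& s, std::size_t& i);

// "false" | "true"
bool parse_boolean(const std::string& s, std::size_t& i)
{
    return match(s, i, "false") || match(s, i, "true");
}

// Rough stand-in for x3::double_ (assumption: strtod is close enough here).
bool parse_number(const std::string& s, std::size_t& i)
{
    // strtod skips leading whitespace; this sketch has no skipper, so refuse it.
    if (i >= s.size() || std::isspace(static_cast<unsigned char>(s[i]))) return false;
    const char* begin = s.c_str() + i;
    char* end = nullptr;
    std::strtod(begin, &end);
    if (end == begin) return false;
    i += static_cast<std::size_t>(end - begin);
    return true;
}

// '[' > -(value % ',') > ']'
bool parse_array(const std::string& s, std::size_t& i)
{
    std::size_t j = i;  // work on a copy so a failed match leaves i untouched
    if (!match(s, j, "[")) return false;
    if (parse_value(s, j))
    {
        while (match(s, j, ","))
        {
            if (!parse_value(s, j)) return false;
        }
    }
    if (!match(s, j, "]")) return false;
    i = j;
    return true;
}

// double | boolean | array
bool parse_value(const std::string& s, std::size_t& i)
{
    return parse_number(s, i) || parse_boolean(s, i) || parse_array(s, i);
}

// value | "null", requiring that the whole input is consumed.
bool parse_message(const std::string& s)
{
    std::size_t i = 0;
    return (parse_value(s, i) || match(s, i, "null")) && i == s.size();
}
```

With this sketch, `null` is accepted only as a whole message, never as an array element, which mirrors where `"null"` appears in the rules above.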
Codecov Report
@@            Coverage Diff             @@
##           master    #3976      +/-   ##
==========================================
+ Coverage   71.44%   71.47%   +0.02%
==========================================
  Files         439      441       +2
  Lines       22885    22837      -48
==========================================
- Hits        16351    16322      -29
+ Misses       6534     6515      -19
This is an experiment I started because I hate how Spirit X3 separates rules from parsing expressions with the `_def` suffix. Jumping between where a rule is used in the grammar and where it's defined is awkward because they're different identifiers. Well, I could make vim do that with a keystroke, but I chose not to write vimscript for this.

C++ allows a name to refer to a type and a variable (or function) at the same time. They don't conflict: the variable/function takes precedence wherever it's syntactically allowed, but you can always get to the type with an explicit class-key (`struct`, `enum`, ...). So I use the same name for the `Tag` type used to differentiate `x3::rule` specializations, and for the variable used in parsing expressions.

Rule parsing expressions are stored in a variable template specialized on the `Tag` type. After testing a few variations of this approach, I found it's already been explored (djowel/spirit_x3#17).

So there are no suffixes; everything related to a rule is designated by a single name.
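
A minimal, Boost-free sketch of the two language features described above: one identifier naming both a tag type and a variable, and a variable template specialized on the tag to hold the rule's definition. `rule`, `definition`, `parser_fn` and `parse_boolean` are hypothetical stand-ins, not Spirit X3 or Mapnik names, and the real macros in this PR may expand quite differently.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a parsing expression: just a function pointer.
using parser_fn = bool (*)(const std::string&);

// Variable template holding each rule's definition, keyed by tag type.
// The primary template is a placeholder; rules provide explicit specializations.
template <typename Tag>
constexpr parser_fn definition = nullptr;

// Hypothetical stand-in for x3::rule<Tag, Attribute>; NOT the real Spirit type.
template <typename Tag>
struct rule
{
    // definition<Tag> is only instantiated when parse() is first used, so the
    // specialization may appear after the rule variable has been declared.
    bool parse(const std::string& input) const { return definition<Tag>(input); }
};

// One name, two entities: `struct r_boolean` is the tag type (declared right
// here by the elaborated type specifier), `r_boolean` is the variable.
constexpr rule<struct r_boolean> r_boolean{};

inline bool parse_boolean(const std::string& s)
{
    return s == "true" || s == "false";
}

// The definition is attached to the same name through the tag type,
// so no `_def` suffix is needed.
template <>
constexpr parser_fn definition<struct r_boolean> = &parse_boolean;

// Plain `r_boolean` names the variable; the class-key reaches the type.
static_assert(std::is_same<decltype(r_boolean), const rule<struct r_boolean>>::value,
              "variable and tag type share one identifier");
```

Calling `r_boolean.parse("true")` goes through `definition<struct r_boolean>`, so the use site and the definition are found by searching for the same identifier.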
The `MyRule` variable could've been simply an `x3::rule` instance, but that would suffer from initialization order issues because the `x3::rule` constructor is not `constexpr`. So I instead made the `struct MyRule` type usable in parsing expressions (it converts to an `x3::rule`).

So far I've only converted the TopoJSON grammar to this style (despite my opinion that Spirit is not the right tool to parse JSON-based protocols and that this grammar is flawed, I had my hands on it when I got the idea :)). With gcc-6, `topojson_grammar_x3.o` size dropped by ~30k (~10%), which accounts for ~1% of `libmapnik-json.a`. I'm curious whether other grammars will also shrink. I kinda expected the opposite, as I made `x3::rule` type signatures longer. Perhaps the reduction is due to `BOOST_SPIRIT_DEFINE` putting parsing expressions in function-local static variables, whose names are really long, because they include both the function type and the variable type.
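
The conversion approach mentioned above (a tag object that converts to a rule instead of storing one in a global) might look roughly like the following sketch. `rule_like` and `describe` are hypothetical stand-ins rather than Spirit X3 types, and this is only one possible reading of the description, not the PR's actual code.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for x3::rule. Its constructor is not constexpr, so a
// namespace-scope instance would be dynamically initialized, and its order
// relative to statics in other translation units would be unspecified.
class rule_like
{
public:
    explicit rule_like(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// The tag type doubles as the object used in expressions. It is an empty
// aggregate, so a constexpr instance is constant-initialized and is safe to
// use from any other static initializer.
struct MyRule
{
    // Builds the rule on demand instead of keeping one in a global.
    operator rule_like() const { return rule_like{"MyRule"}; }
};
constexpr struct MyRule MyRule{};

static_assert(std::is_empty<struct MyRule>::value, "tag object carries no state");

// Stand-in for a parser combinator that expects a rule operand.
inline std::string describe(const rule_like& r)
{
    return "rule:" + r.name();
}
```

Passing `MyRule` to `describe` triggers the implicit conversion, so the only namespace-scope object involved is the stateless, constant-initialized tag.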