New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mustache parser #11
Mustache parser #11
Conversation
dcd2d54
to
d412638
Compare
We're now able to render nested items:
Will yield:
|
From the mustache man page:
ExampleI'll try to explain this in detail. There may be a bit of text, but it will hopefully avoid any miscommunication. Let's take this piece of data:
For a template like this:
The expected output would be:
Note that at [3], we're accessing cast.name and movie.movie-name (from a parent context). To accomplish this, the mustache parser tries to use 'hopen/ctx as a stack. As each section/block/loop is encountered, the item/data we're iterating on is is pushed into the stack a child context. When we need to look up a field, we don't just look up the field in the last child context, we look through all open contexts recorded on the stack, until we either locate the field somewhere along the chain, or come up with nil. Applied to this example... Then at [1], we do a ctx-lookup on
At [2], we do another ctx-lookup for
At [3], when we look up That was a bit long winded, but I hope it was clear. =) |
@Odie Thanks for the very long explanation. I did not know that Mustache was behaving like that, and I am totally surprised and scared 😨 to see it as it seems to me that it is a very poor way to reference data, causing bugs when more fields are added to maps which are in the most inner contexts. That's the opposite of the Clojure's culture which makes sure that there is zero misunderstanding by giving namespaces to variables. It makes me wonder if we should import this behavior in hopen or deviate from Mustache's spec and fix it. Technically, I am thinking about another way to implement Mustache's variable resolution without a stack, that would be to merge the context's maps together as you enter the inner blocks. You would then only need 1 lookup to resolve a variable using a normal |
src/hopen/renderer/xf.cljc
Outdated
(defn- ctx-lookup | ||
"Tries to look up a field name in every context starting from most nested context first." | ||
[env field-name] | ||
(some #(get % (keyword field-name)) (reverse (get-in env [:bindings 'hopen/ctx] [])))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function may only be used by the Mustache syntax. I would prefer it to live in one of the mustache syntax's namespace and not been added to the default-env of this namepace.
It means that you would need to define a default-env in the mustache syntax's namespace as well, which you could express as an update of the one which is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update: the code being refactored, I mean based on hopen.renderer.env
.
I think mustache “has” to have this sort of variable resolution because it’s syntax gives no way today which scope/ctx you want a variable from. In clojure (or any reasonable programming language), the bindings have unique names so it’s perfectly clear where you’re getting data from.
But... Mustache is “logic-less” and doesn’t even provide a way to define a binding. I’m pretty sure we don’t want to import or duplicate this kind of behavior elsewhere. In my long winded exmaple, the shape of my data can’t even have That said, I'm not sure if deviating from the "mustache" spec is useful. Because, then, it wouldn't be mustache, and you lose any benefits from maybe being able to use existing mustache templates or whatnot.
I don't have particular preferences here. If you want to keep ctx to a single map instead of a stack of maps, that's okay too. Merging maps is just as easy as pushing something onto a stack. |
I am proposing this deviation from the official spec: We could extend the syntax a bit to have a way to refer which context should be used to resolve a variable. The idea is a little like using Example:
I let you decide if you want to use it or not. |
That’s certainly possible, but then we wouldn’t be able to merge the ctx maps, which would remove the contex level info. Anyway, I’ll keep this in mind. Let’s revisit this when I get through with some of the other stuff. |
We would not need to merge anything anymore. The data-template would become something like: '[(b/for [mustache/symb_0 (hopen/ctx :movies)]
[(b/for [mustache/symb_1 (mustache/symb_0 :cast)]
[(mustache/symb_1 :name) " was in " (mustache/symb_0 :movie-name)])])] |
Hmm... what’s the goal of changing what the parser outputs? If we aim for this kind of parser output, the parser will have to become “smarter” (aka more complicated) to keep track of embed/nested levels and assign them a unique symbol on bind and on lookup. But, we still have distinct, multiple open contexts. They are just now have individual bindings instead of being in a stack, making recursive lookups harder. Relative context fetches isn’t easier either. With a stack, it’s possible to grab (nth stack (- (count stack) n)) to access the nth parent. With named bindings, again, the exact nested level needs to be known. The kind of transformation we’re performing on the tree is really straightforward, and free of requiring any information regarding where the node is in the overall program. It should be more than capable of implementing what the spec asks for. |
I think that it is not as hard as you imagine. The only thing the parser needs to track is the current level of nesting of the context, use it as a base to count backward when the parser meet the
|
FYI, I found a discussion on this topic janl/mustache.js#399 |
The thing I take issue with isn’t about “how hard” it might be. Certainly, we can count those nested levels, and we can generate those symbols, and we can calculate the nested level difference to fetch the right context. But what is already in place is perfectly able to address parent contexts unambiguously if that’s what we want. It can be made to deal with recursive lookups and it can also be made to require explicit references to parent context. A stack is perfect for this. So, to me, adding additional housekeeping to generate this specific shape of hopen data, introduces additional complexity but does not give us any additional features in return. Thus, my question still is, why are we after this specific output shape? |
Your implementation is not ambiguous. My concern is the Mustache's specs itself, I think that the spec is ambiguous and that's a big issue to me. I regret having referred to Mustache when I created the issue, have I had done more research I would have chosen the spec of handlebar instead.
It was to deviate from the spec of Mustache and fix its ambiguous behavior. |
Okay. It's fine if you want to extend or deviate from the mustache spec. What I've been trying to say is, using a stack to track contexts will work perfectly well for that. Given a stack of open contexts, we can unambiguously locate parent contexts:
We don't perform any additional housekeeping (nested level count, var symbol generation, etc). With a stack, we can already unambiguously access the exact context the user might ask for. Very little needs to change in the parser to support this. |
- Partials and Lambdas have been left unimplemented - Moves mustache specific code out of xf.cljc - Adds to test to verify feature functionality
Tag contents are now parsed and tagged inside `assoc-tag-type` only. It also takes care of dealing with unexpected whitespace properly. The rest of the code no longer assumes the exact format of the tag string, but relies on the cleaned info to be on the tag itself.
The following was done to get this to happen: - Move backtick into cljc - Reimplement `re-seq-pos` for cljs - Use `ex-info` when throwing errors - Moves `render-` into test file - Adds parse handling of escape tags - Adds humane-test-output for readable test output
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Globally, I found the source code very hard to read. The functions are too big also, it makes them harder to reason with and test.
I recommend spending a lot of time refactoring it.
true | ||
(reduced false)) | ||
args)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the and
and or
is a good idea. However, doing it in a unrelated branch is not a good idea.
If you need something to be added in the renderer, I would prefer that you let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the wrote wording. They are related, but adding new functions to the default environment is not the subject of your PR.
PR are either accepted or rejected (and maybe updated and reviewed again). There is no in-between like taking some of the parts of the commits. For this reason, I advise opening a separate PR for modifying stuffs which are not the main purpose of the main PR. It would be easier to discuss, possibly amend and been merged.
The idea is similar to the single responsibility objects or function: it is better not to have more than 1 reason for the object / function to be obsolete and be refactored - for the PR it means to only do 1 subject at a time. The more subjects in a PR, the higher the probability of having 1 subject on which people would disagree, preventing the merge for any of the code addressing any of the subjects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this parser, having and
and or
is a required part of making it actually work so it is well within the "subject" of this branch.
This parser is the first concrete parser to actually make use of and fully exercise the hopen renderer. It's only natural to find that there are many missing pieces in the default environment that are useful or required to support the parsers.
Following what you're describing, it would take time and effort a to create a new PR, fully explain why it's needed, wait for code review, and approval, just so it can eventually be merged back into this branch so the work can continue. Why would I do that? How much additional time do you want me to put into stuff like this?
I'm the only one working on this particular parser, in my free-time, for the (hopefully) fun of being able to work with some people in clojure locally. I don't understand your insistence that I jump through multiple hoops to satisfy some kind of strange PR structure that isn't explained anywhere and is outside of common github practices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this parser, having
and
andor
is a required part of making it actually work so it is well within the "subject" of this branch.
I understand that.
This parser is the first concrete parser to actually make use of and fully exercise the hopen renderer. It's only natural to find that there are many missing pieces in the default environment that are useful or required to support the parsers.
I agree that.
Following what you're describing, it would take time and effort a to create a new PR, fully explain why it's needed, wait for code review, and approval, just so it can eventually be merged back into this branch so the work can continue. Why would I do that? How much additional time do you want me to put into stuff like this?
That's why I was suggesting that you let me know, so I can add it faster.
When I add things in the renderer's default environment, I am considering all the parsers at the same time, and need to carefully choose what should be in the defaults and what should not. That's why I prefer it to be done independently from the parsers, in a separate branch.
I'm the only one working on this particular parser, in my free-time, for the (hopefully) fun of being able to work with some people in clojure locally. I don't understand your insistence that I jump through multiple hoops to satisfy some kind of strange PR structure that isn't explained anywhere and is outside of common github practices.
If it not fun anymore, I suggest you to stop working on it. I know that my standard for quality is way higher than the average and it can be very frustrating for you to have some code that starts to work but that don't get merged in the project yet.
I would have been more willing to merge this PR if it was contained in its own namespace and not affecting the rest of the code. At the moment, this PR:
- Added the source code of backtick in its root.
- Added a test framework that modifies the test output's format.
- Added some code to the renderer.
- Added some variable to the default environment.
A PR being merged or not merge (with nothing in between), I don't want to merge it as it is now.
.. and I am not sure if addressing all those listed issues would solve the problem. There are other problems with the source code, I think it should be simpler.
In the Gitter chat room, I was proposing you to use the "Segment" type from the instaparse library which let you have substrings which keep references to the original string with an index and size information. Using it would make a big chunk of your source code simpler.
I did not fully understood how you use the line numbers in your parser, but if it is just for error reporting, you may as well calculate them once you need to output an error, it may result in a more decoupled source code.
My last feedback would be more an advice, it's to give instaparse a try. It has the potential to make a big part of your code simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s been a strange short journey for sure. The 4 points that you cited as the reason for not accepting the PR are all quite superficial and extremely minor problems.
It really is no big deal if you don’t accept the PR. I wouldn’t mind in the least even if we eventually decide to throw out this implementation because things could be done much more easily and cleanly in instaparse. Its the learning, the discussion, and the journey that makes it fun.
What I do mind, is that although the original invitation was to work on something “together” so the community has something to discuss over meetups, you’ve now assigned yourself the boss. Your expect your opinions to be unquestioningly followed for some reason.
Since you want to make this whole process intentionally difficult and this has definitely ceased being “fun”, I’m closing the PR.
@@ -26,4 +28,5 @@ | |||
:optimizations :none | |||
:pretty-print false}}]} | |||
:doo {:alias {:browsers [:chrome :firefox]} | |||
:paths {:karma "./node_modules/karma/bin/karma"}}) | |||
:paths {:karma "./node_modules/karma/bin/karma"}} | |||
:profiles {:dev {:dependencies [[pjstadig/humane-test-output "0.8.2"]]}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in your local ~/.lein/profiles.clj
instead of in the project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looked like I had to activate it in runner.cljs
like this. I don't think it would compile if I leave out the dependency.
This appears to be the way to add dependencies that should only be included for dev or test time. link
I think having a default test reporter for both clj and cljs would be useful to anyone who might want to participate in working on this project.
The main problem is that the default test reporter show something like this for a failing case:
It's quite difficult to figure out what's actually different here. I've been using eftest with clj so far (added via profile.clj). It prints the error this way.
That's much more readable! The last diff
section also makes it clear what is actually different in these two pieces of data. Kaocha actually produces even better looking output (and it's faster to run for some magical reason).
Both eftest
and Kaocha
seem to be for clj only though. For cljs, "humane-test-output" seems to be the only one that's easy to add. It's prints errors like this:
It's not colorized, but still makes the test output much easier to read and understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The readme file of contains information on how to avoid changing the source code.
https://github.com/pjstadig/humane-test-output#clojure
I am using IDEA and I don't need that library. What I need is the default output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already tried the instructions, but they do not work for cljs.
'escape-html util/escape-html | ||
|
||
'empty? empty? | ||
'not not}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark, I would prefer that you let me know that you need some functionalities instead of added them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I don't quite understand how you propose to carry on with development in this project.
In the course of developing the parser, I noticed that a function is needed, so I added it in this isolated branch. No one else is impacted by the change.
These are functions that many parser are likely to need, so it makes sense to add it to the renderer itself, making it available to all parsers.
The nature of a pull request is to propose and request code changes. So this is a valid channel for making requests known. An implementation is also already provided.
By adding the functions that I need, the moment they are needed, I could continue to carry on with development without having to wait for other people. If any changes were found to be objectionable, they could be rolled back or changed. The benefit here is that the rest of the work that depends on that change will have been fleshed out and working, instead of being stopped.
To me, all of the above sounds reasonable. What's the objection here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is nothing wrong about adding some code in our your branch to make it work.
The problem arises when you ask to merge it. If you want your code to be easy to be discussed, maybe amended and merged, a separated PR would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're already discussing the code here? If you want something amended, you can just leave a comment as usual?
This is just to get things started.
call user supplied functions (lambdas)partialsHopefully, we'll have something to look at before too long.
Closes: #9