Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple lx -l json printer #118

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

jameysharp
Copy link
Contributor

I don't know what I'm doing but with enough copy-paste from other parts of the source tree I made a thing!

Copy link
Owner

@katef katef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestions:

  • Add json output for libfsm, not just lx. Then re(1) can use it as well as lx, and lx gets the optimisations from the codegen IR
  • Escaping for special characters in json strings
  • Consistency for naming (we could rename libfsm's terms if you prefer)

#include <adt/set.h>

#include <fsm/fsm.h>
#include <fsm/pred.h>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And <fsm/walk.h> for fsm_all()

print_zone(FILE *f, const struct ast *ast, const struct ast_zone *z)
{
struct fsm_state *s, *st;
int ret;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable


fprintf(f, "{\n");
fprintf(f, " \"prefix\": { \"api\": \"%s\", \"tok\": \"%s\", \"lx\": \"%s\" },\n",
prefix.api, prefix.tok, prefix.lx);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feels a bit strange to output the prefixes here - these are to affect the generated code, not to be part of it. But I know you're intending to use this json to go and generate code, so it does make sense to include them here.

fprintf(f, " \"next_zone\": %u,\n", zindexof(ast, m->to));
}

fprintf(f, " \"accepts\": true%s\n", e != NULL ? "," : "");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terminology: libfsm probably should have its .isend field renamed to "accept" at some point. I'd like the naming to be consistent.

Likewise for your "initial_zone" field, where that's called either the starting zone (when talking about the graph), or the global zone (when talking about scope).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the same for "initial_state"

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And for "target", which libfsm currently calls "dst" or "to".

for (; e != NULL; e = set_next(&it)) {
struct set_iter jt;
for (st = set_first(e->sl, &jt); st != NULL; st = set_next(&jt)) {
fprintf(f, "%s\n { \"symbol\": %u, \"target\": %u }",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about these numeric symbols. Do you think a single-letter string would be more fitting?

assert(m != NULL);

if (m->token != NULL) {
fprintf(f, " \"token\": \"$%s\",\n", m->token->s);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These strings should be escaped, per src/print/json.c's json_escputc()

fprintf(f, " \"transitions\": [");
for (; e != NULL; e = set_next(&it)) {
struct set_iter jt;
for (st = set_first(e->sl, &jt); st != NULL; st = set_next(&jt)) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libfsm has an IR for code generation which provides a few optimisations for generated DFA. I'm not sure if you'd want to use it here. It provides different "strategies" for each state, where e.g. symbols are presented grouped as x..y ranges, or e.g. if all symbols transition to the same state. If you're going on to generate code from the json, I think it probably would make sense to make use of that.

To use that IR, I'd first add json output to libfsm (cut & pasted from the current C), which just walks through that IR and outputs what it's given. The IR was introduced here: #94

Then re -pl json xyz would give json for a single regexp, too.

@jameysharp
Copy link
Contributor Author

Thanks for reviewing! Your feedback makes sense. I'm working on something else right now so I'm not sure when I'd get back to this, but I probably will eventually... 😅 For the moment, though:

I didn't realize this, but I guess libfsm and re can already output JSON—and apparently in two styles, with or without IR optimizations, at that? You have a comment in libfsm/print/irjson.c saying "TODO: leaf callback for json output", which I assume is the main thing that's missing for lx to be able to use it as well. Is there more to it than that?

@katef
Copy link
Owner

katef commented Mar 18, 2019

So they do! Replace it with a better schema if you prefer.

The leaf callback is how lx renders things attached by its void *opaque - you can see that used for he generated C (to undertake zone changes) and for re's dot output (to print regexp indicies). It's a bit of an ugly arrangement, but it allows libfsm to print the graph, and callers to be responsible for rendering their own private concepts. I do think you'd want the same for json.

No worries about time! We're all busy with things. These APIs won't change, and I think this PR can wait as long as it needs.

@katef
Copy link
Owner

katef commented Jan 17, 2020

See #198 for new json for libfsm - this doesn't include the lx parts.

@katef katef changed the base branch from master to main June 14, 2020 00:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants