Ignore from coverage more non-grammar statements in generated parsers #633

Open
brettz9 opened this issue Dec 9, 2019 · 11 comments · May be fixed by #632


brettz9 commented Dec 9, 2019

Issue type

  • Bug Report: no
  • Feature Request: yes
  • Question: no
  • Not an issue: no

Prerequisites

  • Can you reproduce the issue?: n/a
  • Did you search the repository issues?: yes
  • Did you check the forums?: yes
  • Did you perform a web search (google, yahoo, etc)?: yes

Description

I would like the generated parsers to have additional ignore statements to prevent noise when checking coverage against one's grammars.

Steps to Reproduce

  1. Use nyc against one's tests that use a pegjs parser

Example code:

n/a

Expected behavior:

I'd like to only see reporting on genuine missing code coverage, e.g., tests missing for token types which are represented in the grammar.

Actual behavior:

Many lines of code in the parser are flagged as lacking coverage.

Software

  • PEG.js: 0.10.0
  • Node.js: 10.16.3
  • NPM or Yarn: 6.13.2
  • Browser: n/a
  • OS: 10.15.1
  • Editor: Atom
@brettz9 brettz9 linked a pull request Dec 9, 2019 that will close this issue
@futagoza futagoza added this to the 0.12.0 milestone Dec 12, 2019
brettz9 added a commit to brettz9/jsdoctypeparser that referenced this issue Dec 17, 2019
- Testing: Use CLI to ignore unneeded source for improving coverage of pegjs file
@StoneCypher

Hi, I don't entirely understand this issue.

I also use peg.js with nyc. I simply set the parser as an exclude in .nycrc.
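For reference, excluding a generated parser from nyc is a one-liner; `exclude` is a standard nyc option, though the parser path below is hypothetical:

```json
{
  "exclude": ["src/generated/parser.js"]
}
```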

Are you trying to get actual coverage out of a generated parser? If so, why don't you just wrap calls to the parser, and get coverage of the wrappers?

@StoneCypher

Being honest, I don't think code coverage metrics of a generated parser make sense

Most of a parser is about "this isn't the grammar"

A parser is machine-generated

Just test the grammar. You don't need coverage of the parser. You need coverage of the grammar.

The easiest way to get there is with a stochastic tester like fast-check.

Not only will you never actually get there by modifying peg, but even if you did, the coverage wouldn't actually be meaningful: peg enters rules it's not using to see if it can use them, meaning rules you haven't actually covered would incorrectly show coverage because they were visited by unrelated parses.


brettz9 commented Feb 3, 2020

@StoneCypher : Thank you for your thoughts and info on this.

Regarding much of the parse code being about "this isn't the grammar", that is why in #632 I have ignore statements generated on those parts which aren't about the grammar.

While yes, it is coverage of the grammar that we want, with proper coverage exposure of the parser one should in theory be able to get full coverage of the grammar. Regarding peg entering rules it's not using, I believe it is generally enough for coverage to check the branches which test positive, and our PR in #632 is aimed at doing so. In performing such checks, that PR gave us an indication of coverage that was missing, helping ensure our project covered conditions it had not been covering, but without extra noise from paths the parser tried and discarded in a particular parse when the rule ultimately gets covered elsewhere.

I'm not interested in mere calls to the parser, but as you say, the grammar. Thank you for bringing stochastic checkers to my attention and fast-check specifically. While I can see how this fuzzing approach could provide some extra sanity checking, I don't see how it could replace the benefits of getting coverage out-of-the-box when a grammar had already been built. I still imagine it would require a significant duplication of work.

It would be an interesting notion if one could hook into the parser generation process so that, instead of supplying, say, a fast-check fc.string() type, the grammar generation process could also auto-produce components for tests, such as fc.tokenAlternative1ForRuleA() and fc.tokenAlternative2ForRuleA(), where each component would generate some sample text meeting the token criteria. But I'm not sure what use it would be to fuzz types which were already known to work (e.g., if a rule had [a-z]+, checking "a", "b", "foo", and "z" would seem to me of little use, unless it were for the auto-generation of tests to show what kinds of types pass).

Back to the topic of branches not being covered, to take a sample portion of one of our rules:

VariadicTypeExprOperand = UnionTypeExpr
                        / UnaryUnionTypeExpr
                        / RecordTypeExpr

...the generated portion looks like:

function peg$parseVariadicTypeExprOperand() {
  // ...

  s0 = peg$parseUnionTypeExpr();
  if (s0 === peg$FAILED) {
    s0 = peg$parseUnaryUnionTypeExpr();
    if (s0 === peg$FAILED) {
      s0 = peg$parseRecordTypeExpr();
      if (s0 === peg$FAILED) {
        // ...
...for many projects, it would not be of interest to check that a specific else path of if (s0 === peg$FAILED) { was taken but merely that each rule was covered in full.

However, depending on how the parsed results were used, we might want to optionally avoid ignoring the else branches here if a project wanted to ensure e.g., that VariadicTypeExprOperand was satisfied in one test by a UnaryUnionTypeExpr type, in another test by a RecordTypeExpr, and in yet another by a UnionTypeExpr. The PR in #632 does not provide this option (it just ignores those elses), but maybe the option should be added.
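To illustrate the shape of annotation being discussed, here is a minimal stand-alone sketch; `peg$FAILED` and the sub-rule parsers are hand-written stand-ins, not real PEG.js output, and the ignore comment shows the general idea rather than the exact text #632 emits:

```javascript
// Sentinel used by PEG.js-style parsers to signal a failed match.
const peg$FAILED = {};

// Hypothetical stand-ins for generated sub-rule parsers.
const parseUnion = (s) => (s === 'a|b' ? { type: 'union' } : peg$FAILED);
const parseRecord = (s) => (s === '{a}' ? { type: 'record' } : peg$FAILED);

function parseOperand(s) {
  let r = parseUnion(s);
  /* istanbul ignore else -- we only care that each rule is covered
     somewhere, not that every fallback path of the ordered choice ran. */
  if (r === peg$FAILED) {
    r = parseRecord(s);
  }
  return r;
}
```

With the `ignore else` hint, a test suite that exercises both rules (in any order) reports the choice as fully covered, instead of demanding a test in which the first alternative specifically fails.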

@StoneCypher

While yes, it is coverage of the grammar that we want, with proper coverage exposure of the parser, one should in theory be able to get full coverage of the grammar.

It's easy to get full coverage of the grammar.

The problem is that coverage of a grammar is completely meaningless.

Your response to my explanation seems to miss the point. I'll try again.

.

This is why this can't work

My largest grammar has around 800 things in it. I don't really know the formal way to say these things, so I kind of want to call them expressions, but I'm worried that's incorrect. It's around 3000 SLOC (5kLOC because I use a ton of whitespace.) It's right around the edge of my ability to cope right now.

The grammar I'm going to tell you about now is much smaller. (The larger one isn't public; this one is.) And I sort of apologize: this grammar is probably poorly written, and it'll probably offend someone who's actually good at parsers.

But also, it's 138 ... what do I call those, rules? 669 lines of code after you remove whitespace. Also it's the first non-trivial grammar I ever wrote. At the time I originally wrote it, this was brutally difficult for me.

The reason I know how to set up coverage for the parser is I already did it (by accident) on that grammar.

The reason I know how to exclude the parser from coverage is later I realized that the coverage, while correct, was dangerously misleading, and as a result, useless.

I wanted coverage in general for the standard reason: I don't trust my own code, and so I rely heavily on testing. The underlying library has 100% coverage with about 80x median pathing. I'm pretty proud of that.



Tooling changes

Except later, when I switched to TypeScript, the way I included the generated file had to change, because TypeScript sets the build tree up according to the file furthest up the filesystem chain (which makes sense, but now makes me wonder what it does across disks on Windows).

The reason that matters is that prior to moving the machine to TypeScript/Rollup, it had previously been Flow and Browserify (it's old, sue me), and those set the root of the build according to the entrypoint and exclude things above it from the build.

Either is fine, but I had given my users the same build tree for years and I didn't want to give that up in a tool change, so I chose to move the one generated file into the proper source tree, then delete it after the build. (I'm still not very comfortable with that decision.)

And so, pow, suddenly nyc lights up with weird coverage numbers, and ... and oh, oh I see, I hadn't deleted the generated code until after the marking happened, and...

And that's when I decided I wanted coverage of my peg. If a rule existed, it was code, and it should be part of my 100%, right?



The results

And you know, that sounded great at first. And sure enough, I generated a bunch of test cases that happened to get into the peg's corners

The problem was that initially I didn't understand that it was getting into the peg's corners in ways that don't read correctly on a coverage test, and as such, my coverage, though technically correct, was observationally quite false

It's sort of important to remember that by the time coverage came into this grammar, there were already probably 70-80 rules, and it was already probably 400 SLOC

The host environment also already had more than a thousand test cases, so when coverage came up by accident it was already at a little over 50%

Also, it's nyc, so there's a little column at the right edge of the report that gives you three or four line numbers per file which don't have coverage yet, so I kind of just used those as a guide. Add a test that kills the line numbers I see, re-run, get new line numbers.

I was able to achieve 100% coverage, I thought, in about two weeks.



The first shoe drops

Then I add some random feature. I don't even remember what. I think it might have been stripes.

The thing this grammar is parsing is a nightmare rebake of the basic DOT grammar, for what I originally claimed was uniformity. It's a shorthand for finite state machines.

And so you see sequences in it like

FirstState -> SecondState;
SecondState -> ThirdState;
ThirdState -> FourthState;
FourthState -> FirstState;

which could also be written

Summer -> Autumn -> Winter -> Spring -> Summer;

Or, they can be bi-directional:

solid <-> liquid <-> gas <-> plasma;

Or you can add "actions," which are (potentially shared) events which result in transitions.

On 'disconnect' -> Off 'connect' -> On;

Or, on both sides, for bidirectional links:

solid 'melt' <-> 'freeze' liquid 'boil' <-> 'condense' gas 'decohere' <-> 'cohere' plasma;

... or, those could share action titles, which might actually be useful ...

solid 'heat' <-> 'cool' liquid 'heat' <-> 'cool' gas 'heat' <-> 'cool' plasma;

And the grammar is sufficiently complex that the parser built from the currently-34k grammar is 1.4 meg, or 103k in small-but-slow (almost all of that is indentation, something I want to fix once the repo is ready for PRs again; I just fixed it locally yesterday once I found out about it)

And so you know, when I did this remaining 50% coverage by hand, I was pretty geeked, right? Because it's not just the situation as described above, but rather that written by someone who doesn't know what they're doing.

It Ain't Pretty ™



Your point?

Then I try to introduce stripes, which rewrites roughly the above

solid 'heat' <-> 'cool' liquid 'heat' <-> 'cool' gas 'heat' <-> 'cool' plasma;

into this shorter, more compact form:

-|1 <- 'cool' [solid liquid gas plasma] 'heat' -> +|1;

Might seem silly with an example that short, but as you get these really long stripes of states (hence the name,) they really shorten stuff. And then, the thing I was most excited about using them with immediately broke, namely named ordered sequences:

&states: [solid liquid gas plasma];
-|1 <- 'cool' &states 'heat' -> +|1;

Because that allows a form of applicable modularity on finite state machine states that is very similar to what Java calls a cross-cutting aspect, and which largely makes monstrosities like Hierarchical FSMs unnecessary.

To me, this is gold



Mmm hmm ok stop talking about your project

except lol I did implement that, and my parser supremo shits the bed extra-hard in the unit tests, even though the grammar is showing full coverage full passing

And so I'm Mister Krabs as fuck, right? Literally no idea what's going on.


And then I realize "oh wait, maybe my unit tests are causing coverage on the parser in ways that it shouldn't be showing coverage." And at that point I can't even think of a why yet

So I just move all my unit tests into the wrong directory and run coverage again

Now my parser coverage, which was at 100% from 51%, is at ... at ... 33%? what? 100 - max 51 isn't 49? where did the other 18% go? (Honestly I still haven't figured that out.)

Okay whatever fine shut up. So I just do the process again, climbing nyc's right column until I get clearance, right?

But I had already learned: those coverage numbers did not actually reflect the parser. They reflected any external use of the parser at all. And that's not actually good enough for parser coverage, because that means anything using that gets the line, no matter what's actually on the line.

So I changed my nyc setup to generate two sets of coverage - one for the parser on its lonesome, and another excluding the parser - and I set Travis (lol, this was old) up to have thresholds on both of them.



And lo, brother: thine other shoe too shall drop, finally

And so after I get coverage up, I find the thing where the list doesn't do the stuff with the junk, and I reorder some hojombo, and suddenly -1 <|- &whatever -|> +1; finally works.

LEGIT.

So I'm going to implement the other thing I have like stripes, called cycles, right? The notation is pretty much the same thing, but they don't stop at the end, they wrap around.

So like, now you can do things like

&months: [jan feb mar apr may jun jul aug sep oct nov dec];
&months.{ shape: rect; corners: rounded; background-color: #ace; border-color: black; };

-1 <- 'prev' &months 'next' -> +1;

and actually some other funkier stuff too, like non-n steps; like if you play the card game Magic the Gathering,

&Colors = [White Blue Black Red Green];
-1 <- 'LeftAlly'  &Colors 'RightAlly'  -> +1;
-2 <- 'LeftEnemy' &Colors 'RightEnemy' -> +2;

Write out a grammar chunk-a-chunk, put it right where stripes are 'cause it's almost the same thing, and, off we go to the ra-- hey why is everything on fire

And I realize actually the way +N and -N are parsed out is interfering with the way +|N and -|N are (because this language generally doesn't have numbers, lol)

And now I'm confused and frustrated, because I just implemented the coverage on that stuff

Oh, no, though, I didn't. I went back, and I looked, because I had just earned 100% on coverage without unit tests, and th... the... wait where is it

And I'm looking, right? I'm finding the tests for + or - as the sign (ha) that this is happening, I'm seeing the tests for named sequential lists, I'm seeing the tests for whether the | is there or not, and ...

... and there's nothing for the actual number. When it wasn't a one, it was getting uptaken as a color RGBA hexadecimal. And the unit test faults that because that's horse hockey. (I map every feature with every non-range value and validate per-pairing that the right result comes out.)

But that should have come up in the grammar coverage too.

but but but I have one one one hun hun hundred percent cent cover cover coveahahaaaaaaaaaaa (crying)

No, dear fellow. You don't.

And so I actually traced the thing in a browser to figure out what was happening.

Oh.

Oh, I see.

To be a jerk, I usually add reversed test cases also. So, in addition to

[Summer Autumn Winter Spring] -> +1;

I will also add things like

[Spring Winter Autumn Summer] -> -1;

And that test caused the parser to try the path I thought it had tested, discard it as wrong, and move on.

But the coverage engine didn't know that PEG had discarded it as wrong.

In general, in coverage testing, the assumption is that if a line has been gone over, it's been used correctly, and may be counted as validated.

What I learned the hard way is that in anything involving back tracking and pattern matching at the same time, that assumption is not true.

All coverage shows you at that point is that a line is adjacent to a used line, and that's not actually valuable.

Instead, I have found great success in the alternate practice of generating primitives in a stochastic testing library (I like fast-check) and making sure that every rule has an expressing primitive.
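A minimal hand-rolled sketch of that practice (fast-check replaces the toy `randomFrom` below with real arbitraries plus shrinking; the `Transition` rule and the parser stand-in are hypothetical):

```javascript
// Toy property test: each grammar rule gets a generator (an "expressing
// primitive") that emits strings the rule must accept, and we assert the
// parser accepts every generated sample.

function randomFrom(choices, rng = Math.random) {
  return choices[Math.floor(rng() * choices.length)];
}

// Hypothetical primitive for a rule like:  Transition = Name "->" Name ";"
function genTransition() {
  const names = ['Summer', 'Autumn', 'Winter', 'Spring'];
  return `${randomFrom(names)} -> ${randomFrom(names)};`;
}

// Hypothetical stand-in for the generated parser's rule entry point.
function parsesTransition(s) {
  return /^[A-Za-z]+ -> [A-Za-z]+;$/.test(s);
}

// Property: every string the primitive generates is accepted by its rule.
for (let i = 0; i < 100; i++) {
  const sample = genTransition();
  if (!parsesTransition(sample)) {
    throw new Error(`rule Transition rejected generated sample: ${sample}`);
  }
}
```

The payoff over line coverage is that a failing sample is a concrete counterexample string, not a line number adjacent to a backtracked path.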

@StoneCypher

While I can see how this fuzzing approach could provide some extra sanity checking, I don't see how it could replace the benefits of getting coverage out-of-the-box when a grammar had already been built

Three ways.

  1. The results are correct
  2. It's actually a lot easier to extend a list of generators than a coverage table.
  3. Coverage is itself kind of a hack; even with branch detection, which nyc does not have, it's relatively easy to signal line coverage on things that should be treated as multiple expressions.

Having a stochastic generator is an incremental guarantee of total space coverage - not just of the primary space but also of every combination and expansion you can figure out how to express.

It's less work, it's more powerful, and the results are trustworthy

@StoneCypher

function ArrayWrapOneNumberNeverThrow(n) {
  return (n === 4.2) ? new TypeError('oh noes!') : [n];
}

test('lol', () => expect(ArrayWrapOneNumberNeverThrow(0)).toEqual([0]));

nyc will say "that line has coverage and is tested." Many coverage testers, like the one in jest (which, amazingly, is istanbul, just like in nyc,) or the one in coveralls will realize that there's an unverified branch there, and complain.

If you annotate that with types, typescript can tell you that such and such a branch has type paths which might not come out the way you want. (This can be a false alarm, though, and often it's hard to write types that catch these, and often people typecast wrongly or use any or object and TS can't tell anymore.)

function ArrayWrapOneNumberNeverThrow(n: number): number[] {
  return (n === 4.2) ? new TypeError('oh noes!') : [n];
}

test('lol', () => expect(ArrayWrapOneNumberNeverThrow(0)).toEqual([0]));

TypeScript can tell you there's a problem there. However it doesn't know what it is. In this case it's obvious, but what about

function is_illegal(n: number): boolean {
  return (n % 5) * (n / Math.log(n)) > (n / 4);
}

function ArrayWrapOneNumberNeverThrow(n: number): number[] {
  return is_illegal(n) ? new TypeError('oh noes!') : [n];
}

test('lol', () => expect(ArrayWrapOneNumberNeverThrow(0)).toEqual([0]));

Now you have to do some pretty heavy math to figure out whether that's an actual criterion, and whether that coverage matters, and how to trigger it.

When you're talking about combinations of factors, in parsers, this kind of situation is actually pretty common. Here, TypeScript would be able to say "this might be an issue," but not whether it actually is.

fastcheck will just sit there giving you hard examples

it's a whole different class of testing knowledge

These tools turn up, in days, life-threatening defects in major-vendor car hardware that's been on the road for 20 years

It's not "extra sanity checking." It's a radical revision to what it's possible to test. It's moving from C to C++.

Give it a chance


brettz9 commented Feb 6, 2020

I believe I follow the bulk of what you are saying, but I think there is a misunderstanding here on the capabilities of nyc.

Here is a test repo showing that nyc does show a failure in coverage for the TypeError branch. Note also on https://brettz9.github.io/test-nyc-branching/coverage/ that Branches are listed at only 50% (though "statements" are at 100% given that the ternary conditional which includes one covered branch is a single statement).

Perhaps you had been using an older version of nyc? Such conditions are marked differently in nyc (in yellow rather than red), but they are still reported. Are you speaking about that? Anyways, it can report missing branch coverage, and that is what the #632 PR relies on for indicating failing coverage.
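For completeness, enforcing that branch coverage in CI takes only a couple of real nyc options (the 100% threshold is illustrative):

```json
{
  "check-coverage": true,
  "branches": 100,
  "reporter": ["text", "html"]
}
```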

@StoneCypher

"the coverage is not correct, here's why"

"but it can show coverage, maybe it was old nyc?"

i tried twice. oh well


brettz9 commented Feb 6, 2020

If you really want to prove your point, then give a concrete example in JS where it actually fails. Your JS examples do not fail in nyc.

@StoneCypher

the entire point being that it looks like it's succeeding when it isn't


brettz9 commented Feb 6, 2020

Sorry, what I mean by "do not fail in nyc" is that they do not fail to show a lack of coverage. See my links.
