Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: merge branch/statement/functionMap's together when merging two coverage reports #617

Merged
merged 6 commits into from Sep 13, 2021
Merged

fix: merge branch/statement/functionMap's together when merging two coverage reports #617

merged 6 commits into from Sep 13, 2021

Conversation

JustinChristensen
Copy link
Contributor

@JustinChristensen JustinChristensen commented Jul 3, 2021

@bcoe @coreyfarrell

This is an attempt to fix the file coverage merge problems described in this issue, and a few other open issues.
#322 (comment)

This hasn't been extensively tested, and I'm mainly just spitballing on what merge should look like.

This is a breaking change.

I should also mention that I don't know all of the invariants that should be assumed to hold prior to merging two file coverages. I'm hoping that someone with more background on this code might be able to weigh in.

@bcoe
Copy link
Member

bcoe commented Jul 12, 2021

@JustinChristensen thanks for digging into this... because of how finicky this problem is, I would like to be very careful, and would want to test with v8-to-istanbul as well.

Would it be possible to add tests that represent the scenario you outline in your comment?

@JustinChristensen
Copy link
Contributor Author

JustinChristensen commented Jul 12, 2021

@bcoe

I would say that it's maybe a little early to be talking about tests for this code just yet. I was hoping that someone with more background on this codebase might be able to first review it and do a first blush sanity check, and give some idea of whether or not my assessment of the problem was correct and whether this seems like it's on the right track in being a fix for it (and whether there might be other places that have a similar problem that also need fixing).

This is finicky, but it's also trivial to verify that a mismatch in the number of properties between an item's "stats" and "meta" would lead code like

const metaArray = branchMeta[branchName].locations;
to error (when there is more "stats" properties than there is "meta" properties). The question is whether this is the only such case where these pairs of properties might get out of sync.

Let me know how you want to proceed.

@JustinChristensen
Copy link
Contributor Author

@bcoe Also, it might not give you as much confidence as an actual unit test, but I did verify this with the FileCoverage objects I mentioned in that comment, and more broadly with all of the coverage objects in my projects' test suite, and this seemed to work just fine.

@JustinChristensen
Copy link
Contributor Author

Some potential areas of concern:

  1. Is the start and end range on the loc property of branches and functions, and on statements themselves, enough to uniquely identify these objects while taking the union of the two sets?
  2. Will there always be the loc property or both start and end? Does this codebase enforce that invariant?
  3. What's the point of all?

Also, I just tried running the tests, and I see that the eslint configuration for this project doesn't like my style. Fixing that now.

@JustinChristensen
Copy link
Contributor Author

JustinChristensen commented Jul 12, 2021

@bcoe

Heh, the way the unit tests are written indicate that the properties in brancMap won't have a loc property, and yet the branchMap properties from v8-to-istanbul do (as you can see in the FileCoverage objects I shared in that gist in the comment).

So I think I have my answer for 2 in the above list.

@JustinChristensen
Copy link
Contributor Author

I added all back in, because there's a test that expected it.

Another test was expecting "map" properties where the keys started at 1 instead of 0 for some arbitrary reason. Instead of making the code keep track of those keys during the merge, I opted just to shift the keys over in the test, as I can't imagine why it would need to be that way.

@bcoe
Copy link
Member

bcoe commented Jul 14, 2021

@JustinChristensen perhaps we could ask the folks in #322 to provide a repo with a minimum reproduction of the merging issue? This would make it easier for me to understand the problem, also I'd like to test against c8 as well.

@JustinChristensen
Copy link
Contributor Author

@bcoe Oh, I see, you were asking for tests to help explain the issue. I thought we had already sufficiently done that in prose, but if you're looking for some additional assistance I can work on putting together an example for you.

@JustinChristensen
Copy link
Contributor Author

JustinChristensen commented Jul 14, 2021

@bcoe

Alternatively you can clone https://github.com/JustinChristensen/react-sample-app, and run yarn start and then in another tab yarn e2e, and then open ./coverage/src/components/EmployeeFields.jsx.html in your browser.

@bcoe
Copy link
Member

bcoe commented Jul 14, 2021

@JustinChristensen I'm unlikely to have cycles to work on this in the immediate term. As long as we're testing in linked against c8 and v8-to-istanbul as well, I'd be pretty confident with changes.

@linliTian
Copy link

any updates of this PR? Is there any timeline it will merge ?

@bcoe
Copy link
Member

bcoe commented Sep 7, 2021

@JustinChristensen, @linliTian apologies, unfortunately no timeline for merging this, I've had very little time to dedicate to open source recently, and this change is finicky enough that I'll need to dig into it deeply -- I'm sure that Justin has put great work in, but I need to do some manual testing in a variety of environments, since this is a non-trivial change to one of the more finicky parts of the codebase.

const keyFromLocProp = x => keyFromLoc(x.loc);
const keyFromLocationsProp = x => keyFromLoc(x.locations[0]);

[hits, map] = mergeProp(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One slight concern I have about this change is the number of additional operations that will be performed on each coverage report as we iterate over all these branch coverage operations.

Maybe there's no way around it, if this is necessary for correct merging behavior, but it would be good to find a large codebase to test against.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bcoe The previous incarnation looped through each of these properties, including the branch hit counters, for just the "other" object. This new algorithm also needs to iterate through the properties for "this" object too. So in the worst case I would expect this to be potentially twice as slow, but that's a low price to pay for correctness.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JustinChristensen I'm pretty sure on large codebases using c8, it won't ultimately merge many coverage files because the merge happens at prior to processing when the output is in v8 format.

I suppose the concern is a large codebase using tap and nyc, the npm CLI might be a good repo to test with -- if it works with npm, I'd be pretty confident we don't have too huge of a performance hit.

@JustinChristensen
Copy link
Contributor Author

JustinChristensen commented Sep 8, 2021

@bcoe Also, I yeah I figured you were busy. Are you the only maintainer for this project? I figured istanbul would be a little more active with how important of a role it plays in the JS ecosystem.

2: loc(2, 1, 2, 50),
3: loc(2, 51, 2, 100),
4: loc(2, 101, 3, 100)
0: loc(1, 1, 1, 100),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit odd to me that this new logic would have switched tests to being 0 offset vs., 1 offset.

Is this expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bcoe

The previous incarnation of merge preserved the object keys while zipping up the hit counters. As those keys are just ascending integers, and not a kind of identifier, it didn't seem particularly important to preserve them. Unless that behavior is documented somewhere, I suspect that the test was written with that unimportant implementation detail in mind, and not to validate an API's guarantees.

So my implementation just iterates through the hit counters in the order of Object.values, which I believe is now insertion order, for both the mappings and the hit counters. If that is now guaranteed to be insertion order then it should preserve the key correspondence between the mappings and the hit counters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I probably should check the spec on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good 👍

@bcoe
Copy link
Member

bcoe commented Sep 13, 2021

@JustinChristensen I put some work in this weekend to make releases of istanbulsjs/istanbuljs fully automated, so that it's easier to take patches like this, and to iterate quickly if we do find we bump into any hiccups, e.g., misunderstood behavior slightly and introduce a bug.

This gives me more confidence to work on landing a patch like this 👍 I appreciate having someone dive into this codebase, and I hope how long I've taken to land this PR doesn't discourage you from further contributions (help is very much needed and wanted.)

@bcoe bcoe changed the title File coverage merge WIP fix fix: merge branch/statement/functionMap's together when merging two coverage reports Sep 13, 2021
@bcoe bcoe merged commit ff1b5e9 into istanbuljs:master Sep 13, 2021
@bcoe
Copy link
Member

bcoe commented Sep 13, 2021

@JustinChristensen thank you for the contribution, next step is that I'd like to test on a few large real world projects before we release.

@JustinChristensen
Copy link
Contributor Author

@bcoe Sounds good. I don't maintain any large open source projects to test this on, and so you'll have to let me know how the testing goes. I'm happy to make any fixes that are needed based on your findings.

@dementors-magnus
Copy link

@bcoe Would it be possible to make an "@next" release to make it easier to test? We have a private repo with the same error as in #572 that I would like to check if this solves the issue.

@bcoe
Copy link
Member

bcoe commented Sep 23, 2021

@JustinChristensen thank you very much for the contribution 👏

I've tested so far on c8, yargs, and one of the libraries I work on at Google, behavior seems to be stable so far. Excited to see if this is an improvement for people.

@dementors-magnus let us know if this does the trick for you/

@dementors-magnus
Copy link

@bcoe Tested and from our perspective #572 is resolved with this fix! A big thank you to @JustinChristensen!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants