New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed startCol/endCol vs startOffset/endOffset mismatch due to greediness and whitespaces #172
Conversation
#81 was caused mostly by: v8-to-istanbul/lib/v8-to-istanbul.js Lines 179 to 181 in 53c1cd8
because the v8 startOffset was 50 while the line startCol was 48 that's why I introduced minCol and maxCol which take into account the gaps caused by whitespaces The part I am not 100% sure of is the handling of lines that have only whitespaces (like 3 in the issue example had only 2 spaces). I added a check that minCol < maxCol to exclude them from the recording. |
I need to come up with a better solution to extend the ranges covering blank lines... |
@@ -2871,7 +3154,7 @@ Object { | |||
"37": 0, | |||
"38": 1, | |||
"39": 1, | |||
"4": 1, | |||
"4": 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work, I think we might just want to call this a breaking change when we land, but then take it without bumping the major of c8
, or other deps, this will allow each dependent library to update their tests.
@@ -279,8 +279,8 @@ Object { | |||
"line": 35, | |||
"loc": Object { | |||
"end": Object { | |||
"column": 0, | |||
"line": 37, | |||
"column": 29, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SimenB @bcoe sorry for the noise...this was a hard nut to crack. I reopened the PR now that I found a solution to the range mismatch problem that I think is effective. v8-to-istanbul/lib/v8-to-istanbul.js Lines 179 to 181 in 53c1cd8
so I ended up resorting to compare the ranges after having trimmed the white spaces which guarantees more reliable matches. I checked out C8 and linked it to my branch to see if something was getting broken in there and to my surprise I found I just fix some bugs in the coverage there.5 cases that are very similar: take this one for example: the expected range 27-28 was actually wrong because it was leaving out the method name. At this point I would like you to review my approach and if I have your blessing I intend to look into how to optimise it. |
@glromeo thank you for this contribution 👏 it might take me a little while to review, just because this library is complex and I need to page it into memory, in the hopes of not breaking anyone. |
@glromeo I think the best way to ultimatley test, would be to link against |
@bcoe I fixed the shebang snapshot and now all tests are ok (that slipped before because I was using node 16.x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Realized I left a couple comments but didn't leave the review.
Thank you for this contribution, it's looking great. I'll need to do some thorough testing with c8
in a few codebases, just to make sure things are looking good.
Have you managed to do any real world testing with this branch?
let upper = lines.length - 1 | ||
|
||
let s | ||
while (lower <= upper) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate that you pulled this into a helper file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, I actually spent some time asking myself: is it better to do 2 binary searches or not? At the end I preferred to opt for only 1 binary search and then linear scan because it's unlikely that in a large file a range would be too big anyway so I guess that the linear search should perform better
@@ -2871,7 +3154,7 @@ Object { | |||
"37": 0, | |||
"38": 1, | |||
"39": 1, | |||
"4": 1, | |||
"4": 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work, I think we might just want to call this a breaking change when we land, but then take it without bumping the major of c8
, or other deps, this will allow each dependent library to update their tests.
@glromeo let's release a candidate release of this work which you can pull into |
thank you @bcoe
the |
@glromeo I usually test with the projects below for a major update:
If I'm feeling really ambitious, I'll test on the Node codebase, which uses c8 for coverage. I have Monday off, so can potentially do someof this testing, if you don't beat me to the punch. I'll also prioritize giving a thorough review of this PR. |
@glromeo I did some initial testing in https://github.com/bcoe/c8/blob/main/test/fixtures/all/ts-only/main.ts Indicates that only lines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ti might be worth splitting this into two PRs, one that introduces the optimized line gathering -- which we can then benchmark against the old approach (I think this will be a significant benefit for large codebases).
And then I think we might want to dig into the false postives separately. Source Maps are a pain in the neck because they themselves are a best guess of a remapping, so tiny changes to things like line endings can break them -- tldr, we'll need to be careful.
@@ -4,12 +4,14 @@ module.exports = class CovLine { | |||
// note that startCol and endCol are absolute positions | |||
// within a file, not relative to the line. | |||
this.startCol = startCol | |||
this.minCol = startCol + lineStr.length - lineStr.trimStart().length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trimming lines may actually be the cause of the remapping issues I saw in c8
, since we specifically see breakage with tests that exercise SourceMap files.
The source map lookup logic is quite finicky, and trimming before performing it may need to be avoided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok! I'll split the PR first and then I'll look into the breakages... thank you for pointing them out! |
Given the findings in #173 I better close this PR at this point and open a new one later on |
This PR aims at solve #170 and solve #159.
I implemented the kind of binary search that @bcoe suggested in #159 and I tweaked the range check because the
I noticed that the = in
v8-to-istanbul/lib/source.js
Lines 78 to 80 in 53c1cd8
seems to be what was causing #170 because tends to overflow on unmapped lines, the range is considered invalid and as a result there are false positives in coverage.
I am keen to investigate and fix also #81 as part of this