New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature(fix): function parameter tracking #5483
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5483 +/- ##
=======================================
Coverage 98.81% 98.81%
=======================================
Files 238 238
Lines 9451 9540 +89
Branches 2408 2436 +28
=======================================
+ Hits 9339 9427 +88
- Misses 47 48 +1
Partials 65 65 ☔ View full report in Codecov by Sentry. |
Not sure I fully get the underlying connections, but at first glance I feel very uneasy about reverting includes. It has been a reliable behavior for tree-shaking that things only ever get included, and it could even make the algorithm non-deterministic if we change that. In the worst case, we include something unnecessarily, but I would rather have that than have the algorithm not reliably converge. |
Wow, yes, the real problem here is not don't include anything unnecessary, but make sure the children include has the latest state. I'll rethink it this way. |
I think the problem is that branch resolution should never go from analyzed to not-analyzed but rather from analyzed to unoptimized as it used to be. That ensures that it stays deterministic. We CAN re-analyze the resolution, but if the result is that then the other branch is picked, we should instead go for not-analyzed IMO. Not sure I will manage to do a rollback now, but maybe it is ok to leave the issue open for a day. |
I think the problem is not restricted to picking the other branch, a short example (still I have not reproduced, so only an example): function foo(l, r) {
// l and r may be a complex expression
console.log(l && r)
}
foo(l, r) If in the first pass, l and r are both unknown, both l and r will be included. However, the value of r is unknown does not mean nothing happens inside, there is probably something happens in r, and request a tree-shaking to update I force pushed a commit, to keep the idea with "include", make sure if one branch is included, then it will always be included in next tree-shakings. Also I think it may be fine to update Still this fix may not be complete, I'm checking if there are other situations a node "included" but |
I'm still not super happy about this. The idea that an unused branch could be included seems wrong to me. Why not go back to the original implementation for now where something that is unknown at one point remains unknown? We would still keep most of the function parameter optimization but this part would be on more solid footing? |
Looking at your example, why is the function call not known during the first pass when the function is included? After all, it is only included because of the function call. |
I believe it is a cache problem, the function gets included because of the function call but before that it seems it has an cached value of unoptimized. (expression cache test (which fails if I don't set One other test with a conditional expression will fail too, but yes, we still have most of the optimizations, and don't include unused branch is reasonable, so I'll remove EDIT: I just found Reason: For Consider: include () {
// in the first include we clear the branch cache
if (!this.included) this.testValue = unset
// original code ...
} apply this change to |
Maybe we just revert the change to IfStatement for now as well, i.e. deoptimizeCache(): void {
this.testValue = UnknownValue;
} as it was before, and we can then think about a better long-term approach after this is released? |
It almost discards all of the parameter optimizations, so maybe we revert main branch to 4.15 code, and move all commits after that to a new development branch? |
I created #5487 to revert for now. While this is not as nice as rewinding master back to an earlier version, it ensures that master is kept in a releasable state while maintaining the release history and avoiding force pushes. |
Revert "Revert function parameter tracking logic for now (rollup#5487)" This reverts commit 74da135.
So I just reverted #5487 in this PR, we can collect and fix bugs for function parameters tracking here. Some known issues:
|
Ok, I have some thoughts about the problems we were facing. As you correctly saw, the function parameters are collected far too late. The problem is not the collection, though: This happens on Here is a proposal how this can be fixed:
Beyond that, any functions like As the CallExpression should be traversed before the function body is traversed, the call values should usually be present. However, if we overlook anything and the ParameterVariable is queried before it has a call value (or before we know there is a usage that we do not track), we could just set the value to Unknown and keep it like that. But I do not expect this to happen. What do you think? |
Yes I thought of There are one problem and some thoughs I'd like to share now: One problem:In the first call of a function, everything starts from hasEffects(context: HasEffectsContext): boolean {
try {
for (const argument of this.arguments) {
if (argument.hasEffects(context)) return true;
}
if (this.annotationPure) {
return false;
}
return (
this.callee.hasEffects(context) ||
this.callee.hasEffectsOnInteractionAtPath(EMPTY_PATH, this.interaction, context)
);
} finally {
if (!this.deoptimized) this.applyDeoptimizations();
}
}
So I wonder (sorry for not understanding the logic here well_:
Some thoughtsOnly known and unknown states are neededIn traditional constant propagation optimization, we must init a top state for every variables. This is because we have no idea of what constant value a variable can be at the beginning. However things have changed for tree-shaking, as tree-shaking only includes nodes needed, it naturally generates a "call tree/graph", so we actually only need two state: from known to unknown, which is similar to other parts of tree-shaking, really cool. See code below: function foo1(a) {
// ...
}
foo1(0)
foo2(0)
function foo2(a) {
// ...
} We are pretty sure the parameter value is 1 or unknown, from the first "include". And in multi-level function call like code below: function foo1(a) {
// a is 1 here
foo2(a)
}
function foo2(a) {
// ...
}
foo1(1)
Update states in every include?In previous implemention, I update function parameters in every include to make sure converge. However, it seems we don't have to do so. Making an Data-flow analysis optimization converge usually has two ways:
I found |
So I update my code:
The It's interesting that we have a big refactor which reduces 60 lines of code because of bugs😀 |
As code gets smaller, I can take more time to see where the problems might be and I found one. function foo(a) {
arguments[0] = 0
eval("arguments[0] = 0")
if (a) {
console.log(1)
} else {
console.log(0)
}
}
foo(1) While I assumed functions parameters/arguments to be static, It seems not. Luckily it's the only thing I found break the rule. From my perspective, we need to make EDIT: I found in strict mode there is "No syncing between parameters and arguments indices", so this should not be a problem. |
Honestly, I am not sure, I guess the order just did not matter. Try switching the order around, if no test fails it should be fine.
Interesting that you mention this. In the far future, I would envision Rollup to transfer most of the tree-shaking logic to what you call a work-list algorithm. So one would only have a single full pass and then maintain a list of "dirty" nodes that need to be re-evaluated. For myself, I always called this a "reactive" algorithm, but work-list probably fits better. If done right, this should improve performance quite a bit. However, it would require to know all dependencies of all nodes and this feels quite complicated to implement. I think I will have a look at your changes by tomorrow.
|
Ah but you are right, reassignments of parameters themselves should not be tracked in |
No test fails, I find #4148 related. I will have a look today.
Really cool, in LLVM IR it's easy because it is trivial to build a use-def chain for
I'm just a little worried because I'm not familiar with how people would use rollup, and maybe there are some legacy code depends on this behavior. The fix is relatively easy, though. export default class FunctionNode extends FunctionBase {
super.include(context, includeChildrenRecursively);
this.id?.include();
const hasArguments = this.scope.argumentsVariable.included;
+ if (hasArguments) {
+ this.deoptimizeParameters();
+ }
// ...
} |
style: move deoptimizeCache logic style: naming style: naming
style: move deoptimizeCache logic style: naming style: naming remove useless comment style: small update
This is to make sure if onlyFunctionCallUsed becomes false, there is a new tree-shaking pass to deoptimize it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks quite good now. I wonder if we should try another release :)
Thanks, I think it's definitely worth trying. |
This PR has been released as part of rollup@4.17.0. You can test it via |
Apologies for not providing a concise bug report in the form of a new issue, but I can’t really share the actual reproducible code (private repository). If I find some time, I’ll try working on a proper bug report (edit: someone else one: #5502). I’m pretty sure 4.17.0 has the same regression I’ve seen in rollup >=4.16.0 <4.16.4. The following kind of code (I obfuscated the real code but kept, hopefully, the syntax and logic sufficiently accurately): import { exportA, exportB } from 'package'
export function fn(a, b, c) {
return async (d, e) => {
const { result } = await exportB(d)
if (!b.prop1?.prop2?.prop3) {
c.error = new Error('', { cause: { prop: a.fn('string') } })
c.dispatchEvent(new ErrorEvent('error', { error: c.error }))
return true
}
if (!result) {
if (!exportA) {
c.error = new Error('', { cause: { prop: a.fn('string') } })
c.dispatchEvent(new ErrorEvent('error', { error: c.error }))
}
return true
}
}
} becomes, functionally, in the output: import { exportA, exportB } from 'package'
export function fn(a, b, c) {
return async (d, e) => {
await exportB(d)
c.error = new Error('', { cause: { prop: a.fn('string') } })
c.dispatchEvent(new ErrorEvent('error', { error: c.error }))
return true
}
} I think the second |
If you want to help us track this down, we need to know what all places where the function
For some reason, it can neither track the call nor can I detect that it "lost track" of the function. It could also be of interest if the function is passed to other functions, assigned to object properties or global variables etc. |
Of course, but first, let me start by saying that 4.17.2 (issue, PR) fixes the issue I describe in #5483 (comment).
prop: (...args) => {
return [fn(...args)]
},
In the file that contains the sole call of Looking at https://github.com/rollup/rollup/pull/5503/files, it seems my example might be exactly what’s fixed in that PR. If you think there could be more to this, I’d be happy to provide more information of course. |
Thanks for taking the time to respond, this gives me some confidence that @liuly0322 got to the root of the problem here. Lets see if anything else pops up in the next days, but so far it is looking quite good! |
This PR contains:
Are tests included?
Breaking Changes?
List any relevant issue numbers:
Description
This PR is to point out a problem with 4.16, and discuss if we should revert it first if the work to do is heavy.
Unused node should not be included, otherwise:
In the first visit, assuming no optimization info for node1, so node2 and node5 will both get included. and node6 may get not included.
However, if in the second visit, node5 is excluded, becaused it is still mark
included
, in render stage renderer will try to render node5 and find node6 or node 7 error.This PR can fix #5480. (no tests because the condition is a bit harsh and I have not reproduced a smaller one)
While it is fixable, It actually reveals: we need a way to revert
include
. This can be done in two ways to make sure no problems:included=false
in everyinclude
function that may only include some children (ordeoptimizePath
like in the PR, it actually makes no difference becausedeoptimizeCache
will request oneinclude
, but place it ininclude
may be better because there may be other functions causing children included change (in the future?) and request ainclude
)included
to check, but only usesusedBranch
or other stuffs like that. I guess it's not how it is designed.This PR only achives a part of the first way. I'll be off for next 12 or more hours, so if it's too heavy to do, maybe we should revert version 4.16 ...