Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chain of never resolved promises create memory leaks #179

Open
jonathanong opened this issue Dec 26, 2014 · 61 comments
Open

chain of never resolved promises create memory leaks #179

jonathanong opened this issue Dec 26, 2014 · 61 comments

Comments

@jonathanong
Copy link

is this the right place to ask this?

tj/co#180 (comment)

@ForbesLindesay
Copy link
Member

This is part of the reason for having 2.3.2 in the Promise Resolution Procedure:

If x is a promise, adopt its state

This is separate from the specification of the resolution procedure for thenables. Because it doesn't specify how that state is adopted, we allow the promise to use some internal means to unravel the stack (providing all promises come from the same implementation). This probably hasn't been implemented in practice though, and the spec does not explicitly state that a reference should not be held.

I'll have a go at testing and fixing this in then/promise and post back here to let you know how that goes.

@ForbesLindesay
Copy link
Member

I have made something that I think fixes the memory leak in then/promise (then/promise#67) It would be great if you could give it a more thorough testing to see when/if it fails. I'm particularly concerned that it might have created memory leaks where there was none before.

@bitinn
Copy link

bitinn commented Dec 27, 2014

We implement a job queue using this exact pattern, should we be worried? Seem like a mistake a lot of developer will make, eg. http://stackoverflow.com/questions/15027192/how-do-i-stop-memory-leaks-with-recursive-javascript-promises

Please let me know if I missed the point...

@ForbesLindesay
Copy link
Member

@bitinn yes, unfortunately that will leak memory (until the promise chain unravells) with most current promise implementations. It should work fine with the fix linked to above (using then/promise rather than Q).

The solution is very similar to the concept of tail recursion in synchronous code.

petkaantonov added a commit to petkaantonov/bluebird that referenced this issue Dec 28, 2014
Fixes the memory described at
promises-aplus/promises-spec#179
@bitinn
Copy link

bitinn commented Dec 28, 2014

@ForbesLindesay

seem like if I do this it won't leak (return p instead of p2 will leak), is this a good approach or still implementation dependent?

var i = 0;

function run() {
    var p = new Promise(function(resolve) {
        if (i % 10000 === 0) {
            global.gc();
            console.log(process.memoryUsage());
        }
        i++;
        setImmediate(resolve);
    }).then(run);

    var p2 = new Promise(function() {
        return p;
    });

    return p2;
}

run();

@petkaantonov
Copy link
Contributor

@bitinn much simpler would just be leaving out the return if you don't need it

var i = 0;

function run() {
    new Promise(function(resolve) {
        if (i % 10000 === 0) {
            global.gc();
            console.log(process.memoryUsage());
        }
        i++;
        setImmediate(resolve);
    }).then(run);
}

run();

@bitinn
Copy link

bitinn commented Dec 28, 2014

@petkaantonov got it, cheers, we kinda need it to return promise as we called it with this.run().catch(...) but I can see it's not required for recursion.

@ForbesLindesay
Copy link
Member

@bitinn

var p2 = new Promise(function() {
  return p;
});

is equivalent to:

var p2 = new Promise(function() {
});

because the return value of the factory function is ignored. You probably meant to do:

var p2 = new Promise(function(resolve) {
  resolve(p);
});

which would have still leaked.

For the original example of doing something asynchronous continuously:

function run() {
  return doAsyncWork().then(run);
}

run();

you could fix the memory leak in all promise implementations by breaking the chain of promises:

function run() {
  return new Promise(function (resolve, reject) {
    function recurse() {
      doAsyncWork().then(recurse, reject);
    }
    recurse();
  });
}

run();

It's a little dirty, but by having an inner recurse function that does not return a promise, I am able to break the chain each time the function recurses. To make sue each recursion also handles errors by passing them to reject I maintain the same error handling behaviour as the original.

@ForbesLindesay
Copy link
Member

An interesting question, should we specify that promises are not permitted to leak memory in this use case? What would such a specification look like.

@petkaantonov
Copy link
Contributor

Well the whole 2.3.2 needs to be redone.

As far as I can tell we both fixed the memory leak in our implementations by turning promise into a proxy for x: any operation performed on promise will be redirected to x. Any operation already performed on promise are immediately redirected to x (which is possible because both must still be pending at this point).

This is different from what the spec currently says, promise is now never pending, fulfilled or rejected, it doesn't have its own state at all. If it had, that means x would need to inform promise when it changed state so that promise can change its state and that means x holding reference to promise which leads to the original leak.

In the case x is already fulfilled or rejected then it's simply same case as if the original handler returned direct value or threw direct error.

@stefanpenner
Copy link

An interesting question, should we specify that promises are not permitted to leak memory in this use case?

The spec as written today does allow implementors to avoid this leak. This leads me to believe that it may not be a spec concern rather the concern of implementors. I suppose one could argue that if many implementations exhibit this behavior, the spec wording may need to be refined. That being said, this feels like the responsibility of a supplementary compliance test suite.

@ForbesLindesay
Copy link
Member

@stefanpenner As suggested by @petkaantonov, we haven't really done what the spec says to do. In order to fix the leak we had to make promise into a proxy for x, which is not quite the same thing as making promise adopt the state of x. The difference is not observable to the outside world, but we aren't really complying with what the spec says to do.

@stefanpenner
Copy link

The difference is not observable to the outside world

I believe the spec intends to describe the observable behavior, not implementation.

I do believe the wording can be improved, but I am nervous of the spec dictating implementation details.

@petkaantonov
Copy link
Contributor

What I just specified does change observable behavior

var resolveA;
var a = new Promise(function() {
    resolveA = arguments[0];
});

a.then(function() {
    console.log("first");
});

var resolveB;
var b = new Promise(function() {
    resolveB = arguments[0];
});

b.then(function() {
    console.log("second");
});

resolveA(b);

b.then(function() {
    console.log("third");
});

resolveB();

@petkaantonov
Copy link
Contributor

Bluebird before the fix (and e.g. Q, both leak) will log second, third, first

While after the fix it logs second, first, third.

@stefanpenner
Copy link

@petkaantonov ah, since during followers merge some ordering information is lost.

@stefanpenner
Copy link

i wonder if the 2.3.2 adopt state is more accurately followed post fix.

@stefanpenner
Copy link

I suspect the solution is to refine and elaborate on the "adoption" algorithm

@stefanpenner
Copy link

i wonder if the 2.3.2 adopt state is more accurately followed post fix.

This may by true, but now their is an observable difference between situations were all promises are ownPromises, vs a mixture of foreign and ownPromises. This seems quite bad.

@petkaantonov
Copy link
Contributor

Yes you will get different order if you do this:

var Promise = require("bluebird");
var Thenable = require("rsvp").Promise;

var resolveA;
var a = new Promise(function() {
    resolveA = arguments[0];
});

a.then(function() {
    console.log("first");
});

var resolveB;
var b = new Thenable(function() {
    resolveB = arguments[0];
});

b.then(function() {
    console.log("second");
});

resolveA(b);

b.then(function() {
    console.log("third");
});

resolveB();

However, 2 of the .thens are called on a Thenable, there is no reason to expect them to be in any specific order

@stefanpenner
Copy link

However, 2 of the .thens are called on a Thenable, there is no reason to expect them to be in any specific order

Up until this discussion I would have expected the interaction between 2 spec compliant implementations to not be observably different from the same interactions happening between promises of 1 implementation.

If it wasn't for the observable differences when mixing 2 implementations, I would prefer the ordering of the "fixed" adoption strategy. Unfortunately with this difference, I am now unsure.


Illustrating the differences [https://gist.github.com/stefanpenner/10afbee137180c2e7862]

rsvp (pre fix) x bluebird (post fix)

rsvp, rsvp -> [ second third first ]
rsvp,   bb -> [ second third first ]
bb,   rsvp -> [ second third first ]
bb,     bb -> [ second first third ]

rsvp (post fix) x bluebird (post fix)

rsvp, rsvp -> [ second first third ]
rsvp,   bb -> [ second third first ]
bb,   rsvp -> [ second third first ]
bb,     bb -> [ second first third ]

rsvp (pre fix) x native

rsvp,      rsvp -> [ second third first ]
rsvp,    native -> [ second third first ]
native,    rsvp -> [ second third first ]
natuve,  native -> [ second third first ]

rsvp(pre fix) x then/promise (post fix)

rsvp, rsvp -> [  second third first ]
rsvp, then -> [  second third first ]
then, rsvp -> [  second third first ]
then, then -> [  second first third ]

rsvp(pre fix) x then/promise (pre fix)

rsvp, rsvp -> [  second third first ]
rsvp, then -> [  second third first ]
then, rsvp -> [  second third first ]
then, then -> [  second third first ]

RSVP (pre fix) X when

rsvp, rsvp -> [  second third first ]
rsvp, when -> [  second third first ]
when, rsvp -> [  second third first ]
when, when -> [  first second third ]

[edit: added when]

RSVP (pre fix) X when (master)

rsvp, rsvp -> [  second third first ]
rsvp, when -> [  second third first ]
when, rsvp -> [  second third first ]
when, when -> [  second third first ]

[edit added: when#master]

@Arnavion
Copy link

If one treats all promises as "foreign" (i.e., ignore 2.3.2 entirely and only use 2.3.3.3) then the order is second-third-first, if I'm not mistaken. Should that be considered the canonical order if 2.3.2 is going to be amended even for implementations that implement 2.3.2?

@briancavalier
Copy link
Member

second, third, first seems like a reasonable order. On the other hand, is there any real relationship between the handlers added to A and those added to B? If so, it seems like it could be very hard to visualize when promises are spread out. Would it be better just to say that the order is undefined? IOW call it a mistake to write code that depends on it, and instead, devs should explicitly use then to guarantee the order they need between promises?

Conceptually, it seems reasonable to say that since A is resolved to B (making A and B effectively indistinguishable), resolving B causes A and B to become fulfilled simultaneously. IOW, there is no time when B is fulfilled and A is not, and vice versa. Then, first, second, third actually makes some sense. That seems like it could be hard to guarantee when mixing promise impls, though.

@stefanpenner
Copy link

second third first seems like the only one we can guarantee when mixing implementations.

It is great to see when can achieve both reasonably here.

@Arnavion
Copy link

In that case, should 2.3.2 just be dropped from the spec, along with the concept of own-vs-foreign promises? Let all promises be handled by the generic thenable case? Implementations can still implement 2.3.2 on their own as an optimization, of course.

Also, does this affect Promise.resolve() in any way? Is it okay for Promise.resolve(v) to return v sometimes and new Promise(resolve => resolve(v)) other times? Or should it always be made to return the latter?

@ForbesLindesay
Copy link
Member

@Arnavion The point is that if you drop 2.3.2, implementations would not be allowed to implement that optimisation while adhering to the spec. The spec would then be forcing them to implement it in the way that leaks memory.

Also, we should probably specify that implementations should not leak memory in this case (which requires the optimisation).

Promise.resolve is not actually specified by the Promises/A+ spec. I can't remember what ES6 states off the top of my head, but I think in general there aren't any problems with it just returning v when v instanceof Promise

@Arnavion
Copy link

@ForbesLindesay

I don't think implementations would be prohibited from implementing 2.3.2 just because it's not in the spec. They only need to implement it in such a way that it gives the same result as 2.3.3.3 would have. The spec shouldn't be expected to be translated line-by-line to code.

Of course, we can also just keep 2.3.2 and modify it to read something like that instead. Eg. "If x is a promise, then adopt its state in such a way that it gives the same results as 2.3.3.3"

@ForbesLindesay
Copy link
Member

I think the best way to think of this is as being a request for tail recursion on promises. i.e.

function run(i) {
  return i < 99999999 ? run(i + 1) : i;
}
console.log(run(0));

would run out of stack and cause an exception. Similarly:

function run(i) {
    return  new Promise(function(resolve) {
        setImmediate(resolve);
    }).then(function () {
      return i < 99999999 ? run(i + 1) : i;
    });
}
run(0).then(function (result) {
  console.log(result);
});

Maintains an ever increasing stack of promises, resulting in an out of memory error.

@ForbesLindesay
Copy link
Member

One issue, in the case that a adopts the state of b, there are two options (and only two as far as I'm aware):

  1. a moves all it's pending handlers to b, then a maintains a reference to b, all future calls to a.then attach their handlers directly to b.
  2. a calls b.then(fulfillA, rejectA), which causes b to maintain a reference to a and fulfill/reject it as appropriate later. This is what the existing ES6 spec does.

The third option is to make use of weak-references in order to squash the chain in both directions. It works something like this:

  1. a moves all it's pending handlers to b
  2. a maintains a reference to b
  3. all future calls to a.then attach their handlers directly to b.
  4. b maintains a weak reference to a
  5. when b adopts the state of c it messages a, which replaces its reference to b with a reference to c.

At the end of that process, the head and tail (a and c) may be explicitly referenced by user code, without any code keeping a reference to b.

@ForbesLindesay
Copy link
Member

As for test case, The only way I've figured out of testing that this is working is to profile the memory as in https://github.com/then/promise/blob/master/test/memory-leak.js

The ordering change is that for:

var A = new Promise();
var B = new Promise();
RESOLVE(B, A);
B.then(function () {
  console.log('B');
});
A.then(function () {
  console.log('A');
});
RESOLVE(A, null);

would log B, A instead of A, B unless you do some clever workaround.

@bergus
Copy link

bergus commented Apr 15, 2015

@domenic Let me try to put this down:

Nothing needs to change in the current Promises/A+ spec. There might be amendments to make, of varying strength, to the "adopt state" step, as point 2.3.2.4:

  • "When xis fulfilled, the onFulfilled callbacks of promise must be executed after the onFulfilled callbacks of x that were installed when the adoption happened."
  • "When xis fulfilled, the onFulfilled callbacks of promise must be executed after all the onFulfilled callbacks of x (at the time of the fulfillment)."

(and respectively for rejection and onRejected callbacks, point 2.3.2.5).

The first specifies basically what would happen when a thenable would be assimilated, where a then call on x must execute the resolvePromise/rejectPromise callback that fulfills/rejects promise only after callbacks from previous x.then calls.
The second strengthens this to include further x.then() calls after the adoption. It basically describes the behaviour of thenable assimilation if we assume that resolvePromise/rejectPromise are asynchronously calling callbacks and callbacks are executed in a synchronous sequence (not given!). I would recommend to avoid this, as it prevents the "straightforward proxy implementation" (as Petka called it), and leaks memory for a deeply recursive adoption (a adopts b which adopts c which adopts d…), or at least makes implementation unnecessary complicated (haven't completed my thoughts on this yet).

We further might want to nail down the behaviour (callback ordering) when x adopted multiple times, by promise1 and then by promise2. Should promise1.then() callbacks be executed before promise2.then() callbacks?

A test case could be derived from the examples given above (without the "third" callback).

Going to look at the ES6 spec now.

@bergus
Copy link

bergus commented Apr 15, 2015

Ah thanks at @ForbesLindesay I obviously hadn't seen your post. Yes, that's exactly what we want to get. This third option with the "weak references" is just what I have implemented in my own Promise library. And indeed, my library does do B A, and to change that it would require some heavy workaround.

@petkaantonov
Copy link
Contributor

@bergus how do you do weak references? there is just a WeakMap. Are you using that?

@bergus
Copy link

bergus commented Apr 15, 2015

@petkaantonov no, I use an "indirect reference". Promises that adopt each other share a common handle that contains a reference to innermost adopted promise (somewhere around https://github.com/bergus/F-Promise/blob/b66e917e2bf8836dab824875e77170f088e8361c/Promise.js#L151). This handle is as only reference to b, and when b adopts c it changes the handle's .resolution to c without having a reference to a.

@bergus
Copy link

bergus commented Apr 15, 2015

@domenic

What should change in the ES6 promises spec

I don't think anything specific needs to change. Except, if we wanted to spec tail-adoption-optimisation, probably everything would need to change (promise capabilities getting resolved lazily etc).

@domenic
Copy link
Member

domenic commented Apr 15, 2015

I don't think anything specific needs to change. Except, if we wanted to spec tail-adoption-optimisation, probably everything would need to change (promise capabilities getting resolved lazily etc).

That basically gives me no new information. Do we need to spec tail-adoption-optimization to avoid the memory leak? If so what are the observable consequences---just ordering changes?

@bergus
Copy link

bergus commented Apr 15, 2015

Do we need to spec tail-adoption-optimization to avoid the memory leak?

Need? Probably not. I think the spec does already allow optimised implementations. Do we want to specify it? Maybe, I can't tell. That's a design decision, just as optimised tail recursion was.

If so what are the observable consequences---just ordering changes?

Yes, if ordering was changed this could simplify a few things. I'm not sure whether it is required at all to avoid the memory leak, but I guess it would need some very clever implementation strategies to avoid the memory leak and guarantee the ordering currently specced by the PromiseJobs.

If this was changed, a possibly different ordering of callbacks would be the only observable difference (and your app not breaking, of course), the rest of the user-visible methods would behave the same.

If someone subclassed promises however, this might break the optimisation. Not sure whether we can have both.

@domenic
Copy link
Member

domenic commented Apr 15, 2015

OK, so my summary of where we are now with regard to ES6 is that people think it might be possible to maintain the current observable semantics with no spec changes and also avoid memory leaks, but they are unsure, and think that it would probably be easier to avoid memory leaks if we changed the ordering.

@ForbesLindesay
Copy link
Member

That seems like a reasonable summary. At the moment, A+ doesn't even specify this particular ordering. I'm not sure about ES6 though.

I do think we should find a way to specify that this optimisation must be performed, because code that assumes it's implemented will break in really hard to debug ways if the implementation doesn't support it.

@bergus
Copy link

bergus commented Apr 16, 2015

@domenic OK I reviewed this and found something that will need to change semantically (besides possibly the ordering)

∀x: Promise.resolve(x) ≡ Promise.resolve(Promise.resolve(x))

see https://mail.mozilla.org/pipermail/es-discuss/2015-April/042517.html

@domenic
Copy link
Member

domenic commented Apr 17, 2015

I think the following example is also symptomatic of the problem, without any recursion involved:

const x0 = new Promise(r => global.resolveX0 = r);

const x1 = Promise.resolve(x0);
const x2 = Promise.resolve(x1);

The reference chain here, even after GC, becomes global.resolveX0 -> x0 -> x1 -> x2. Wheres we would like x1 to be GC-able, so that we have global.resolveX0 -> x0 -> x2.

This occurs because x2 = Promise.resolve(x1) translates to x1.then(resolveX2, rejectX2) (so x1 -> x2) and const x1 = Promise.resolve(x0) translates to x0.then(resolveX1, rejectX1) (so x0 -> x1).

Obviously you can add arbitrary intermediate levels to the chain.

@domenic
Copy link
Member

domenic commented Apr 18, 2015

@ForbesLindesay I do not understand (probably because I am missing something obvious) how your solution in tc39/proposal-cancelable-promises#1 will work in the following case:

const x0 = new Promise(r => global.resolveX0 = r);

const x1 = Promise.resolve(x0);
const x2 = Promise.resolve(x1);

x2.then(v => console.log(v));

If I read your solution correctly, the reference graph set up here is

global.resolveX0 -> x0
x1 -> x0
x2 -> x0

However after a GC run, all non-rooted chains will go away, resulting in

global.resolveX0 -> x0

i.e. the handler will never be executed.

But this must be wrong, because otherwise we have created a way of observing GC behavior in JavaScript, which is not otherwise possible. So I am missing something obvious. Please help :)

@domenic
Copy link
Member

domenic commented Apr 18, 2015

Oh, I see, I missed https://github.com/domenic/cancelable-promise/pull/1/files#diff-6230eb7fbf1dd2186bf1be6042c27f55R389, which essentially forwards the call x2.then(v => console.log(v)) to x0. So the graph is really

global.resolveX0 -> x0 -> (v => console.log(v))
x1 -> x0
x2 -> x0

domenic added a commit to tc39/proposal-cancelable-promises that referenced this issue Apr 20, 2015
This version accomodates overwritten .then's, but breaks a few Promises/A+ tests.

It strategy for the scenario at promises-aplus/promises-spec#179 (comment) is:

- Add a [[PromiseFollowing]] slot to all promises
- Resolving x1 to x0 notices that x0 is a promise without any value for its [[PromiseFollowing]] internal slot, so:
  - Do the normal thing, i.e. `x0.then(resolveX1, rejectX1)`
  - Save x1@[[PromiseFollowing]] = x0
- Resolving x2 to x1 notices that x1 is a promise with a value for its [[PromiseFollowing]] internal slot, so:
  - Follow the chain of [[PromiseFollowing]] to get a "real" resolution value of x0
  - Now do as before, i.e. `x0.then(resolveX2, rejectX2)` and x2@[[PromiseFollowing]] = x0.

Thus the reference graph is ... fuck, it's just x0 -> x1, x2 so x1 stays in memory.

This breaks some Promises/A+ tests which are coded in terms of a given thenable only having its `then` property accessed once.
@dr-js
Copy link

dr-js commented Mar 16, 2020

The issue title is a bit misleading, the tail-recursive code used for promise chain setup should be the problem.

I've done some test and the full comment with node test code is at: nodejs/node#6673 (comment). And here's some quote:

From what I tested, the pattern for promise memory leaking must include a tail-recursive setup, the promise creating function always recursively appear in itself's promise.then code.
The recursive code cause the promise implementation need to keep extra reference/stack info so the GC can not claim those already-resolved promise from the chain.

If we build a long/infinite promise chain gradually without the recursive code pattern, and only keep reference to the tail of the chain, the GC can took those un-referenced promise from the chain, whether the promise is pending or not.

So I think chaining promise up is not always unsafe, it depends on the actual code.
And if the promise version of while (true) { /* async code */ } is needed, better use a async function instead of the tail-recursive promise pattern.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests