
Emits finish despite still consuming #4

Open

piglovesyou opened this issue Mar 5, 2018 · 5 comments · May be fixed by #6

Comments

@piglovesyou

The following code is a copy of the example in the README, plus a 'finish' event handler. I assumed the parallel transform would emit 'finish' after all data had been consumed, but it didn't.

var transform = require('parallel-transform');

var stream = transform(10, function (data, callback) {
  setTimeout(function () {
    callback(null, data);
  }, Math.random() * 1000);
});

for (var i = 0; i < 10; i++) {
  stream.write('' + i);
}
stream.end();

stream.on('finish', function () {
  console.log('stream has finished');
});
stream.on('end', function () {
  console.log('stream has ended');
});

Output:

0
stream has finished
1
2
3
4
5
6
7
8
9
stream has ended

This is problematic, especially when using libraries such as pump that invoke their callback based on the 'finish' event.

It seems that 'finish' is emitted after all writable data has been buffered, rather than after it has been consumed.

The same problem also occurs in the parallel-stream module.

Also, when we do readable.pipe(parallelTransform).pipe(writable), the writable emits 'finish' earlier than expected.

@moander

moander commented May 7, 2018

Any updates on this issue?

@Tapppi

Tapppi commented Jul 18, 2018

Also, when we do readable.pipe(parallelTransform).pipe(writable), the writable emits finish earlier than expected.

You should file a bug report on whatever you are building your writable on (unless it is this module, in which case it is obviously the same bug).



Misfiring 'finish'

EDIT: I mistakenly claimed 'finish' should fire before processing is done, and this is definitely a bug which only happens when concurrency > 2.

Unless you need to do something while _flush is still possibly processing, the 'end' event is a drop-in replacement. Regardless, you should probably be using pump, pumpify or end-of-stream anyway rather than trying to handle events.

Anatomy of a Transform stream

-----------------------------------------------------------------------------
  [write() -> Writable stream] -> transform() -> [Readable stream -> read()]
-----------------------------------------------------------------------------

'finish' and 'end' events

From Node docs:

The 'finish' and 'end' events are from the stream.Writable and stream.Readable classes, respectively. The 'finish' event is emitted after stream.end() is called and all chunks have been processed by stream._transform(). The 'end' event is emitted after all data has been output, which occurs after the callback in transform._flush() has been called.

For all relevant parallel transform libs that part should be:

  • finish means that the Writable side of the Transform stream has been drained and closed, i.e. write() does not accept new data and all data has been passed to the transform function so that the "write" buffer is empty
  • end means that the Readable side of the Transform stream has been drained and closed, i.e. all work (including flush function) has been completed and pushed out so no more data will be added to the Readable output buffer

pump etc. correctly wait for both sides of the stream to close, and work correctly with this package. See eos for the utility package pump etc. usually use for detecting end of streams.

Here is a more verbose version of your test script, along with its output annotated to explain why each event fired when it did.

Test script with more logging

Annotated output
[stream] start 0
[stream] start 1
[stream] start 2
[stream] start 3
[stream] start 4
[stream] start 5
[stream] start 6
[stream] start 7
[stream] start 8
[stream] start 9
// Max concurrency tasks are processing
[stream] stop  2
[stream] stop  4
[stream] stop  8
[stream] stop  5
[stream] stop  9
[stream] stop  3
[stream] stop  7
[stream] stop  0
[logger] start 0
[stream] start 10
// Last item has been given to transform function, 'finish' event can now fire
[logger] stop  0
[stream] stop  1
[logger] start 1 
// The 'finish' event fires after work that was already queued in Node
- [stream] has finished
// !! The event should not have fired since things are still processing
// 'finish': Writable side of the Transform stream has been drained and its buffer is empty
[stream] stop  6
[logger] stop  1
[logger] start 2
[logger] stop  2
[logger] start 3
[logger] stop  3
[logger] start 4
[stream] stop  10
// All events have now been processed by the transform stream
// 'finish' should fire here, not above
- [stream] has ended
// 'end': Processing and Readable buffer have been drained and flush has completed
[logger] stop  4
[logger] start 5
[logger] stop  5
[logger] start 6
[logger] stop  6
[logger] start 7
[logger] stop  7
[logger] start 8
[logger] stop  8
[logger] start 9
[logger] stop  9
[logger] start 10
[logger] stop  10
- [logger] - finished
// 'finish': Writable side of the Transform stream has been drained and its buffer is empty

@Tapppi Tapppi linked a pull request Jul 19, 2018 that will close this issue
@piglovesyou
Author

@Tapppi I really appreciate your detailed feedback. @mafintosh I would like you to take a look at this issue; fixing this core bug would make the package even better.

piglovesyou added a commit to piglovesyou/pipeline-pipe that referenced this issue Sep 29, 2019
…i-flush-to-final

I respectfully merged Tapppi/parallel-transform/tree/flush-to-final

Refs. mafintosh#4
Refs. mafintosh#6

# Conflicts:
#	index.js
#	package.json
@Eli-Goldberg

This is a big problem, since the Node community encourages using 'pipeline' instead of 'pipe', and 'pipeline' resolves as soon as 'finish' is fired.

@piglovesyou
Author

@Eli-Goldberg Hi. pipeline-pipe, a forked version of parallel-transform with the fix, might help you. I'd be happy to get your feedback.
