StreamParser stalling other queries #319

Closed
luca-moser opened this issue Oct 4, 2015 · 11 comments
Labels
Action Required · known issue · Response needed

Comments

@luca-moser

luca-moser commented Oct 4, 2015

I have an issue with node-mssql: "node-mssql is not truly concurrent on queries".
Issue: tediousjs/node-mssql#217

Short description: multiple queries with streaming enabled do not run concurrently, because the tedious stream parser can block other streamed queries from being processed:

  StreamParser.prototype._transform = function (input, encoding, done) {
    var job, length, offset, result;
    offset = 0;
    // Append the incoming chunk to the internal buffer.
    this.buffer.append(input);
    if (!this.generator) {
      // Lazily create the parser generator on the first chunk.
      this.generator = this.parser();
      this.currentStep = this.generator.next();
    }
    job = void 0;
    result = void 0;
    length = void 0;
    // Synchronously execute parse jobs for as long as buffered data remains.
    while (!this.currentStep.done) {
      job = this.currentStep.value;
      if (!(job instanceof Job)) {
        return done(new Error('invalid job type'));
      }
      length = job.length;
      if (this.buffer.length - offset < length) {
        // Not enough data buffered for this job; wait for the next chunk.
        break;
      }
      result = job.execute(this.buffer, offset);
      offset += length;
      this.currentStep = this.generator.next(result);
    }
    // Discard the bytes that have been parsed.
    this.buffer.consume(offset);
    if (this.currentStep.done) {
      // The parser finished; signal end of the readable side.
      this.push(null);
    }
    return done();
  };

I am debugging my program with Visual Studio Code, and it looks like the while loop inside the stream parser blocks my other queries from being processed or returning any rows.
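
For illustration, here is a rough sketch of the behaviour I'm seeing (the connection config, table names and queries below are just placeholders, not my actual code): two streamed queries are started back to back, but the row callbacks of the second one only start firing after the first result set has been fully parsed.

  var sql = require('mssql');
  var config = { /* server, user, password, database */ };

  sql.connect(config).then(function () {
    // First query: a large result set, streamed row by row.
    var big = new sql.Request();
    big.stream = true;
    big.query('SELECT * FROM big_table');          // placeholder table
    big.on('row', function () { console.log('row from query 1'); });
    big.on('done', function () { console.log('query 1 done'); });

    // Second query: a small result set, started immediately afterwards.
    var small = new sql.Request();
    small.stream = true;
    small.query('SELECT * FROM small_table');      // placeholder table
    small.on('row', function () { console.log('row from query 2'); });
    small.on('done', function () { console.log('query 2 done'); });

    // Expected: output from query 1 and query 2 interleaved.
    // Observed: all of query 1 first, then all of query 2.
  });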

Can you give me any insight into how the StreamParser is handled when multiple concurrent connections are open?

@arthurschreiber
@patriksimek

@arthurschreiber
Collaborator

Hey @XemsDoom

I've read your issue description, and I'm wondering whether your issue is with the way tedious works or with Node.js in general. Node.js is a single-threaded, event-based environment, so any sort of CPU-bound processing always blocks other CPU processing operations.

Regarding the stream parsing code in tedious: the parsing of incoming data is "asynchronously synchronous". This means that data that is (asynchronously) pushed into the stream parser is parsed synchronously. So as long as data is available on the stream, it will be parsed, which is a blocking, synchronous operation.

I've been playing around with making the parsing "asynchronously asynchronous", but existing internals of tedious depend on the current behaviour and would need to be changed.
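
Roughly, the idea looks something like this - just an illustrative sketch, not the actual tedious code (drainJobs, the job functions and the batch size of 32 are all made up for the example): process a batch of parse jobs per event loop turn, then yield with setImmediate so callbacks from other connections get a chance to run in between.

  // Illustrative sketch only, not tedious internals.
  function drainJobs(jobs, onDone) {
    var index = 0;

    function step() {
      var batchEnd = Math.min(index + 32, jobs.length);
      while (index < batchEnd) {
        jobs[index]();   // execute one parse job synchronously
        index += 1;
      }
      if (index < jobs.length) {
        // Yield back to the event loop before parsing the next batch.
        setImmediate(step);
      } else {
        onDone();
      }
    }

    step();
  }

Picking the batch size is a trade-off between parsing throughput and responsiveness to other connections.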

You can try using an older version of tedious (e.g. 1.11.5) and see if the issue already exists there. Fixing this is definitely on my list, it's just not a very high priority right now.

@luca-moser
Author

Good evening @arthurschreiber. I know how Node.js works and that CPU-intensive JS code will block any other processing, since everything runs in a single thread.

In my opinion there's a bug in tedious or a wrong design decision, because I'm pretty sure it is not supposed to function as it does now. Multiple streamed queries must not be processed sequentially. By "sequentially" I mean each query being processed only after the previous one has finished.

If one query has a lot of rows, let's say 100k, and a second query has 50k rows, the stream parser should process the queries intertwined as rows are fetched from the database server.

Why? Because if different web requests want to retrieve data from two different tables, I don't want one user to be kept waiting because of other users' database queries. With this design we end up with the same dilemma as in traditional blocking web servers.
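
As a sketch of that scenario (again just placeholder code - the routes, tables and config are made up): two kinds of requests each stream a different table over the same pool, and with the current behaviour a request for the small table sits waiting while another user's large result set is being parsed.

  var http = require('http');
  var sql = require('mssql');
  var config = { /* server, user, password, database */ };

  sql.connect(config).then(function () {
    http.createServer(function (req, res) {
      var request = new sql.Request();
      request.stream = true;

      // '/big' simulates one user's heavy report, anything else a small lookup.
      request.query(req.url === '/big'
        ? 'SELECT * FROM big_table'      // placeholder table
        : 'SELECT * FROM small_table');  // placeholder table

      request.on('row', function (row) { res.write(JSON.stringify(row) + '\n'); });
      request.on('done', function () { res.end(); });
      request.on('error', function () { res.statusCode = 500; res.end(); });
    }).listen(3000);
  });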


Let me try to explain it in other words, maybe it is easier to understand :)

Tedious only processes one query at any given point in time, instead of processing all executed queries concurrently in the stream parser. Right now the other queries are not processed until the first query has made it all the way through the stream parser, so the processing of the remaining queries is blocked and they are effectively handled sequentially.

So if you execute 3 queries and have 3 callbacks for when rows are fetched, the 2 other callbacks are only invoked once the first query is done, even though with streaming the different queries should really be processed together. By "together" I mean that all 3 callbacks are each invoked in an obviously unordered sequence.

For example:
Query 1, Query 2 and Query 3 are executed concurrently. Each query runs on its own connection from the mssql connection pool.

Now the rows of all 3 queries should actually come back at the same time:
row-query1, row-query1, row-query3, row-query2, row-query3, row-query2, row-query2....
Without any determinism, of course, since we cannot predict in which order the results come back.

But tedious actually runs the whole thing like this:
row-query1, row-query1, row-query1.....until it is finished....then row-query2, row-query2, row-query2....until that one is finished....then row-query3, row-query3, row-query3....

That should not be the case. This has nothing to do with Node.js being single-threaded; it is because the parser does not process different queries concurrently.

@arthurschreiber
Collaborator

Hey @XemsDoom

Thanks for the thorough explanation and the detailed example. 😄

As explained above, this is a known but undocumented limitation in tedious right now, and I'm planning to work on fixing it after I have fixed the unfortunate performance issues I introduced with the 1.12 release line and the switch to generators.

I have a local branch that changes the parsing to be intertwined, but that breaks other tedious internals that currently rely on events being fired in a specific order within one event loop turn. All of this is fixable, but it requires time and thorough testing, as I don't want to make the same mistake I did with the 1.12 release.

I'd tell you to feel free to try fixing this yourself, but the upcoming changes to improve performance will change the token parsing code a lot, and I don't want you to waste your time. 😥

@luca-moser
Author

Thanks Arthur. It looks like you are already aware of this problem, and I'm thankful that you are working on it. Can we please leave this issue open, so that I can see when the problem gets fixed? If I can help in any way, just tell me - I would be glad to do so.

Have a nice evening.

@Shockolate

@arthurschreiber Is any work being done on this? It has been a year and a half since the last reported "working on it".

@Mika83AC

I'd also like to know if there is any progress on this topic, because this issue is a major blocker for writing scalable business applications that work with large data.

If the whole app can't accept any new requests because query() blocks it all, this leads to a really bad user experience.

I'd really appreciate any progress here.

Regards,
Michael

@Blakey

Blakey commented Sep 18, 2017

Hi @arthurschreiber,

I am encountering the same (I think?) issue where a stream will block the entire Node.js app until the stream is complete, however I am struggling to locate the affected block of code as quoted earlier by @luca-moser. Could you please point me in the right direction, as I would like to see if I can fix it for my particular use case? You mentioned that you have a local branch where you were testing an async-async solution - would you be able to share this and the known issues it causes? Thanks!

@spimou

spimou commented Dec 17, 2019

Hello. I am having trouble too. I use streaming to get large amounts of data and send it to the front-end with WebSockets. All works well, but tedious is still blocking while streaming. Is there any news? Can we use it in a non-blocking way?

Thanks

@IanChokS added the Action Required, known issue and Response needed labels on Dec 18, 2019
@bluefire2121

Five and a half years later and the asynchronous synchronous behavior of streams is still not fixed. :-(

@arthurschreiber
Collaborator

@bluefire2121 I think #1240 will fix this. Five and a half years later. 😬

@arthurschreiber
Collaborator

We just released tedious@11.0.7 to the tedious@next release channel. We'd be happy to hear back from you if this fixes the issue you're describing here.

If no issues come up, this will be promoted to the tedious@latest release channel automatically in a week.
