
Long Polling only happening every 25 seconds? #157
Open · smtlaissezfaire opened this issue Jan 2, 2018 · 22 comments

@smtlaissezfaire

I'm seeing long-polling events getting chunked, and the JS client only receives all of the messages on the channel every 25 seconds.

If, on the other hand, I run something like this in a console:

MessageBus.subscribe('/my_channel') do |msg|
  puts msg
end

I see the results immediately.

I couldn't find any config variables that were set to 25 seconds. Any idea why this would be the case?

I tried setting

MessageBus.enableLongPolling = false;
MessageBus.callbackInterval = 1500;

after which I start to get messages "immediately" (every second and a half).

Also, I tried setting MessageBus.alwaysLongPoll = true; and MessageBus.enableChunkedEncoding = false; without any change in behavior.

I can confirm that this happens in both Safari and Chrome on OS X (haven't tried Firefox yet).

I'm integrating with Sinatra, calling use MessageBus::Rack::Middleware in my config.ru, and running thin v1.7.2 with message_bus gem version 2.1.1.
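
For reference, my config.ru is roughly the following (a simplified sketch, not the actual app; MyApp and the /publish route are placeholders, and the in-memory backend is just to keep the example self-contained):

# config.ru -- simplified sketch, not the actual app
require 'sinatra/base'
require 'message_bus'

# Assumption: the in-memory backend avoids needing Redis for this sketch;
# the real app may rely on the default (Redis) backend instead.
MessageBus.configure(backend: :memory)

class MyApp < Sinatra::Base
  post '/publish' do
    # Anything published here should reach the JS client via long polling.
    MessageBus.publish('/my_channel', params[:msg] || 'hello')
    'ok'
  end
end

use MessageBus::Rack::Middleware
run MyApp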

@SamSaffron
Member

What web server are you running?

@smtlaissezfaire
Author

I was using thin, but I'm open to switching. This is an app that's just for personal consumption.

@SamSaffron
Member

SamSaffron commented Jan 3, 2018 via email

@pierreozoux

@SamSaffron you'll not like it, but well, I have to try ;)

So, as you know, I'm running my own Docker image for Discourse, not the official tool.
By starting Discourse with rails s I'm starting puma.

As a user in the admin interface, I see these 25s long polls.
When I change, say, the title of the Discourse instance and then reload the page, there is no change, even though the PUT returns 200.
But if I wait for this long poll to happen, then, yes, the change is persisted.

I think it is the same bug, but I might be wrong in my analysis.
If it is the same bug, what do you recommend I do?

Thanks a lot for your help!

@SamSaffron
Member

Quick question ... what browser are you using? Does the same issue happen in other browsers?

@unteem

unteem commented Dec 7, 2018

@SamSaffron I'm taking over from @pierreozoux. This happens independently of the browser (tested on Firefox and Chromium).

So each time I make a change in the admin interface I need to wait at least 25 seconds, and even if I wait, the result can be a bit random.

For instance, if I reload without cache, I'm back to the initial setting; reload again and I see my change; reload again and I'm back to the initial setting.

We are also experiencing strange behavior when we change the logo. Sometimes it's the right logo, sometimes it's the default Discourse logo. I don't know if that is related.

As Pierre mentioned, we are using our own Docker image.

Thanks for your help

@SamSaffron
Member

SamSaffron commented Dec 8, 2018 via email

@unteem

unteem commented Dec 19, 2018

@SamSaffron nginx buffering is off. I thought it could come from the haproxy that sits in front of our nginx, but actually the issue is fixed if I run Discourse with unicorn instead of puma.

@SamSaffron
Member

SamSaffron commented Dec 19, 2018 via email

@pierreozoux

pierreozoux commented Dec 19, 2018

For us the issue is closed and we won't investigate further; I think @unteem spent 2 full days on that :) I guess we should open an issue on puma, but we don't have the resources :/

I'd be curious to know why Discourse is using unicorn and not puma :) (maybe for this kind of reason)

And thanks a lot for your kind support!

@SamSaffron
Member

@pierreozoux I know for sure that both @schneems and @evanphx care dearly about making puma as robust as possible, and they definitely want the hijack implementation not to stall; this is critical for web sockets and other use cases.

If @unteem has any kind of repro here it would be very handy and save others multiple days of debugging.

As to why Discourse uses unicorn and not puma in clustered mode: I guess we very much like the fact that rogue requests take out just a single worker and a single request rather than potentially a large number of unrelated requests, plus the automatic shielding against rogue gems that don't release the GIL in C extensions is nice. Unicorn has treated us nicely, but yeah, memory is a challenge. However, rack hijack having issues was never a factor in our decision to use unicorn vs puma.

@evanphx

evanphx commented Dec 19, 2018

The puma hijacking implementation explicitly performs no buffering so I'm unsure why you'd see a 25s delay, but I don't know much of anything about the message_bus code. I'm happy to look at any specific usage of the hijacking to try and see if there could be an issue though.

@SamSaffron
Member

@evanphx

We just write directly to the socket as stuff happens, using chunked encoding:

https://github.com/SamSaffron/message_bus/blob/master/lib/message_bus/client.rb#L226-L257
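
In outline, the pattern is roughly this (an illustrative sketch of rack hijack plus chunked writes, not the actual client.rb code):

# Rough sketch of the hijack + chunked-encoding pattern (illustrative only).
def serve_long_poll(env)
  env['rack.hijack'].call
  io = env['rack.hijack_io']            # we now own the raw socket

  io.write("HTTP/1.1 200 OK\r\n" \
           "Content-Type: text/plain; charset=utf-8\r\n" \
           "Transfer-Encoding: chunked\r\n" \
           "Connection: close\r\n\r\n")

  write_chunk(io, "first payload")      # each publish writes another chunk
  write_chunk(io, "second payload")

  io.write("0\r\n\r\n")                 # terminating zero-length chunk
  io.close
end

def write_chunk(io, data)
  # chunk = hex length, CRLF, payload, CRLF
  io.write(data.bytesize.to_s(16) + "\r\n" + data + "\r\n")
end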

Not sure this is a specific bug in puma, though, until we make some sort of test harness that works in unicorn/passenger and fails in puma, which I was hoping @unteem could help with.

@evanphx

evanphx commented Dec 19, 2018

That all looks just fine. Those write calls hit the socket directly and the data is sent without buffering. Probably the best bet is to use tcpdump to try to verify that the data is being sent back to the client properly though.

@pierreozoux

@evanphx send us your public SSH key by mail or here, and I'll give you access to a VM with a reproducible test based on Discourse.
Contact@indie.host

@kevin-klein

I had similar issues with puma. When I disabled clustered mode, it suddenly started working.

@pierreozoux

Sorry, forgot to comment back; we solved it by going back to unicorn.

@chriscz

chriscz commented Mar 19, 2019

Sorry, forgot to comment back; we solved it by going back to unicorn.

So using unicorn instead of puma?

@tobymao

tobymao commented May 9, 2020

I'm hitting this issue as well on puma. Any more info on this?

For me, I was running puma with the defaults, 0:16, and over time notifications would go from instant to 25 seconds. I've since switched to unicorn as well.

@tobymao

tobymao commented Jun 1, 2020

An update: I'm hitting this on unicorn too, again 25-second delays on some workers. If I had to guess, it's just due to some kind of leak, perhaps threading related. To counteract this, I'm going to use unicorn-worker-killer.
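
The setup I'm planning is roughly this (a sketch following the unicorn-worker-killer README, assuming a Rails-style config.ru; the thresholds are just the README's example values):

# config.ru -- sketch; thresholds are unicorn-worker-killer's README examples
require 'unicorn/worker_killer'

# Restart a worker after it has served between 3072 and 4096 requests.
use Unicorn::WorkerKiller::MaxRequests, 3072, 4096

# Restart a worker once its memory (RSS) grows past roughly 192MB-256MB.
use Unicorn::WorkerKiller::Oom, (192 * (1024**2)), (256 * (1024**2))

require ::File.expand_path('../config/environment', __FILE__)
run Rails.application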

@anthotsang

I'm wondering if people with this issue are not calling MessageBus.after_fork as mentioned here?

I'm running Puma in clustered mode and I was having this issue until I added the aforementioned code to my puma configuration. Removing the above code also reliably causes delays.

Upon re-reading the instructions, they do mention that this can solve non-immediate delivery for forking/threading app servers, but it seems like it should be recommended as standard configuration rather than as optional. I probably overlooked it initially because of the way it was written.
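
For reference, the relevant part of my puma config is roughly this (a sketch; the worker and thread counts are just illustrative):

# config/puma.rb -- relevant part only; worker/thread counts are illustrative
workers 4
threads 1, 16

on_worker_boot do
  # Re-establish message_bus connections after the master forks each worker,
  # so published messages reach long-polling clients immediately.
  MessageBus.after_fork
end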

@tobymao

tobymao commented Jun 3, 2021

This was not the case for me.
