pads silently disconnect when browser suspends a background tab #3830

Open
ryanpitts opened this issue Apr 3, 2020 · 29 comments

@ryanpitts

My team uses etherpad-lite for a lot of internal planning work and to document open community meetings, so we often have multiple pads open in different tabs for extended periods of time. Many times when a pad loses connection (because someone shut their laptop and their machine went to sleep, or because the browser offloads tabs that have been in the background for a while), the pad will show a "disconnected" prompt. But not always.

We regularly experience silent disconnections—if you happen to notice and reload, the pad reconnects just fine. But if you return to a tab and don't notice that it disconnected, changes that you make aren't stored and it's easy to lose a lot of work. We're totally aware of why a socket might lose connection, but it's unclear why sometimes there's a warning and sometimes there's not.

Is this a known issue? Are we strange in the way we use etherpads? I feel a little awkward about filing this as a bug, but I did a ton of searching and I can't find much description of this problem anywhere.

@JohnMcLear
Member

No, you are using Etherpad right and this is not a known issue. You should have a disconnected msg and an overlay. Can you provide some exact steps to replicate?

Thanks for letting us know.

This is the connection status logic: https://github.com/ether/etherpad-lite/blob/develop/src/static/js/pad_connectionstatus.js

And the auto reconnect logic: https://github.com/ether/etherpad-lite/blob/develop/src/static/js/pad_automatic_reconnect.js

@joassouza
Contributor

joassouza commented Apr 7, 2020

We had the same problem with a user who had an unstable internet connection. This bug is quite difficult to replicate and really serious. We've tried to simulate an unstable connection without success.
We are using Etherpad 1.7.5.

@JohnMcLear
Member

Looking into this now.

@JohnMcLear JohnMcLear self-assigned this Apr 14, 2020
@JohnMcLear
Member

JohnMcLear commented Apr 14, 2020

Wait.. In settings.json


  /*
   * Time (in seconds) to automatically reconnect pad when a "Force reconnect"
   * message is shown to user.
   *
   * Set to 0 to disable automatic reconnection.
   */
  "automaticReconnectionTimeout": 0,

This is the default, what's the logic here? Surely that's not a good default setting?! :D cc @muxator any idea what the deal is here?
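
For readers unfamiliar with the setting, here is a minimal sketch of the semantics the settings.json comment describes: the value is in seconds, and 0 turns automatic reconnection off. The function name `planReconnect` and the shape of the returned object are hypothetical and are not taken from Etherpad's code.

```javascript
// Hypothetical helper: decide what happens when the "Force reconnect" message
// is shown. Mirrors the settings.json comment: 0 disables automatic reconnection.
function planReconnect(automaticReconnectionTimeout) {
  const seconds = Number(automaticReconnectionTimeout);
  if (!Number.isFinite(seconds) || seconds <= 0) {
    // Default (0): nothing happens until the user reconnects manually.
    return { automatic: false, delayMs: null };
  }
  // Any positive value: reconnect automatically after that many seconds.
  return { automatic: true, delayMs: seconds * 1000 };
}
```

With the default of 0, a pad left in the "Force reconnect" state stays there until someone clicks, which is the behavior being questioned here.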

My method for initial testing is..

  1. connect to pad. Type stuff.
  2. minimize browser.
  3. killall node (makes server disappear)
  4. wait a few mins (I'd love to know what the timeout is here)
  5. tab back to browser, see if I have any msg.

What I expected to happen:

  1. Disconnected msg would be visible.

What does happen.

  1. Disconnected msg is visible.

Another approach..

  1. connect to pad. Type stuff.
  2. minimize browser.
  3. put laptop to sleep
  4. wake laptop up from sleep
  5. tab back to browser, see if I have any msg.

What I expect to happen:
Go back to browser, see if it will reconnect

What happens:
Try to type and it disconnects user!

Next I'm going to test waiting longer for reconnect..

@luixxiul

  "automaticReconnectionTimeout": 0,

It was introduced here: 009cd31#diff-8ab11a170627f11a32a1d642d7114743R126

@JohnMcLear
Member

Good news. I can replicate.

  1. connect to pad. Type stuff.
  2. minimize browser. (probably not required)
  3. put laptop to sleep
  4. wake laptop up from sleep
  5. tab back to browser, see if I have any msg.

What I expect to happen:
Go back to browser, see if it will reconnect after x seconds.

What happens:
It claims to be reconnected but typing anything sends bad changeset.

https://www.youtube.com/watch?v=COyju-u9Sek

I think I might be able to reduce test down to just making network disappear using network tools.

@JohnMcLear
Member

  "automaticReconnectionTimeout": 0,

It was introduced here: 009cd31#diff-8ab11a170627f11a32a1d642d7114743R126

Okay thanks, and note that @lpagliari is on other projects now so won't be able to fix this. I don't think she "caused" the bug but I find that setting just weird.. It feels like it should be on by default...

@luixxiul

I don't think she "caused" the bug but I find that setting just weird.

Me too, no offense really :-)

@ryanpitts
Author

omg, I'm so sorry, I had this thread open in a tab and a reply typed out and never hit submit!

Quite difficult to replicate is my experience too. I know this kind of sounds ridiculous, but the closest I can get has been something like:

  1. open an etherpad in a tab
  2. leaving etherpad as active tab, close my laptop lid and wait a few seconds for machine to fall asleep and lose network connection
  3. open laptop and revisit tab with etherpad

I generally see the "reconnecting" message in that tab, and sometimes it does, in fact, reestablish connection. But not always, even though the "reconnecting" message disappears. When that happens, there's no visual hint that what you're typing isn't being saved.

A couple things I've wondered about:

  • if my laptop takes a little too long to reestablish a network connection, is it possible that etherpad gives up trying to reconnect to the server in the meantime?
  • I'm not sure but I feel like clicking in the etherpad might dismiss the "reconnecting" message ... maybe that cancels it? Or at least dismisses so you'd never see a followup error message

So, maybe my attempt at replicating wasn't as ridiculous as I thought! I'm so glad we aren't the only ones who've seen this. Thank you so much for looking into it.

@JohnMcLear
Member

Sorry I have kids dumped back on me. I have to jump off this now.

@JohnMcLear
Member

one hand hax.

  1. connect to pad
  2. use clumsy to drop all packets
  3. type something
  4. disable clumsy
  5. after reconnect, try to type again
    changeset error

@JohnMcLear
Member

Okay cool I have a patch in that solves a huge chunk of the problem.

  • Doesn't allow edits during reconnecting state.
  • Gets proper rev # of doc prior to submitting next Changeset.

If I can't solve item #2 I can always use the internal reload method to load the pad back into the page.
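
The two points in that list can be sketched roughly as follows. This is not the actual patch, only an illustration of the idea: refuse edits while the client is reconnecting, and refresh the base revision before the next changeset goes out. The class `PadSession`, its method names, and the `fetchHeadRevision` callback are all hypothetical.

```javascript
// Hypothetical client-side model; names are illustrative, not Etherpad's API.
class PadSession {
  constructor(fetchHeadRevision) {
    // fetchHeadRevision() is assumed to return the server's latest revision number.
    this.fetchHeadRevision = fetchHeadRevision;
    this.baseRev = fetchHeadRevision();
    this.state = 'CONNECTED';
    this.sent = [];
  }

  onDisconnect() {
    this.state = 'RECONNECTING';
  }

  onReconnect() {
    // Refresh the revision before any further changeset is submitted.
    this.baseRev = this.fetchHeadRevision();
    this.state = 'CONNECTED';
  }

  submit(changeset) {
    if (this.state !== 'CONNECTED') {
      // Refuse the edit visibly instead of silently dropping it.
      return { accepted: false, reason: this.state };
    }
    this.sent.push({ baseRev: this.baseRev, changeset });
    return { accepted: true, baseRev: this.baseRev };
  }
}
```

The point of returning an explicit rejection is that the UI can then block typing or show a warning, rather than letting the user believe the edit was stored.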

This should land tonight, dinner time now.

@JohnMcLear
Member

JohnMcLear commented Apr 14, 2020

Okay I only have time for a temp fix that reloads the pad upon reconnect, it's not ideal but it works.

It's possible it doesn't fire on stale. I tested disconnect states, unreliable states, sleep / awake states and it seems to behave appropriately.

It is a vastly better UX than what's currently in develop but really the pad contents should reload without an entire page refresh. At least an entire page refresh in Etherpad is cheap but still, it's hacky.

@ryanpitts
Author

This is just fantastic, thank you!!

Doesn't allow edits during reconnecting state.

Seriously, just that change right there will eliminate so much heartache. (Well, "so much" is maybe overselling it because the silent disconnect issue feels relatively rare. But when you lose a bunch of work because you thought you were connected, it does feel pretty bad.)

@stale

stale bot commented Jun 30, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jun 30, 2020
@ryanpitts
Author

would sure love to see this one land!

@stale stale bot removed the wontfix label Jul 1, 2020
@JohnMcLear
Member

See above, the fix was merged.

@ryanpitts
Author

Oh great, thank you! Looking forward to 1.8.5!

@rhansen
Member

rhansen commented Sep 26, 2020

@ryanpitts @joassouza We recently made some changes to improve handling of reconnects (see #4331). Please keep an eye out for regressions when you upgrade.

@ryanpitts
Author

Thanks! I saw that 1.8.6 landed, and I'm planning to upgrade either this week or next.

@rhansen
Member

rhansen commented Sep 29, 2020

@ryanpitts: The change that I'm referring to is not in v1.8.6; it will appear in the next release.

A heads up: There are a couple of bugs in v1.8.6 related to sessions. If you use the HTTP API then you will probably need to cherry-pick 3886e95 and 4332aff (or just check out the develop branch).

@dessalines

This one popped up again, and seems pretty critical. Using AUTOMATIC_RECONNECT_TIMEOUT only seems to work if that tab has continuous focus. Recreate steps:

  • Load an etherpad
  • Do other things on your device for a few minutes, going to other tabs.
  • Make edits (no errors shown, everything seems fine, when in reality it silently disconnected)
  • Other clients can't see your edits. Refresh page, your edits aren't shown.

Version 2.0.2
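
The symptom in those steps is that edits look accepted locally but never reach the server. One way a client could surface this, sketched here under the assumption that the server acknowledges each edit, is to track unacknowledged edits and flag any that wait too long. The class `AckWatchdog` and its methods are hypothetical and do not come from Etherpad.

```javascript
// Hypothetical watchdog: flags edits the server has not acknowledged in time.
class AckWatchdog {
  constructor(timeoutMs, now = () => Date.now()) {
    this.timeoutMs = timeoutMs;
    this.now = now; // injectable clock, which keeps the logic easy to test
    this.pending = new Map(); // edit id -> time the edit was sent
  }

  sent(id) {
    this.pending.set(id, this.now());
  }

  acked(id) {
    this.pending.delete(id);
  }

  // Returns the ids of edits that have waited longer than timeoutMs.
  overdue() {
    const t = this.now();
    return [...this.pending]
      .filter(([, sentAt]) => t - sentAt > this.timeoutMs)
      .map(([id]) => id);
  }
}
```

A non-empty result from `overdue()` would be a reasonable trigger for showing the "disconnected" overlay even when the socket itself still reports a live connection.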

@SamTV12345 SamTV12345 reopened this Apr 26, 2024
@SamTV12345
Member

I discovered an error in the handling. During testing there was a ping timeout. I am now reconnecting the socket.io connection if this is the case.
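
The comment does not show the change itself, so the following is only a sketch of the idea. It assumes a socket.io-style client object that emits `disconnect` with a reason string and exposes `connect()`; `'ping timeout'` is one of the reason strings socket.io reports. The function name `reconnectOnPingTimeout` and the `onStatus` callback are hypothetical.

```javascript
// Sketch only: `socket` is assumed to expose on(event, handler) and connect(),
// as a socket.io client socket does. `onStatus` is a hypothetical UI callback.
function reconnectOnPingTimeout(socket, onStatus = () => {}) {
  socket.on('disconnect', (reason) => {
    // Always report the reason so a disconnect is never silent.
    onStatus(`disconnected: ${reason}`);
    if (reason === 'ping timeout') {
      // Explicitly start a new connection attempt.
      socket.connect();
    }
  });
}
```

Reporting every reason through `onStatus`, not just the ping timeout, keeps the other disconnect paths visible to the user too.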

@nanawel

nanawel commented May 25, 2024

Until recently I was using an old 1.8.17 and I never had this issue with it. Sometimes of course the connection went down but I had the popup message clearly visible so I knew I had to reload the tab.

But I've just upgraded my instance to 2.0.3 and although everything else works as expected, I encounter this faulty behavior several times a week now.

I have 2 machines with at least one pad always open on Chromium and sometimes, after leaving the tab open for hours in the background without typing anything, if I get back to it I can still write, no popup appears, but actually the connection went down and reloading the tab reveals that the latest words/lines have not been saved.

I'm using the Docker version, with a properly set up Nginx as SSL-offloader. I did not change the configuration that was working with Etherpad 1.8.

@SamTV12345
Member

Thanks for the answer. If the pad disconnects silently from the socket, can you check the developer console of your browser? I added a message for when this occurs, because so far we don't really know the reason. It should say something like Reason for disconnect is XY.

@dessalines

dessalines commented Jun 4, 2024

There's never a popup or reconnect of any kind, and the console only shows these messages:

Error connecting to pad Error: timeout at manager.js:137:25
Socket disconnected: transport error
https://etherpad.xxx/socket.io/?padId=xxx&EIO=4&transport=polling&t=O_ar8_g&sid=XB5ZmxfT0G2pcK4XAABC 400



@dessalines

dessalines commented Jun 4, 2024

edit: ignore my last comment about nginx misconfiguration. The issue is still 100% there.

@dessalines

I searched this codebase, and nowhere does it have document.hasFocus(), or even window.onfocus. Has etherpad never handled tab switches or window focus events correctly?
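
As an illustration of what such handling might look like, here is a minimal sketch that re-checks the connection whenever the tab becomes visible again, using the browser's `visibilitychange` event and `document.visibilityState`. It is not Etherpad code: the function name `watchTabVisibility` and the `showDisconnected` hook are hypothetical, and `socket` is assumed to expose a boolean `connected` and a `connect()` method, as socket.io client sockets do.

```javascript
// Sketch: `doc` stands in for the browser's `document`; `socket` is assumed to
// expose `connected` and connect(). `showDisconnected` is a hypothetical UI hook.
function watchTabVisibility(doc, socket, showDisconnected) {
  const check = () => {
    // Only act when the tab has just become visible to the user.
    if (doc.visibilityState !== 'visible') return;
    if (!socket.connected) {
      showDisconnected();
      socket.connect();
    }
  };
  doc.addEventListener('visibilitychange', check);
  return check;
}
```

In a browser this would be called as `watchTabVisibility(document, socket, showOverlay)`. A flag like `connected` may itself be stale after a tab has been suspended, so a real fix might also need an application-level check that the server still responds.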
