Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow down with lots of different contributions (~ @ 700 lines) #1763

Closed
JohnMcLear opened this issue Apr 27, 2013 · 32 comments
Closed

Slow down with lots of different contributions (~ @ 700 lines) #1763

JohnMcLear opened this issue Apr 27, 2013 · 32 comments

Comments

@JohnMcLear
Copy link
Member

https://bugzilla.wikimedia.org/show_bug.cgi?id=46637

Please attempt to replicate and profile.

@ghost ghost assigned JohnMcLear Apr 27, 2013
@rhelmer
Copy link
Contributor

rhelmer commented Apr 28, 2013

I don't see this in Firefox Nightly 23.0a1 (2013-04-27) or 19.0.2 but I do in Google Chrome Version 26.0.1410.65 using http://etherpad-lite.paas.allizom.org/p/kqOi9HF97Z (which is on develop branch c8e2278 right now) on Mac OS X 10.7.5 (2.53 GHz Intel Core i5). I can't get Firefox over 50% CPU but I can easily get the Google Chrome Renderer process to hit 100% (just random scrolling and typing in both cases)

It seems usable in Firefox on this machine, but I'd agree that it's too slow in Chrome to be usable.

@rhelmer
Copy link
Contributor

rhelmer commented Apr 28, 2013

I took a quick at the Chrome Profiler (don't have time to dig into it further just now) and the long pole looks to be in minified source, I think it's jquery - it's spending most if it's time in t.event.dispatch -> s.handle

Is there a quick way to run etherpad-lite locally against unminified jquery so I can see the real stack (not sure what I need to do to get source maps working for this right now, will look into that too)?

@rhelmer
Copy link
Contributor

rhelmer commented Apr 28, 2013

Ah nevermind, I see it in settings.json, will fire it up locally with minification turned off and see if something more useful pops up.

@rhelmer
Copy link
Contributor

rhelmer commented Apr 28, 2013

Here we go, elemData.handle (below) is taking up 32.83% of CPU time on my laptop:

setSelection
  updateBrowserSelectionFromRep
    inCallStack
      inCallStackIfNecessary
        handleKeyEvent
          jQuery.event.dispatch
            elemData.handle

Next down is scrollNodeVerticallyIntoView with 18.38%, then addRange with 6.58%. Everything below that is negligible (under 2%)

@JohnMcLear
Copy link
Member Author

Confirmed also slow on http://beta.etherpad.org/p/cA9CbQlmTY

@JohnMcLear
Copy link
Member Author

Problem exists in 1.2.0 from my tests so doing a bisect probably wont help ;\

@JohnMcLear
Copy link
Member Author

A test spec now exists for this bug. #1764

@JohnMcLear
Copy link
Member Author

This bug existed in the original release of Etherpad

@JohnMcLear
Copy link
Member Author

Commenting out https://github.com/ether/etherpad-lite/blob/develop/src/static/js/ace2_inner.js#L4501 appears to improve performance to the point where the test passes in Chrome and doesn't seem to have any initially noticeable negative side effects but I can't help but to think that something is wrong with that fix, just not sure exactly why we do removeAllRanges first.

Docs for removeAllRanges... https://developer.mozilla.org/en-US/docs/DOM/Selection/removeAllRanges

FF seems fine at handling large pads, Chrome and IE do struggle. IE runs out of memory for the test and is nearly unusable on a large pad.

I think this is going to need some serious investigation to see if we can get around whatever is blocking. I'm pretty much exhausted with my time/resources to throw at this now.

@JohnMcLear
Copy link
Member Author

So I looked to Google Docs for a suggested solution for this and it turns out Google Docs ability to handle long documents in a collaborative environment is even worst.. Users are often disconnected and large pastes often disconnect users ;\ I will try to get a more stable test written but I would suggest actually resolving this might take some sexy magic.

@sdague
Copy link

sdague commented May 2, 2013

This may very well actually be the issue I've seen for 25k+ edits, as those are largely long files. In all cases those pads were 700 - 1000 lines long as well.

@sdague
Copy link

sdague commented May 2, 2013

So I would call this - #1380 a dupe bug

@dlongley
Copy link
Contributor

This bug might be related to: https://bugs.webkit.org/show_bug.cgi?id=92594

That bug makes it sound like the way the editor is currently implemented may result in a page layout for every key press on webkit browsers.

@dirkcuys
Copy link

Does the pull request #1833 help with this issue?

@dlongley
Copy link
Contributor

@dirkcuys, it helps but does not solve this issue. It decreases the rate at which rapid successive keystrokes reduce responsiveness. This increases responsiveness by an order of magnitude in the common case. There are more notes in the PR.

@dirkcuys
Copy link

dirkcuys commented Aug 6, 2013

@dlongley thanks. I tried applying the changes in the pull request on a server running at http://pad.p2pu.org/, but I still get almost unusable performance using Chromium 28.0.1500.71 on Ubuntu 12.04.

Any suggestions? I'm considering downgrading to version 1.1.5? Is that advisable/possible?

@JohnMcLear
Copy link
Member Author

afaik the problem exists in etherpad since day 0.

@dlongley
Copy link
Contributor

dlongley commented Aug 7, 2013

@dirkcuys, downgrading isn't likely to help. If you have enough content in your pads it's going to be slow, period. The PR helps mitigate the problem a bit (the performance degradation rate is slower), but it's not the real fix. The problem can only be solved in one of two ways: a fix for webkit so that it doesn't layout the page so often or a change to the design of etherpad so that it doesn't use a separate DOM element for every character in the pad. Really, both of these fixes should happen at some point. The combination of redoing the page layout for every key press and having a ton of elements that need to be laid out is the killer.

A bug has been filed with webkit to change its behavior so that it does more lazy/consolidated updates, it's just unclear if anyone will get around to working on that any time soon. Here, with etherpad lite, we could take a similar lazy approach by only splitting up content into separate DOM elements when multiple people were editing the same area of a pad. We might see some real performance gains for most large pads in this case, but it is a more complicated design to get right.

@DanBeard
Copy link

DanBeard commented Aug 7, 2013

Is there a separate DOM element for each character? looking at a pad in chrome developer tools looks like there's just a div for each line, then a span for each author's contribution in each line. The only way i could see to cut down on DOM elements would be to combine lines with the same author (or stripped author-less lines) into the same div. The problem would still persist with highly collaborative documents until the author colors were cleared.

That may be why I can't easily reproduce the issue just by myself. I'm using ubuntu and Chromium Version 28.0.1500.71 Ubuntu 13.04 (28.0.1500.71-0ubuntu1.13.04.1), and pasted the whole text of Hamlet (4741 lines in total) into a pad with the PR and, while the lag was noticeable after that paste, it was still very usable.

Of course I was the only one in the pad and all the pasted text was from me, so it makes sense that the performance would be much worse in large documents with a large number of authors per line (Since there would be many more author color spans in the DOM). Could a faster fix be to soft-limit the number of different author color spans in a pad? There could be some periodic background task that could remove the oldest author color spans until we're under the limit.

@dlongley
Copy link
Contributor

dlongley commented Aug 7, 2013

Is there a separate DOM element for each character? looking at a pad in chrome developer tools looks like there's just a div for each line, then a span for each author's contribution in each line.

Yeah, sorry, it is per line. To give an overview of my understanding of what's going on (on webkit browsers):

  1. User presses a key while typing.
  2. Etherpad-lite (ace) adds a letter to the existing span.
  3. Etherpad-lite updates the window selection which causes webkit to layout the page.
  4. Etherpad-lite calculates changes.
  5. Etherpad-lite generates and inserts a new DOM line (div+span+etherpad-lite specific attribute stuff) that includes the just-typed letter and then removes the old one (triggering more layouts).

If you have thousands of lines on the page, then every time you hit a key while typing the above happens very slowly. Deferring page layouts in webkit or doing some DOM combination or more-lazy updating on the etherpad-lite side should correct this problem.

The only way i could see to cut down on DOM elements would be to combine lines with the same author (or stripped author-less lines) into the same div. The problem would still persist with highly collaborative documents until the author colors were cleared.

Combining DOM elements would help, as would delaying updates and redesigning how the updates occur, eg: reducing unnecessary inserts/deletes.

That may be why I can't easily reproduce the issue just by myself. I'm using ubuntu and Chromium Version 28.0.1500.71 Ubuntu 13.04 (28.0.1500.71-0ubuntu1.13.04.1), and pasted the whole text of Hamlet (4741 lines in total) into a pad with the PR and, while the lag was noticeable after that paste, it was still very usable.

The PR may have helped a bit with that too. But, yes, having a pad that has only been edited by a single person will not result in a slowdown as quickly as one with lots of changes by different people. The slowdown increases with the number of DOM elements that need to be laid out.

Could a faster fix be to soft-limit the number of different author color spans in a pad? There could be some periodic background task that could remove the oldest author color spans until we're under the limit.

Yes, it would help mitigate the problem a bit. I'm not sure that's a fix that people who are experiencing this problem would like, however. Even so, if the pad is large enough, a single editor slows things down more than should really be considered acceptable (or as "good performance").

@nemobis
Copy link

nemobis commented Nov 1, 2013

It's been a few months since the (very much appreciated) debugging sprint above: if I understand correctly, the problem has been dissected and identified, right?
What sort of help is needed to make this move? Now that 1.3 was released and corruption problems hopefully solved, this is (again) the main blocker for wide use of etherpad lite at Wikimedia.

@marcelklehr
Copy link
Contributor

The problem seems to be that the browser has lots stuff to do. We could probably optimize a few parts here and there, but unless we find a completely different editor process-flow, the problem is still that there's lots of stuff to do.

What do you do when you want to run many tasks quickly? You run them in parallel. So, an option would be to put some of these tasks in web workers.

Which tasks could be put into a web worker? Nothing DOM-related.

@dlongley
Copy link
Contributor

dlongley commented Nov 1, 2013

Which tasks could be put into a web worker? Nothing DOM-related.

@marcelklehr, I don't think web workers would address the problem here -- particularly because it is (perhaps entirely) DOM-related. This is a solvable problem; it isn't that there are just so many tasks that we simply can't get good performance. The problem is that the editor triggers an unnecessary number of layouts and the layouts take far too long. Both of these issues can be addressed to a reasonable degree by redesigning the core editor code to reduce the number of DOM operations required per edit and reducing the number of DOM elements used in general.

For instance, we can avoid removing an entire DOM line and reinserting a new one for every key press (see point 5). The code should be smart enough to know when you're typing into a span that you already "own" so it doesn't have to do extra work. I assume the current code was written the way it was because it was easy and consistent; it simply hasn't been optimized. However, it may be that some other redesign work could help avoid having to make these sorts of special optimizations altogether. It would be helpful if the original author of that code could comment, but he/she may not be available.

There's not much we can do (without editing webkit code) to improve the layout code itself (note that Firefox doesn't have this same problem), but the frequency of layouts could be reduced.

What sort of help is needed to make this move?

@nemobis, there needs to be a proposal put forth that would change some of the core editor design to avoid extra layouts and to reduce the number of DOM elements used to something more scalable. To solve the problem properly, there's probably a decent amount of work that needs to be done to refactor how the main editor works and preserves authorship in the DOM. The code itself is hard to follow and needs to be reorganized and better commented so design choices (and resulting limitations) are well-understood.

@marcelklehr
Copy link
Contributor

okies. Makes sense.

@nemobis
Copy link

nemobis commented Nov 1, 2013

@dlongley who are the original authors? Can they be added to this ticket?

@JohnMcLear
Copy link
Member Author

A big chunk of this code is inherit from the original Etherpad stack, so I would guess probably not @nemobis

@dlongley
Copy link
Contributor

dlongley commented Nov 1, 2013

@nemobis, what @JohnMcLear said ... I figured as much. It's "legacy" code that I don't think anyone working on etherpad-lite is all that familiar with (at least doesn't have ownership over). Perhaps for that reason alone it needs some attention and refactoring (and commenting) so a situation like this one doesn't arise again where it's difficult for anyone to take it over without a lot of effort. I'd say that's the biggest barrier to getting a fix. I'd like to go in and fix it and do some redesign work, but I simply don't have the time right now.

@JohnMcLear
Copy link
Member Author

@dlongley Is this something you might get some time to review soon?

@JohnMcLear JohnMcLear removed their assignment Nov 1, 2014
@dlongley
Copy link
Contributor

dlongley commented Nov 1, 2014

@JohnMcLear, I think there are some areas to look into for improving the situation (as noted in #1763 (comment)), unfortunately, it's unlikely that I'll personally have much time to address them until next year at the earliest. I've got a number of deadlines on other projects to deal with this year.

Something else the community here could do is lobby hard to get these bugs fixed:

https://code.google.com/p/chromium/issues/detail?id=423170
https://code.google.com/p/chromium/issues/detail?id=138439
https://bugs.webkit.org/show_bug.cgi?id=92594

Getting those fixed might bring performance in Chrome on par with Firefox. Ultimately, it would be best to address the above mentioned design issues, but getting these blink/webkit bugs fixed could go a long way towards improving performance.

@dlongley
Copy link
Contributor

dlongley commented Nov 1, 2014

From that first chromium bug -- see this fiddle: http://jsfiddle.net/eJeQC/

I'm pretty sure that's the same problem (with the same cause) we see here with etherpad. Note it's fast in Firefox (alerts with ~0ms when you click in the result area) and very slow in Chrome (~100ms+, depending on hardware).

@JohnMcLear
Copy link
Member Author

JohnMcLear commented Mar 13, 2021

I spent a good few hours replicating and testing this and created a pad which is available here: https://github.com/JohnMcLear/longpad1763

TLDR; I can't replicate in modern FF or Chrome.

Do we have good enough test coverage? No.

I also went through the comments on this thread that were mostly missing the point, suggestions like "replacing the editor" to fix this bug were not useful.

@ether ether deleted a comment from tanyo520 Mar 13, 2021
@ether ether deleted a comment from dlongley Mar 13, 2021
@ether ether deleted a comment from dlongley Mar 13, 2021
@ether ether deleted a comment from dlongley Mar 13, 2021
@ether ether deleted a comment from dlongley Mar 13, 2021
@ether ether deleted a comment from iuriguilherme Mar 13, 2021
@JohnMcLear
Copy link
Member Author

#4944 has WIP tests for this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants