Skip to content
This repository has been archived by the owner on Mar 26, 2024. It is now read-only.

Perf optimization #18

Open
wants to merge 17 commits into
base: master
Choose a base branch
from
Open

Conversation

wpc009
Copy link

@wpc009 wpc009 commented Apr 1, 2021

The current web client is sub-optimal, it consuming lots of memory and causing GC a lot.
I did some dirty hacks to improving the client's performance. Now, it can run smoothly on a 4K display.
The hack is not well-designed. So, take what ever you need, just make sure that the web client runs more smoothly.

@SerVB
Copy link
Member

SerVB commented Apr 5, 2021

Hi @wpc009! Thank you for the great amount of work!

Could you share some details how you've found the places to optimize? Also, comparisons of performance before and after will be helpful.

We can check it ourselves but a bit later.

Finally, I think if something is compiled by Kotlin/JS inefficiently, it should be reported to https://youtrack.jetbrains.com/issues/KT and comments near hack in code should be left.

@wpc009
Copy link
Author

wpc009 commented Apr 6, 2021

It true that a huge part of performance issue comes from the compiled JS code(It could be more efficient in pure JS code.). Lots of "isType" checks and "toTypedArray" conversion. Also, the int to string color conversion in fillStyle cause a lot of GC overhead.
I appreciate the great work you have done, and really need this "Projector" solution in real works.
By the way, I'm using the "Dev Tools -> Performance" panel to do the profiling, trying to reduce the render task costs.

@wpc009
Copy link
Author

wpc009 commented Apr 6, 2021

Test scenario:
Open java source code( around 500 lines), scroll up and down a little, and then scroll up to the beginning.

The Current head of Master build

image

The yellow box are all "GC" overhead.

image

These parts can be eliminate by using some pure javascript code and some color string cache to replace the code compiled from Kotlin.

PR
image

The optimized version use less time rendering the open stage. Also, the master branch build feels laggy.

profile details. These are the profile data, you can load them into Dev Tools.

1. refine the rendering order. avoid redundant canvas state sync at the end of `applyClip`.
2. caching font styles, avoid string conversion during rendering.
3. try to avoid `isType` in JS runtime. using extra type enum & type cast.
4. change Kotlin collections to primitive arrays.
SerVB added a commit that referenced this pull request Apr 8, 2021
Comment on lines 197 to 233
if( color != null) {
val ctx = myContext2d
val strCache = JsExtensions.rgbStrCache
//TODO: hacking to avoid expensive Kotlin binary operator and string concat
js("""
if(color.argb){
var a = (color.argb >>> 24) & 0xff
var r = (color.argb >>> 16) & 0xff
var g = (color.argb >>> 8) & 0xff
var b = color.argb & 0xff
ctx.fillStyle = "#" + strCache[r] + strCache[g] + strCache[b] + strCache[a];
}else{
ctx.fillStyle = color.canvasGradient;
}
""")
}
}


@Suppress("UNUSED_VARIABLE")
override fun setStrokeStyle(color: PaintColor?) {
myContext2d.strokeStyle = color?.extract()
if( color != null) {
val ctx = myContext2d
val strCache = JsExtensions.rgbStrCache
//TODO: hacking to avoid expensive Kotlin binary operator and string concat
js("""
if(color.argb){
var a = (color.argb >>> 24) & 0xff
var r = (color.argb >>> 16) & 0xff
var g = (color.argb >>> 8) & 0xff
var b = color.argb & 0xff
ctx.strokeStyle = "#" + strCache[r] + strCache[g] + strCache[b] + strCache[a];
}else{
ctx.strokeStyle = color.canvasGradient
}
""")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure we need this amount of optimizations, including switch to #color notation and rgbStrCache? I've just changed Long to Int in b98edad and the generated code looks quite optimal now:

image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The int to String conversion put a lot pressure on memory. After using str cache less GC is triggered. You could adding "--expose-gc" flag to chrome to view GC events.

@SerVB
Copy link
Member

SerVB commented Apr 8, 2021

I've looked a bit. There are many changes. Do you think we need them all? I just wonder maybe there are only a couple places that can be optimized a lot, and others a just minor enhancements.

I don't want to take all the changes because many make the code less readable.

Comment on lines +36 to +41
override fun context2d(): Context2d {
if( _context2d == null){
_context2d = DomContext2d(myCanvas.getContext("2d",js("{ desynchronized: true }")) as CanvasRenderingContext2D)
}
return _context2d!!
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be left as lazy?

override val context2d: Context2d by lazy { DomContext2d(myCanvas.getContext("2d") as CanvasRenderingContext2D) }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I'm new to Kotlin.

object ExperimentalAPI {
}

public open external class OffscreenCanvas : CanvasImageSource {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that it's not available in browsers other than Chromium, so we can't use it: https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas#browser_compatibility

Copy link
Author

@wpc009 wpc009 Apr 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. Maybe we can create a chromium-only branch. After all, the Electron is chromium inside.

Comment on lines 261 to 264
encoder = SupportedTypesProvider.supportedToServerEncoders.first(),
decoder = SupportedTypesProvider.supportedToClientDecoders.first(),
decompressor = SupportedTypesProvider.supportedToClientDecompressors.first(),
compressor = SupportedTypesProvider.supportedToServerCompressors.first(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaks logic: it now doesn't use the result from server...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert this change.

@wpc009
Copy link
Author

wpc009 commented Apr 8, 2021

I made some more optimization on the rendering logic.

  1. The origin code did clip on the first, and must reset all the canvas states at the end of applyClip.
    I re-arrange the clip operation, make it possible to reusing previous canvas state.
  2. The measureText during fillText can be avoided, by caching the rescale ratio for each font size.

kotlinVersion=1.4.30
kotlinExtensionsVersion=1.0.1-pre.146-kotlin-1.4.30
kotlinReactVersion= 17.0.1-pre.146-kotlin-1.4.30
kotlinStyledComponentsVersion=5.2.1-pre.146-kotlin-1.4.30
Copy link
Member

@SerVB SerVB Apr 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update Kotlin and dependencies to 1.4.30 on the next week, it seems there are some performance improvements.

By the way, there is a relatively new mode of compilation called IR. It can give more performant code too. I will try it later, but if you have some time, you can try it yourself. Just add (IR) to build.gradle(.kts) files of JS and multiplatform modules like this:

kotlin {
    js(IR) {  // <-- here it was just "js {"
      //...
    }
}

At least many Kotlin.isType are replaced with JS instanceof in IR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try that.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update Kotlin and dependencies to 1.4.30 on the next week, it seems there are some performance improvements.

By the way, there is a relatively new mode of compilation called IR. It can give more performant code too. I will try it later, but if you have some time, you can try it yourself. Just add (IR) to build.gradle(.kts) files of JS and multiplatform modules like this:

kotlin {
    js(IR) {  // <-- here it was just "js {"
      //...
    }
}

At least many Kotlin.isType are replaced with JS instanceof in IR.

just add IR in projector-client-web will cause :

dceTask configuration is useless with IR compiler. Use @JsExport on declarations instead.

FAILURE: Build failed with an exception.

  • What went wrong:
    Task 'browserProductionWebpack' not found in project ':projector-client-web'.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

projector-client-web is already some time with IR enabled and that problem solved:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

projector-client-web is already some time with IR enabled and that problem solved:

thanks ,I merger the pr to my 1.6.0 version and it shows ok . I found you have released it to 1.7.0 , I will use the latest version after PRJ-857 fix

@SerVB
Copy link
Member

SerVB commented Apr 8, 2021

Thank you for going on 😀

I've submitted a few reports to Kotlin issue tracker, so feel free to vote for them and maybe even add some info:

I will proceed with you changes hopefully on the next week.

Which changes are vital is still the open question. Could you somehow determine which ones give the most boost?

Also, if it's possible to open a separate PR for every change, I will appreciate it much.

@wpc009
Copy link
Author

wpc009 commented Apr 9, 2021

By the way, the current optimization may break some things. I try to using OffscreenCanvas to improve double buffered render logic, but dose not succeed. The transferToImageBitmap will make the buffer canvas empty. /
Maybe I should revert those change.

@wpc009
Copy link
Author

wpc009 commented Apr 9, 2021

There is still some space for improvements.

  1. offload the websocket communication & serialization/deserialization to WebWorker.
  2. Using WebWorker to render OffscreenCanvas, then dump image bitmap to the front canvas.

The current optimization makes the web client usable. I may not have time to go further.

@zhipengzuo
Copy link

zhipengzuo commented Mar 29, 2022

@wpc009 are you still working to merge it to master ?

@wpc009
Copy link
Author

wpc009 commented Apr 4, 2022

@wpc009 are you still working to merge it to master ?

No, This PR contains lots of Hacks which should not be merge into master.
I propose this PR only to provide some idea on perf improvement.

@zhipengzuo
Copy link

@wpc009 are you still working to merge it to master ?

No, This PR contains lots of Hacks which should not be merge into master. I propose this PR only to provide some idea on perf improvement.

ok .thanks .
one more question ,do you still remember the server version you can build with ?
i want to see the effect.
i tried with some server like 1.0.0 1.1.0 seem can't use .

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants