Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DualListTreeSelect performance hit with large data sets #1151

Open
dlabrecq opened this issue Oct 19, 2021 · 7 comments
Open

DualListTreeSelect performance hit with large data sets #1151

dlabrecq opened this issue Oct 19, 2021 · 7 comments
Assignees
Labels
Investiage Possible bug with unknown cause PF4 PF4 pull request

Comments

@dlabrecq
Copy link

dlabrecq commented Oct 19, 2021

Scope:

pf4-component-mapper
dual-list-tree-select

Description

Cost management users are encountering latency with our settings page when we have large data sets. For our cost-demo user on stage-beta, we have 40,000 GCP items under one folder.

I tested the page loading time for 1k, 5k, 10k, 20k, and 40k items. For 40k selected items, we're seeing page loads of 55 seconds.

  • 40k is about 55s
  • 20k is 12s
  • 10k is 6s
  • 5k is 3s
  • 1k is less than 2s.

That said, the DDF component appears to be converting the options provided via our schema and applies a default sort. There is also a lot of recursion happening with the selectedOptions function.

See https://github.com/data-driven-forms/react-forms/blob/master/packages/pf4-component-mapper/src/dual-list-tree-select/dual-list-tree-select.js#L10

Note that the non-tree version of dual-list has an isSortable prop, but the tree version does not. If we can omit the default sort (e.g., using isSortable={false}), we may see a significant performance increase.

Ideally, we would pre-sort server-side via the Cost Management settings API.

Schema

See https://console.stage.redhat.com/api/cost-management/v1/settings/

@Hyperkid123
Copy link
Member

@dlabrecq do we have a baseline when using the "stock" PF select?

@Hyperkid123
Copy link
Member

I will check it properly later and create examples in code sandbox however. From past experience with huge data sets in other components (array fields in MUI mapper), usually, the rendering is the "problem" when it comes to first renders. Removing the sort function may and probably will shave a bit of time from the first render, but I would guess that 90% of the time would be spent on rendering, not sorting some array.

To put it simply, there might be just too much stuff for the browser to handle properly. To boost the performance to some reasonable levels, we would probably need a virtualized list in the component.

@dlabrecq
Copy link
Author

dlabrecq commented Oct 19, 2021

Like you said, omitting the sort will help. However, there is a lot of recursion happening with the selectedOptions function. I think that's probably the underlying problem.

Looking at the settings API, we're providing a list of 40k initial options via the schema. Then, we're providing a chosen (selected) list of 40k values.

The API could easily generate the left and right options server-side, instead of recursively generating the PF options client-side? Ideally, it would just be a pass through to the PatternFly dual-list component.

@Hyperkid123
Copy link
Member

Hyperkid123 commented Oct 19, 2021

@dlabrecq yeah I think this is the rendering issue. Using stock PF component in this example: https://uenic.csb.app/

The initial render time is kind of difficult to judge because of the environment (I will do it locally at some point), but when you try to select any option, you see a lag of 1-3s before the option style changes. And there are only 10k items. (In the meantime, moving all items to right/left will give you probably a close comparison to the initial render time. It just kills the page.)

The API could generate the left and right options server-side, instead of recursively generating the lists client-side?

I am definitely all for offloading the work to server, but we will still have an issue when actually interacting with the PF component.

However, there is a lot of recursion happening with the selectedOptions function

We will debug that and see if we can make the function faster

@dlabrecq
Copy link
Author

dlabrecq commented Oct 19, 2021

Yes, I agree that PF also has performance issues. The PF team is looking to improve the component on their end as well.

However, the Cost Management settings page currently renders 40k items in the available list in 600ms, so it appears to be doable. It's only when we add those same items to DDF's chosen list, when the page loads in 55 seconds. That's when I see all the recursion with the selectedOptions function.

The lag when selecting an option is definitely a PF issue.

Thanks for looking into it.

@Hyperkid123 Hyperkid123 self-assigned this Oct 20, 2021
@Hyperkid123 Hyperkid123 added Investiage Possible bug with unknown cause PF4 PF4 pull request labels Oct 20, 2021
@Hyperkid123
Copy link
Member

Hyperkid123 commented Oct 20, 2021

@dlabrecq I was doing a little bit of research about the issue. Here are my findings.

Testing env:

  • dualist select component with sortable and tree props set to true (required to access the pointed out convertOptions function)
  • 10000 input options sorted prepared in the worst-case scenario. Items are in reverse order
  • using chrome profiling tools to get timings

Initial render profile summary

Screenshot from 2021-10-20 11-57-18

The initial load takes around 32s with the input data

Screenshot from 2021-10-20 11-58-26

  • Over 50% of the time (15.4 s) is taken by the layout activities (rendering)
  • the data preparation phase takes 26% (7.6s)

Convert options performance

The convert options function pointed in the issue description is not the problem here.

Screenshot from 2021-10-20 12-01-18

  • total execution time is 1118.9 ms (1.1s) which is roughly 3.9% of the whole execution time

Here is a total execution time comparison. The highlighted segment is the convert function execution portion.

Screenshot from 2021-10-20 12-30-34

Most expensive function calls

The most resource-heavy function operations are react/react-dom routines

Screenshot from 2021-10-20 12-04-15

These are direct DOM manipulation functions that are directly tied to a component implementation.

Profiling with sort and three props disabled and input data in sorted order

To dismiss any potential profiling mistakes, I have also done a run with the sort option disabled. The convertOptions function is not called in this scenario. the results are a bit better but the relative resource allocation is basically the same.

Screenshot from 2021-10-20 12-08-01

You can see the sorted list of activities that references the same functions

Screenshot from 2021-10-20 12-08-52

And the convertOptions is not present in the list
Screenshot from 2021-10-20 12-09-05

Conclusion

The convert options are not responsible for the performance degradation. In fact, the impact is negligible when compared to the rest of the factors.

In my opinion, an implementation change of the dual list selector component to use the virtual list is the only option to fix the issue.

@dlabrecq In case that I have missed something, I would ask you to provide me with the exact dataset you were testing so I can run the profiling again. You can share it in gist snippet or send me a private message if you don't want the data to be public.

EDIT: If you want I will provide the profiling data so you can open and inspect them in your browser.

@dlabrecq
Copy link
Author

We've mainly been testing this page:
https://console.stage.redhat.com/beta/settings/applications/cost-management

With the DDF schema from:
https://console.stage.redhat.com/api/cost-management/v1/settings/

I can forward the user/pwd offline

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Investiage Possible bug with unknown cause PF4 PF4 pull request
Projects
None yet
Development

No branches or pull requests

2 participants