[IMP] BottomBarStatistics: faster computation of stats #3924

rrahir · 2024-03-26T07:49:18Z

Description:

[IMP] BottomBarStatistics: Memoize cell filtering
To compute the statistics, we filter out the cells by type for each
function. However, there are only 2 different filters to apply for the
6 different functions.

This revision introduces a memoization of that filter to compute it 3
times less.

[IMP] BottomBarStatistics: Lazify statistic evaluation
Each time we fetch the statistics, we compute all 6 of them even though
we only need one of them to display (We need all of them when the
statistics selection menu is open).

This revision lazifies the computation of those statistics.

[IMP] BottomBarStatistics: limit redundant getter calls
While we iterate over the cells to check their visibility, we keep
checking the visibility of cells over which we already know that their
col (or row) is hidden.

This revision adds a memoization of limit the calls to the different
getters involved by memoizing them.

Overall Benchmark

On a demo sheet with makeLargeDataset of 260k cells

_computeStatisticFnResult:
Firefox: 200ms to 80ms
Chrome: 130ms to 42ms

Task: : 3827324

review checklist

robodoo · 2024-03-26T07:49:26Z

Pull request status dashboard.

LucasLefevre · 2024-03-27T15:37:47Z

src/components/bottom_bar_statistic/aggregate_statistics_store.ts

      const evaluatedCells = getCells(fn.types.join(","));
      if (evaluatedCells.length) {
-        fnResult = fn.compute(evaluatedCells, locale);
+        fnResult = lazy(() => fn.compute(evaluatedCells, locale));


we could even make the cell iteration and filter part of the lazy computation

const fnResult: Lazy<number| undefined> = lazy(() => { const evaluatedCells = getCells(fn.types.join(",")); if (evaluatedCells.length) { return fn.compute(evaluatedCells, locale); } return undefined; });

no we cannot. We check the presence of fnResult in the final object to know if we should display the statistics button or not. So we'd end up evaluating everything all the time

LucasLefevre · 2024-03-27T15:39:56Z

src/components/bottom_bar_statistic/aggregate_statistics_store.ts

+    const isColHidden = memoize((col: number) => getters.isColHidden(sheetId, col));
+    const isRowHidden = memoize((row: number) => getters.isRowHidden(sheetId, row));


does it really makes a significant difference ?
I believe the benchmark in the commit message is not the correct one. It's suspiciously the same as the previous commit

It's around 40ms for each improvement, I left out the commas afterwards, I guess I can make it again if you want ...

And if this getter really cause performance issue (which I'm not sure how since it really does nothing), shouldn't it be memoized inside the plugin rather than here ? That way every call to this getter in other plugin/components is faster

I don't know what to say. Don't believe me? then try it for yourself.
The reason I haven't set it everywhere right now is because I don't want to have to invalidate the cache all the time and it's pretty much the only place where it's heavily used.

Sorry I'm a moron and can't make proper commit messages apparently. The gain was around 30° ms. I did another round of benchmark and will update the values accordingly

After the 2 first improvements, the time is mostly spent in isColFolder and isRowFolder (of header_grouping.ts).
I think it would be wise to store on this plugin the list of cols and rows that are folded by sheet, so these getters becomes immediate.

To compute the statistics, we filter out the cells by type for each function. However, there are only 2 different filters to apply for the 6 different functions. This revision introduces a memoization of that filter to compute it 3 times less. Benchmark --------- In Firefox with the largeDataSet of 260k cells, averaged on 10 runs: cells.filter _computeStatisticFnResult --------------------------------------------------- before: 63.8 ms 186.8 ms after: 22.4 ms 138 ms Task: 3827324

Each time we fetch the statistics, we compute all 6 of them even though we only need one of them to display (We need all of them when the statistics selection menu is open). This revision lazifies the computation of those statistics. Benchmark --------- In Firefox with the largeDataSet of 260k cells, averaged on 10 runs: _computeStatisticFnResult ----------------------------------- before: 138 ms after: 95.8 ms Task: 3827324

While we iterate over the cells to check their visibility, we keep checking the visibility of cells over which we already know that their col (or row) is hidden. This revision adds a memoization of limit the calls to the different getters involved by memoizing them as well as skips the visit of cells within columns that are hidden. Benchmark --------- In Firefox with the largeDataSet of 260k cells, averaged over 10 runs: Skipping entire hidden columns _computeStatisticFnResult ----------------------------------- before: 95.8 ms after: 79 ms Skipping hidden columns + memoization _computeStatisticFnResult ----------------------------------- before: 95.8 ms after: 72.4 ms Task: 3827324

rrahir force-pushed the master-stat-fast-rar branch 3 times, most recently from ee53d08 to 60b5111 Compare March 26, 2024 09:02

LucasLefevre reviewed Mar 27, 2024

View reviewed changes

rrahir force-pushed the master-stat-fast-rar branch from 60b5111 to ac34645 Compare March 29, 2024 11:03

rrahir added 2 commits April 8, 2024 16:29

rrahir force-pushed the master-stat-fast-rar branch from ac34645 to 221c397 Compare April 8, 2024 14:30

rrahir force-pushed the master-stat-fast-rar branch from 221c397 to a7dc365 Compare April 8, 2024 14:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[IMP] BottomBarStatistics: faster computation of stats #3924

[IMP] BottomBarStatistics: faster computation of stats #3924

rrahir commented Mar 26, 2024 •

edited

robodoo commented Mar 26, 2024

LucasLefevre Mar 27, 2024

rrahir Mar 27, 2024

LucasLefevre Mar 27, 2024

rrahir Mar 27, 2024

hokolomopo Mar 29, 2024

rrahir Mar 29, 2024

rrahir Mar 29, 2024 •

edited

VincentSchippefilt Apr 15, 2024 •

edited

		const isColHidden = memoize((col: number) => getters.isColHidden(sheetId, col));
		const isRowHidden = memoize((row: number) => getters.isRowHidden(sheetId, row));

[IMP] BottomBarStatistics: faster computation of stats #3924

Are you sure you want to change the base?

[IMP] BottomBarStatistics: faster computation of stats #3924

Conversation

rrahir commented Mar 26, 2024 • edited

Description:

Overall Benchmark

review checklist

robodoo commented Mar 26, 2024

LucasLefevre Mar 27, 2024

Choose a reason for hiding this comment

rrahir Mar 27, 2024

Choose a reason for hiding this comment

LucasLefevre Mar 27, 2024

Choose a reason for hiding this comment

rrahir Mar 27, 2024

Choose a reason for hiding this comment

hokolomopo Mar 29, 2024

Choose a reason for hiding this comment

rrahir Mar 29, 2024

Choose a reason for hiding this comment

rrahir Mar 29, 2024 • edited

Choose a reason for hiding this comment

VincentSchippefilt Apr 15, 2024 • edited

Choose a reason for hiding this comment

rrahir commented Mar 26, 2024 •

edited

rrahir Mar 29, 2024 •

edited

VincentSchippefilt Apr 15, 2024 •

edited