champ try_update #271

Open
wants to merge 4 commits into
base: master

@fabianbs96 fabianbs96 commented Sep 23, 2023

Story

As a user of immer, I want to efficiently build up nested map<set> structures in an incremental way until reaching a fixpoint, achieving maximum performance.
This involves frequent updates of the nested sets inside the map.
The obvious solution for this problem is using the immer::map::update function to update the inner sets.
However, I have found that update always re-allocates, even if the updated value is identical to the value already present, which leaves a lot of performance on the table.
This happens, for example, when inserting an element into an inner set that already contains it.

As a fallback solution, I currently do the following:

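// Look up the inner set; find() returns nullptr if the key is absent.
const auto *SetPtr = map.find(Key);
if (!SetPtr)
  return map.set(std::move(Key), makeSingletonSet(std::move(Value)));

// Only write back if the insertion actually changed the inner set.
auto NewSet = SetPtr->insert(std::move(Value));
if (NewSet == *SetPtr)
  return map;

return map.set(std::move(Key), std::move(NewSet));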

Whereas I really want to do:

return map.update(std::move(Key), [Value = std::move(Value)] (const auto &OldSet){
  return OldSet.insert(std::move(Value));
});

Solution Proposal

As a solution to the above problem, I propose a new API, try_update, within immer::map. It works like update, but additionally runs an equality check on the result of the callback fn and, if the result equals the existing value, leaves the map unchanged.
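To make the intended difference between update and try_update concrete, here is a minimal sketch that uses only the standard library. The cow_map class, its identity() helper and the demo function are hypothetical stand-ins invented for illustration; they are not immer's API, and unlike immer::map this stand-in copies the whole container instead of sharing structure. How the proposed try_update treats an absent key is also an assumption here (it simply inserts fn applied to a default-constructed value).

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <set>
#include <utility>

// Hypothetical stand-in for an immutable map. Every modification copies the
// whole std::map; only the observable behaviour of the two update flavours
// is modelled, not immer's structural sharing.
template <typename K, typename V>
class cow_map {
    using storage = std::map<K, V>;
    std::shared_ptr<const storage> data_ = std::make_shared<storage>();

    explicit cow_map(std::shared_ptr<const storage> d) : data_(std::move(d)) {}

public:
    cow_map() = default;

    // Address of the underlying storage, used to observe whether it is shared.
    const void* identity() const { return data_.get(); }

    const V* find(const K& key) const {
        auto it = data_->find(key);
        return it == data_->end() ? nullptr : &it->second;
    }

    // Like update: always builds a new map, even if the value is unchanged.
    template <typename Fn>
    cow_map update(const K& key, Fn&& fn) const {
        auto copy = std::make_shared<storage>(*data_);
        auto& slot = (*copy)[key]; // default-constructed if the key is absent
        slot = fn(slot);
        return cow_map{std::move(copy)};
    }

    // Like the proposed try_update: returns *this when the callback's result
    // compares equal to the value already stored.
    template <typename Fn, typename Eq = std::equal_to<>>
    cow_map try_update(const K& key, Fn&& fn, Eq eq = {}) const {
        const V* old = find(key);
        if (!old) {
            // Assumption: an absent key is always inserted.
            auto copy = std::make_shared<storage>(*data_);
            (*copy)[key] = fn(V{});
            return cow_map{std::move(copy)};
        }
        V updated = fn(*old);
        if (eq(updated, *old))
            return *this; // nothing changed: keep sharing the old storage
        auto copy = std::make_shared<storage>(*data_);
        (*copy)[key] = std::move(updated);
        return cow_map{std::move(copy)};
    }
};

// Demo: inserting an element that is already present keeps the old storage
// with try_update, while plain update allocates a new one.
bool try_update_shares_storage_when_unchanged() {
    using int_set = std::set<int>;
    auto insert42 = [](const int_set& s) {
        int_set r = s;
        r.insert(42);
        return r;
    };
    cow_map<int, int_set> m;
    auto m1 = m.try_update(1, insert42);  // key absent: new map
    auto m2 = m1.try_update(1, insert42); // 42 already present: same storage
    auto m3 = m1.update(1, insert42);     // plain update copies regardless
    return m2.identity() == m1.identity() && m3.identity() != m1.identity();
}
```

Calling try_update_shares_storage_when_unchanged() exercises both paths; in a real persistent map the saving would be the skipped path copy rather than a skipped full copy.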

This PR implements try_update on the champ and exposes a corresponding public API on immer::map.
In addition, it fixes a minor issue: the champ::update function takes the key by const-ref, although the underlying do_update function can handle perfectly forwarded keys.

Design decisions:

  • Implement do_try_update and do_try_update_mut within an inner struct, so that these functions can call themselves recursively without specifying the template arguments again.
  • Use the same signature as do_update[_mut] and use a nullptr node to indicate that nothing has changed.
  • Pass the key, the updater-fn and the value-equals-fn by value if they are small and trivial. This gives the optimizer the opportunity to pass these arguments in registers without touching memory. Empty arguments, such as std::equal_to for value-equals, can even be elided completely.
    As a policy for when to pass by value, I implemented the byval_if_possible type trait, which prefers by-value for trivial types that are no larger than two pointers. This more or less matches the x86-64 parameter-passing conventions of the System V psABI (refer to https://gitlab.com/x86-psABIs/x86-64-ABI/, chapter 3.2.3, Parameter Passing).
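The by-value policy in the last bullet could look roughly like the following. This is a sketch under assumptions, not the PR's actual definition: the name byval_if_possible_sketch, the use of std::is_trivially_copyable plus std::is_trivially_destructible as the meaning of "trivial", and the values_equal helper are all hypothetical.

```cpp
#include <functional>
#include <string>
#include <type_traits>

// Hypothetical re-creation of the trait described above: choose T for small
// trivial types and const T& for everything else.
template <typename T>
struct byval_if_possible_sketch {
    using bare = std::remove_cv_t<std::remove_reference_t<T>>;

    // "Small and trivial" is interpreted here as trivially copyable,
    // trivially destructible and at most two pointers wide.
    static constexpr bool by_value =
        std::is_trivially_copyable_v<bare> &&
        std::is_trivially_destructible_v<bare> &&
        sizeof(bare) <= 2 * sizeof(void*);

    using type = std::conditional_t<by_value, bare, const bare&>;
};

template <typename T>
using byval_if_possible_sketch_t = typename byval_if_possible_sketch<T>::type;

// An int and an empty comparator are passed by value ...
static_assert(std::is_same_v<byval_if_possible_sketch_t<int>, int>);
static_assert(std::is_same_v<byval_if_possible_sketch_t<const std::equal_to<>&>,
                             std::equal_to<>>);
// ... while a std::string stays behind a const reference.
static_assert(std::is_same_v<byval_if_possible_sketch_t<std::string>,
                             const std::string&>);

// Hypothetical helper showing the trait in a parameter list. Eq cannot be
// deduced through the alias, so the caller names it explicitly.
template <typename Eq, typename T>
bool values_equal(const T& a, const T& b, byval_if_possible_sketch_t<Eq> eq) {
    return eq(a, b);
}
```

A call such as values_equal<std::equal_to<>>(1, 1, std::equal_to<>{}) then takes the comparator by value. Whether the compiler really keeps such arguments in registers or elides them depends on the target ABI and on inlining, so the trait only creates the opportunity.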

@codecov-commenter

Codecov Report

Merging #271 (7eec5f9) into master (5875f77) will decrease coverage by 0.07%.
The diff coverage is 87.40%.

@@            Coverage Diff             @@
##           master     #271      +/-   ##
==========================================
- Coverage   90.53%   90.46%   -0.07%     
==========================================
  Files         119      119              
  Lines       12144    12379     +235     
==========================================
+ Hits        10994    11199     +205     
- Misses       1150     1180      +30     
Files                           Coverage Δ
immer/detail/util.hpp           82.60% <ø> (-0.25%) ⬇️
immer/map.hpp                   99.09% <100.00%> (+0.07%) ⬆️
test/algorithm.cpp              89.69% <100.00%> (+0.67%) ⬆️
test/map/generic.ipp            99.23% <98.75%> (+0.10%) ⬆️
immer/detail/hamts/champ.hpp    86.16% <80.60%> (-1.00%) ⬇️

... and 1 file with indirect coverage changes

Owner

@arximboldi arximboldi left a comment

Hi Fabian!

Sorry for the very late reply... since this change involves an API change, I had been waiting for a moment where I could give this some thought.

One thing that bothers me a little bit is the potential explosion of variations of update(), whose implementations are already not trivial, complicated in part by their transient versions (which could potentially become even more complicated with some optimizations I've sometimes considered...). We already have update() and update_if_exists(), and arguably, following your argument we would need try_update() and try_update_if_exists() (the latter doesn't seem to be included in this PR).

I am tempted to suggest adding this behavior to plain update() and update_if_exists(), but as you may have already considered, this can add a performance penalty in some instances where the trade-off may not be desired.

So, after some thought, I would like to go ahead with your proposal of having an additional API for try_update(), but I would suggest basing its implementation on do_update_if_exist. This function is already implemented with a potential null coming back from the recursion to indicate "no change needed". You can introduce an additional "policy" parameter that controls whether the result value is compared with the old one.

With this approach, while we would end up with 4 versions of update (8 if we consider their transient counterparts), there would be only two fundamental algorithms in the implementation.
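The idea of one recursive algorithm that signals "no change needed" with a null node, plus a policy deciding whether an equal result counts as a change, can be sketched as follows. This is only an illustration on a hypothetical persistent binary search tree, not the CHAMP: the node type, both policy structs and the wrapper functions are invented names, and immer's real do_update_if_exist has a different signature.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical persistent binary search tree standing in for the CHAMP.
struct node;
using node_ptr = std::shared_ptr<const node>;
struct node {
    int key;
    int value;
    node_ptr left;
    node_ptr right;
};

// Hypothetical policies: does an update that yields an equal value still
// count as a change?
struct always_replace { static constexpr bool compare_result = false; };
struct keep_if_equal  { static constexpr bool compare_result = true; };

// Single recursive algorithm. Returns the rebuilt subtree, or nullptr to
// signal "no change needed" (key absent, or equal result under keep_if_equal).
template <typename Policy, typename Fn>
node_ptr do_update_if_exists(const node_ptr& n, int key, Fn& fn) {
    if (!n)
        return nullptr; // key not found
    if (key == n->key) {
        int updated = fn(n->value);
        if (Policy::compare_result && updated == n->value)
            return nullptr; // equal result: report "nothing changed"
        return std::make_shared<node>(node{key, updated, n->left, n->right});
    }
    const bool go_left = key < n->key;
    node_ptr child =
        do_update_if_exists<Policy>(go_left ? n->left : n->right, key, fn);
    if (!child)
        return nullptr; // propagate "nothing changed" without copying the path
    return go_left
        ? std::make_shared<node>(node{n->key, n->value, child, n->right})
        : std::make_shared<node>(node{n->key, n->value, n->left, child});
}

// Two thin public wrappers over the same algorithm.
template <typename Fn>
node_ptr update_if_exists(const node_ptr& root, int key, Fn fn) {
    node_ptr r = do_update_if_exists<always_replace>(root, key, fn);
    return r ? r : root;
}

template <typename Fn>
node_ptr try_update_if_exists(const node_ptr& root, int key, Fn fn) {
    node_ptr r = do_update_if_exists<keep_if_equal>(root, key, fn);
    return r ? r : root;
}

// Builds the three-node tree {1: 10, 2: 20, 3: 30} with key 2 at the root.
node_ptr make_example_tree() {
    auto l = std::make_shared<node>(node{1, 10, nullptr, nullptr});
    auto r = std::make_shared<node>(node{3, 30, nullptr, nullptr});
    return std::make_shared<node>(node{2, 20, l, r});
}
```

With an identity callback, try_update_if_exists returns the original root pointer, whereas update_if_exists rebuilds the path to the key. The non-"if exists" variants would form the second fundamental algorithm and are not shown here.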

What do you think? Would you mind changing your PR in this way?
