Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

khepri_machine: Run the query anonymous function from the calling process #226

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

dumbbell
Copy link
Member

Why

Before this change, the anonymous function passed to khepri_machine:fold/5 was executed by the Ra server process. This had two downsides:

  • it blocked the Ra server to serve writes
  • the anonymous function couldn't query the Ra server itself because it would deadlock

How

The alternative implemented in this patch is to get the whole Khepri tree from the Ra server, then run the anonymous function, this time from the calling process.

The downside is that the whole Khepri tree is copied to that calling process.

@dumbbell dumbbell added the enhancement New feature or request label Sep 27, 2023
@dumbbell dumbbell added this to the v0.9.0 milestone Sep 27, 2023
@dumbbell dumbbell self-assigned this Sep 27, 2023
@codecov
Copy link

codecov bot commented Sep 27, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (2a20943) 88.89% compared to head (ac306e2) 88.90%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #226      +/-   ##
==========================================
+ Coverage   88.89%   88.90%   +0.01%     
==========================================
  Files          20       20              
  Lines        2890     2902      +12     
==========================================
+ Hits         2569     2580      +11     
- Misses        321      322       +1     
Flag Coverage Δ
erlang-24 87.87% <75.00%> (+0.05%) ⬆️
erlang-25 87.76% <75.00%> (-0.06%) ⬇️
erlang-26 88.52% <75.00%> (+0.01%) ⬆️
os-ubuntu-latest 88.83% <75.00%> (-0.06%) ⬇️
os-windows-latest 88.59% <75.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/khepri.erl 90.57% <ø> (ø)
src/khepri_machine.erl 94.73% <75.00%> (-0.56%) ⬇️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

…cess

[Why]
Before this change, the anonymous function passed to
`khepri_machine:fold/5` was executed by the Ra server process. This had
two downsides:
* it blocked the Ra server to serve writes
* the anonymous function couldn't query the Ra server itself because it
  would deadlock

[How]
The alternative implemented in this patch is to get the whole Khepri
tree from the Ra server, then run the anonymous function, this time from
the calling process.

The downside is that the whole Khepri tree is copied to that calling
process.
@dumbbell dumbbell force-pushed the run-query-fun-from-calling-process branch from 3d67a60 to cb52d57 Compare September 27, 2023 08:19
@dumbbell dumbbell marked this pull request as ready for review September 27, 2023 08:24
@dumbbell
Copy link
Member Author

dumbbell commented Sep 27, 2023

I'm still not sure this alternative approach is what we want. Copying the entire Khepri tree to the calling process could be very expensive and the memory footprint could sky-rocket.

Another solution would be to try to run the anonymous function from the Ra process like before. But if it exits with a calling_self exception, then we could re-run it from the calling process. But then, the anonymous function must have no side effects, or the side effects should be idempotent. Perhaps we could accept a {can_be_restarted, true} option, so that the caller lets Khepri know it can re-run the anonymous function.

Or simply, we accept a {run_fun_from_calling_process, true} , it's probably simpler and easier to reason about. This needs a better name though.

What do you think, @dcorbacho and @the-mikedavis?

To give some context to this issue, when RabbitMQ queries transient queues from the database, it queries the Khepri members from a Khepri query anonymous function:

khepri:fold(
  StoreId,
  PathPattern,
  fun(QueuePath, NodeProps) ->
      ...
      %% This queries the Ra server and can't be executed by the Ra server
      %% itself because it would deadlock. gen_statem throws a `calling_self`
      %% exception.
      Members = khepri:members(StoreId),
      ...
  end,
  []).

@dumbbell
Copy link
Member Author

I pushed an additional commit to implement that option that lets the caller decide if it wants the tree to be copied to be able to do nested queries.

I would love some feedback on the approach and the option name :-)

@dumbbell dumbbell marked this pull request as draft September 27, 2023 10:12
@dumbbell dumbbell force-pushed the run-query-fun-from-calling-process branch from a862938 to ac306e2 Compare September 27, 2023 10:41
@the-mikedavis
Copy link
Member

We ended up going a different direction on this in the server similar to the-mikedavis/rabbitmq-server@aed8ddd. We use khepri:get_many/3 to get all of the records under the path and then fold separately with maps:fold/3 so that the fun isn't called within the Ra server process.

@dumbbell should we close out this PR for now? We could re-open it if we find another use-case and want to make it possible to do while still calling khepri:fold/5

@dumbbell dumbbell removed this from the v0.9.0 milestone Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants