Handle database timeouts from Khepri minority #10915

Draft
wants to merge 23 commits into base: main

Commits on May 13, 2024

  1. WIP: Bump Khepri to X

    the-mikedavis committed May 13, 2024
    572ec51
  2. b8d8a21
  3. Improve cluster_minority_SUITE

    This is a mix of a few changes:
    
    * Suppress the compiler warning from the export_all attribute.
    * Lower Khepri's command handling timeout value. By default this is
      set to 30s in rabbit which makes each of the cases in
      `client_operations` take an excessively long time. Before this change
      the suite took around 10 minutes to complete. Now it takes between two
      and three minutes.
    * Swap the order of client and broker teardown steps in end_per_group
      hook. The client teardown steps will always fail if run after the
      broker teardown steps because they rely on a value in `Config` that
      is deleted by broker teardown.
    the-mikedavis committed May 13, 2024
    d3832f7
  4. e1d785f
  5. 3fbad38
  6. 45fb884
  7. 08572d9
  8. rabbit_db_queue: Transactionally delete transient queues from Khepri

    The prior code skirted transactions because the filter function might
    cause Khepri to call itself. We want to use the same idea as the old
    code - get all queues, filter them, then delete them - but we want to
    perform the deletion in a transaction and fail the transaction if any
    queues changed since we read them.
    
    This fixes a bug - that the call to `delete_in_khepri/2` could return
    an error tuple that would be improperly recognized as `Deletions` -
    but should also make deleting transient queues atomic and fast.
    Each call to `delete_in_khepri/2` needed to wait on Ra to replicate
    because the deletion is an individual command sent from one process.
    Performing all deletions at once means we only need to wait for one
    command to be replicated across the cluster.
    
    We also now bubble up any errors from the deletion rather than storing
    them as deletions. This fixes a crash that occurs on node down when
    Khepri is in a minority.
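
The read, filter, then delete-atomically idea described above can be sketched in plain Erlang. This is only an illustration under stated assumptions: the "database" is an ordinary map, and the module name, `select/2`, `delete_all/2`, and the `queues_changed` reason are hypothetical names invented here. It does not use Khepri and is not the code from this commit.

```erlang
-module(transient_delete_sketch).
-export([select/2, delete_all/2, demo/0]).

%% ASSUMPTION: the "database" is a plain map of QueueName => QueueRecord.
%% All names in this module are hypothetical.

%% Step 1: read and filter outside of the transaction.
select(Store, Pred) ->
    maps:filter(fun(_Name, Queue) -> Pred(Queue) end, Store).

%% Step 2: delete every selected queue in one step. If any selected queue
%% differs from what was read earlier, nothing is deleted and an error
%% tuple is returned instead.
delete_all(Store, Selected) ->
    Changed = [Name || {Name, Queue} <- maps:to_list(Selected),
                       maps:find(Name, Store) =/= {ok, Queue}],
    case Changed of
        [] ->
            {ok, maps:without(maps:keys(Selected), Store)};
        _ ->
            {error, {queues_changed, lists:sort(Changed)}}
    end.

demo() ->
    Store = #{q1 => transient, q2 => durable, q3 => transient},
    Selected = select(Store, fun(Kind) -> Kind =:= transient end),
    %% No concurrent change: both transient queues are removed together.
    {ok, #{q2 := durable} = After} = delete_all(Store, Selected),
    1 = map_size(After),
    %% q3 changed after the read: the whole delete fails, and the caller
    %% matches on the error tuple rather than treating it as deletions.
    Store2 = Store#{q3 := durable},
    {error, {queues_changed, [q3]}} = delete_all(Store2, Selected),
    ok.
```

The caller pattern-matches `{ok, _}` and `{error, _}` separately, so an error tuple cannot be mistaken for a successful result.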
    the-mikedavis committed May 13, 2024
    b1ee0ce
  9. 0232bce
  10. minor: Correct outdated spec for rabbit_amqqueue:lookup/1

    The clause of the spec that allowed passing a list of queue name
    resources is out of date: the guard prevents a list from ever matching.
    the-mikedavis committed May 13, 2024
    5fce957
  11. rabbit_db_queue: Bubble up errors in set_many/1 with Khepri enabled

    Previously a failing transaction would go unnoticed. Now we return an
    error tuple.
    the-mikedavis committed May 13, 2024
    bed8267
  12. rabbit_db_user: Raise instead of 'khepri_tx:abort/1'

    `khepri_tx:abort/1` is only meant for use within a transaction - I
    assume this was a relic of implementing this function with a transaction
    previously.
    
    The only caller already wraps this function in a `try`/`catch` block
    that logs the error and re-raises.
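
As a rough illustration of the caller pattern mentioned above (catch, log, re-raise), here is a self-contained sketch. The module and function names and the `db_timeout` reason are hypothetical and are not taken from `rabbit_db_user`; the point is only that an exception raised with `erlang:error/1` keeps its class and stacktrace when re-raised with `erlang:raise/3`.

```erlang
-module(raise_sketch).
-export([failing_write/0, caller/0, demo/0]).

%% Hypothetical database function: raises with class `error`
%% instead of calling a transaction-only abort function.
failing_write() ->
    erlang:error({db_timeout, example}).

%% Catch any exception class, log it, then re-raise it with the
%% original class and stacktrace.
caller() ->
    try
        failing_write()
    catch
        Class:Reason:Stacktrace ->
            io:format("write failed: ~p:~p~n", [Class, Reason]),
            erlang:raise(Class, Reason, Stacktrace)
    end.

%% Shows which class the outer code observes after the re-raise.
demo() ->
    try caller() of
        _ -> unexpected
    catch
        error:{db_timeout, example} -> reraised_as_error;
        throw:_ -> reraised_as_throw
    end.
```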
    the-mikedavis committed May 13, 2024
    b8494be
  13. 6ece1bf
  14. e804034
  15. rabbit_db_exchange: Raise database errors in next_serial/1

    All callers assume that this operation will succeed.
    the-mikedavis committed May 13, 2024
    f0567ee
  16. e81b064
  17. rabbit_db_exchange: Raise Khepri errors instead of throwing in clear/0

    This function is only used by the test suites. A backtrace should make
    the thrown error clearer though.
    the-mikedavis committed May 13, 2024
    1a6489f
  18. rabbit_db_vhost: Declare no-return in create_or_get/3 spec

    Note that we don't refactor the `throw/1` to an `erlang:error/1` since
    it's caught by `rabbit_vhost:add/3`.
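
For readers unfamiliar with the idiom, a spec can declare that a function may not return by adding `no_return()` to its return type, while a caller still catches the `throw`. The sketch below is hypothetical: the module, the one-argument `create_or_get/1`, `add/1`, and the `invalid_name` reason are made up for illustration and do not mirror the real signatures of `rabbit_db_vhost` or `rabbit_vhost`.

```erlang
-module(no_return_sketch).
-export([create_or_get/1, add/1]).

%% The `no_return()` alternative documents that some inputs make this
%% function throw instead of returning a value.
-spec create_or_get(atom()) -> {new | existing, atom()} | no_return().
create_or_get(bad_name) ->
    throw({error, {invalid_name, bad_name}});
create_or_get(existing_vhost) ->
    {existing, existing_vhost};
create_or_get(Name) ->
    {new, Name}.

%% Hypothetical caller: it catches the throw, which is why the throw is
%% kept rather than converted to an `error` class exception.
-spec add(atom()) -> ok | {error, term()}.
add(Name) ->
    try create_or_get(Name) of
        {_, _} -> ok
    catch
        throw:{error, _} = Err -> Err
    end.
```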
    the-mikedavis committed May 13, 2024
    f069001
  19. d45cda6
  20. rabbit_db_vhost: Bubble up database errors in clear/0

    This function is only used by a test suite which matches on the 'ok'
    return.
    the-mikedavis committed May 13, 2024
    ba55fbb
  21. be6644e
  22. ca55031
  23. 6add459