
winnow + vip enhancement #913

Open

reidsunderland opened this issue Jan 31, 2024 · 4 comments
Labels

enhancement (New feature or request) · Refactor (change implementation of existing functionality) · wishlist (would be nice, not pressing for any particular client)

Comments

@reidsunderland
Member

reidsunderland commented Jan 31, 2024

When running a winnow on a cluster, you have to use a VIP to ensure that duplicates get suppressed.

If you don't use a VIP, each node will have a different nodupe cache, and duplicates could get through if the first fileA message goes to node1 and the second (duplicate) fileA message goes to node2.

Currently on v2 and sr3, the node with the VIP subscribes and posts messages and the other nodes do nothing. If the VIP switches to a different node, the nodupe cache on that node will be empty and there's a chance duplicate messages could be posted.

An improvement would be to make this work more like poll does now.

Each node would need to use its own unique queue subscribed to the source exchange.

Node with the VIP:

  • Subscribes to the source exchange
  • Populates nodupe cache
  • Posts non-duplicate messages to the post_exchange

Nodes without the VIP:

  • Subscribe to the source exchange
  • Populate the nodupe cache
  • Do not post anything to the post_exchange (see the sketch below)
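
A rough sketch of the gating side, in sr3 flowcb style: every node runs the winnow normally (so the nodupe callback primes its cache), and a callback discards everything on nodes that don't hold the vip. `node_has_vip` is a hypothetical helper (sr3 tracks vip ownership internally); it is approximated here with a socket bind test:

```python
# Sketch only: let every node run nodupe normally, but discard all accepted
# messages on nodes that do not hold the vip, so the cache is primed
# everywhere while only the vip holder posts.
import socket

from sarracenia.flowcb import FlowCB


def node_has_vip(vip: str) -> bool:
    # Crude stand-in for sr3's internal vip detection: binding a UDP socket
    # to the address only succeeds on the host that currently owns it.
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((vip, 0))
        s.close()
        return True
    except OSError:
        return False


class VipGate(FlowCB):
    def after_accept(self, worklist):
        # Must run after the built-in nodupe callback, so the cache is
        # already primed before anything is discarded.
        vip = getattr(self.o, 'vip', None)   # assumes the vip option is on self.o
        if isinstance(vip, list):
            vip = vip[0] if vip else None
        if vip and not node_has_vip(vip):
            worklist.rejected.extend(worklist.incoming)
            worklist.incoming = []
```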
@reidsunderland reidsunderland added enhancement New feature or request wishlist would be nice, not pressing for any particular client. Refactor change implementation of existing functionality. labels Jan 31, 2024
@petersilva
Contributor

This is the same as v2 behaviour... not a regression, but an opportunity for improvement.

@petersilva
Contributor

The current method sends duplicates when the vip changes owners. The idea/goal is to fix that:

  • minimize the number of duplicate notifications sent (ideally 0)

  • minimize the number of unique notifications not sent (ideally 0)

The vip voting scheme can produce significant periods (minutes) where the vip is either in the wrong place, or nowhere, while a transfer is in progress.

Method 1: separate queues, vip gates posting.

  • Same as @reidsunderland's post above (see the sketch there).
  • pick a different queue for every participating instance in a vip winnow.
  • have each instance consume from its queue all the time, but lacking the vip prevents it from posting.

SWOT:

  • S: simple; works exactly like a normal subscriber; easy to understand; the vip only controls publishing.
  • W: when things are bad, you may lose messages that the healthy node consumed (and discarded) before the sick node loses the vip and hands it over to the healthy node.

Method 2: poll style.

  • use a common queue for all members, as today... but have a second queue bound to the output exchange (like the poll does.)
  • when you don't have the vip, consume from the output of the winnow... so you know what was posted, and populate the duplicate suppression cache.
  • when you have the vip, consume from the input queue instead of the output queue (sketched below.)

SWOT:

  • S: while vip is in motion, the unpublished data stays in a shared upstream queue.
  • W: if the node with the vip consumes messages but doesn't publish them, they are still gone (as will happen when it is dying.) Should be far fewer than in the previous case?
  • W: convoluted flow... a queue you only consume from when you have the vip? a separate queue you only consume from when you don't have the vip?
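
A sketch of the queue-switching logic for this method; all of the helpers (`have_vip`, the queues, the cache, `publish`) are hypothetical placeholders, not sarracenia API:

```python
# Illustrative control loop for Method 2; every helper passed in is a
# hypothetical placeholder, not sarracenia API.
def winnow_loop(have_vip, input_queue, output_queue, nodupe_cache, publish):
    while True:
        if have_vip():
            # vip holder: winnow the shared input queue and post survivors.
            msg = input_queue.get()           # returns a message dict or None
            if msg is not None and msg['id'] not in nodupe_cache:
                nodupe_cache.add(msg['id'])   # 'id' stands in for the real dedupe key
                publish(msg)
        else:
            # standby: follow the output exchange so the local cache mirrors
            # exactly what the vip holder has already posted.
            msg = output_queue.get()
            if msg is not None:
                nodupe_cache.add(msg['id'])
```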

observations...

  • As soon as you have a vip in the story... it is not clear whether failover can be perfect: you either send a few duplicates, or drop some messages. For our use case, I think duplicates are less of a problem than dropping. (This follows from the CAP theorem, and voting algorithms.)

@petersilva
Contributor

petersilva commented Feb 2, 2024

Method 3: one queue with two bindings... then use the exchange to differentiate input from output.

  • only posts from the output exchange are used to populate duplicate suppression cache.
    hmm...
  • read from the queue with inputs and outputs all the time. (Exchange field tells them apart.)
  • have a matching queue (within a plugin), where stuff read from the input exchange is held until you see a corresponding output exchange result (if it makes it past the duplicate suppression cache.)

This would address the weakness of both other methods: if a node is slowly dying, the failover node will queue up the stuff it hasn't seen confirmed on the output exchange, and when it gets the vip, it will catch up.

This works if the input and output exchanges are on the same broker, so that a single queue can have bindings to both.
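
A sketch of the dispatch this implies, with hypothetical names throughout (the exchange names, the `msg['id']` dedupe key, and the helpers are placeholders, not sarracenia API):

```python
# Illustrative dispatch for Method 3: one queue bound to both exchanges,
# with the message's exchange field telling inputs and outputs apart.
def on_message(msg, have_vip, nodupe_cache, pending, publish,
               input_exchange='xinput', output_exchange='xwinnow'):
    key = msg['id']  # stand-in for the real dedupe key (e.g. path + checksum)
    if msg['exchange'] == output_exchange:
        nodupe_cache.add(key)       # only output posts prime the cache
        pending.pop(key, None)      # matching input confirmed as posted
    elif msg['exchange'] == input_exchange:
        if have_vip():
            if key not in nodupe_cache:
                publish(msg)        # the output echo will add key to the cache
        else:
            pending[key] = msg      # hold until seen on the output exchange


def on_vip_gained(pending, nodupe_cache, publish):
    # catch up: replay held inputs that were never confirmed on the output side.
    for key, msg in list(pending.items()):
        if key not in nodupe_cache:
            publish(msg)
        del pending[key]
```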

@petersilva
Contributor

Method 4: nodupe.sync class...
gather() implements a second subscriber, with settings:

It is installed with callback_prepend. It has two entry points: gather and after_accept.

Gather is a normal gather (like gather/message), but for every message gathered, you add a field: m["from_nodupe_sync_cache"] = True.

Then have an after_accept entry point that drops all messages with that field in it, so the cache is primed.

This seems really easy to do... and kind of a general way to explore shared state caches.
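
A rough shape of such a class, assuming sr3 flowcb conventions (the gather signature has varied across sr3 versions, the wiring of the second subscriber is elided, and `newMessages()` is assumed from the Moth consumer interface):

```python
# Sketch of a nodupe sync callback; the second-subscriber setup is elided.
import logging

from sarracenia.flowcb import FlowCB

logger = logging.getLogger(__name__)


class Sync(FlowCB):
    """
    Gathers from a second subscription (bound to what the other winnow
    nodes post), tags every message it gathers, and drops the tagged
    messages once the nodupe cache has seen them.
    """

    def __init__(self, options):
        super().__init__(options)
        self.consumer = None  # second subscriber, e.g. built via sarracenia.moth

    def gather(self):
        # a normal gather, like gather/message, but every message is tagged.
        messages = self.consumer.newMessages() if self.consumer else []
        for m in messages:
            m['from_nodupe_sync_cache'] = True
        return messages

    def after_accept(self, worklist):
        # discard the tagged messages so they are never posted; they exist
        # only to prime the duplicate suppression cache.
        still_incoming = []
        for m in worklist.incoming:
            if m.get('from_nodupe_sync_cache', False):
                worklist.rejected.append(m)
            else:
                still_incoming.append(m)
        worklist.incoming = still_incoming
```

One ordering caveat: the built-in nodupe callback's after_accept has to process the tagged messages before this one discards them, or the cache never gets primed.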
