Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RubyEventStore::Projection does not yield events in deterministic order when using multiple streams #1550

Open
Bertg opened this issue Jan 19, 2023 · 2 comments
Assignees

Comments

@Bertg
Copy link

Bertg commented Jan 19, 2023

Given I have some events in my database:

{id: 1, data: {...}, timestamp: "2023-01-19Z01:01:01"} on stream "A"
{id: 2, data: {...}, timestamp: "2023-01-19Z01:01:02"} on stream "B"
{id: 3, data: {...}, timestamp: "2023-01-19Z01:01:03"} on stream "A"

When I run a projection like so: RubyEventStore::Projection.from_stream(%w(A B)) the events will be returned in order: 1,3,2.
However, if I switch the order of the streams: RubyEventStore::Projection.from_stream(%w(B A)) the events will be returned in order: 2,1,3.

My expectation is that the order of the events should be in the order of the data, regardless of streams. In other words, the order of the stream names in the array should not affect the order in which the events are returned.

Example test in test is added here: Bertg@bcc2cab, and included as example here:

specify "reduce events from many streams in order" do
  event_store.with_metadata(timestamp: Time.utc(2018, 1, 1, 1, 1, 1)) {
    event_store.append(MoneyDeposited.new(data: { order: 1 }), stream_name: "Customer$1")
  }
  event_store.with_metadata(timestamp: Time.utc(2018, 1, 1, 1, 1, 2)) {
    event_store.append(MoneyDeposited.new(data: { order: 2 }), stream_name: "Customer$1")
  }
  event_store.with_metadata(timestamp: Time.utc(2018, 1, 1, 1, 1, 3)) {
    event_store.append(MoneyDeposited.new(data: { order: 3 }), stream_name: "Customer$2")
  }

  account_balance =
    Projection
      .from_stream(%w[Customer$2 Customer$1])
      .init(-> { { order: nil } })
      .when(MoneyDeposited, ->(state, event) {
        puts event.timestamp
        state[:order] = event.data[:order]
      })
      .run(event_store)
  expect(account_balance).to eq(order: 3)
end

This fails with:

  1) RubyEventStore::Projection reduce events from many streams in order
     Failure/Error:

       expected: {:order=>3}
            got: {:order=>2}
@mostlyobvious
Copy link
Member

Hi Bert,

thank you for bringing this topic. The RubyEventStore::Projection.from_stream(%w[]) API is quite problematic at the moment.

In RES there's global stream (ordered by global id) and there are named streams, which can have their own ordering of events — not necessarily the same as global id ordering. Then there are timestamps (created_at, valid_at) that can influence ordering.

Situation is clear when we're reading (in Projection) from a named stream, that is ordered by position. When reading from multiple streams, ordering by position in stream is impossible to enforce (as its scope is within a single stream). So an unexpected change of ordering would have to happen (i.e. ordering by global ids). I'm not saying that current outcome is by any means expected too. We didn't think through such edge cases when designing it.

At the moment it is also not possible to have ordering by timestamps in Projection. Such ordering was introduced in RES Read API but Projection API is currently not using it in such extent. We're planning to reshape how Projection would behave in next major version — here's our current idea, that is taking shape: #1463 + #1552.
In particular we plan to remove the possibility to read from multiple streams.

If you need to converge multiple existing streams into Projection, your best option now is to link events from those streams into the new stream used by Projection. You'll be able to decide the ordering you wish to have for your use case:

events = 
  %w[A B]
    .map { |stream_name| event_store.read.stream(stream_name).to_a }
    .reduce(&:concat)
    .sort_by { |event| event.timestamp }
    .each { |event| event_store.link(event.event_id, stream_name: "my-projection") }

Here's how one can build such per-projection-streams upfront: https://blog.arkency.com/how-to-build-a-read-model-with-rails-event-store-projection/

By the way, I'm curious about your use case. Could you tell us a bit more about what you are modeling with Projection from multiple streams?

@Bertg
Copy link
Author

Bertg commented Jan 20, 2023

By the way, I'm curious about your use case. Could you tell us a bit more about what you are modeling with Projection from multiple streams?

Ah, that's the embarrassing part. We are writing events to a stream name, that contains a typo (simplified reason here). In order to migrate while the application is running we wanted the relevant projections to be able to read from both the "wrong" and "correct" stream name at the same time. This because, for a short time, both of these may be in use, while an other script "corrects" the stream names in the database.

The other approach I was considering:

  • Modify code to start writing to both "wrong" and "correct" streams
  • Start process to link events to correct stream name
  • Start projecting from "correct" stream name
  • Clean up code, and stream names

If you have a better strategy to rename a stream name, I'd love to hear it :)


The code I provided actually works really well. I've tried it in a fresh rails+psql project, and it works great there as well.
The issue I'm seeing happens only in our product, and I'm not sure what is causing it. It doesn't seem related to RES itself.


In particular we plan to remove the possibility to read from multiple streams.

Have you polled your user base to see if this feature is commonly used? I can imagine that for "one off" analysis these are quite useful. Having to first create a new stream, and then query that stream seems a bit overkill.

Is there a particular reason why RES doesn't support something like event_store.read.streams([stream_name_a, stream_name_b]). There should be a possibility of a sensible default here, no?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants