Calling stop() can cause projections to lose stream positions #233

Open
fritz-gerneth opened this issue Sep 28, 2021 · 3 comments

@fritz-gerneth
Contributor

I have observed this issue for some time, but had a hard time reproducing it. In rare instances during deployments, projections could lose their stream positions and start again from the beginning.
This can happen when a pcntl stop handler invokes a projection's stop() method before the projection has loaded its stream positions.

The problematic section is in the run() method, before the actual main loop:

    // Initial state is ProjectionStatus::IDLE()
    // $this->streamPositions is an empty array

    if (! $this->projectionExists()) {
        $this->createProjection();
    }

    $this->acquireLock();

    if (! $this->readModel->isInitialized()) {
        $this->readModel->init();
    }

    $this->prepareStreamPositions();
    $this->load(); // << Only here is $this->streamPositions initialized

Calling stop() from the signal handler before load() has loaded the stream positions causes a call to persist(). The persist call saves the uninitialized stream positions (an empty array).
The longer the projection takes to load the stream positions, the more likely this becomes. Putting a sleep() call after the acquireLock() call in the example above makes it rather easy to reproduce. (We have replaced the locking mechanism with metadata locks on a 60 second timeout, so we are more likely to run into this issue.)
Our current workaround is to introduce a new flag, $this->streamPositionsLoaded, and return early from persist() if the positions have not been loaded yet.

    private function persist(): void
    {
        if (!$this->streamPositionsLoaded) {
            return;
        }

        $this->readModel->persist();
        // ...

We are running this in testing at the moment to see if this actually fixes the issue. Any other input / feedback / solutions are welcome though.
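To make the race and the guard easier to see in isolation, here is a minimal, self-contained sketch. It is not the library's projector: the class `SketchProjector`, the public `$storedPositions` array standing in for the persisted projection row, and the stream name `orders` are all hypothetical. It only shows that a stop() arriving before load() leaves the stored positions untouched once the flag is checked.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a projector; names are assumptions for illustration.
final class SketchProjector
{
    /** @var array<string, int> in-memory positions, empty until load() runs */
    private array $streamPositions = [];

    private bool $streamPositionsLoaded = false;

    /** @var array<string, int> stands in for the positions stored in the database */
    public array $storedPositions;

    /** @param array<string, int> $storedPositions */
    public function __construct(array $storedPositions)
    {
        $this->storedPositions = $storedPositions;
    }

    public function load(): void
    {
        $this->streamPositions = $this->storedPositions;
        $this->streamPositionsLoaded = true;
    }

    // What the signal handler would call.
    public function stop(): void
    {
        $this->persist();
    }

    private function persist(): void
    {
        // Guard: never overwrite stored positions with the uninitialized array.
        if (! $this->streamPositionsLoaded) {
            return;
        }

        $this->storedPositions = $this->streamPositions;
    }
}

// Simulate the signal arriving before load() has run.
$projector = new SketchProjector(['orders' => 42]);
$projector->stop();
```

Without the early return, the same sequence would replace the stored positions with an empty array, which is the behaviour described above.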

@prolic
Member

prolic commented Sep 28, 2021

Sounds like a valid solution. Can you submit a pull request please?

@fritz-gerneth
Contributor Author

fritz-gerneth commented Sep 28, 2021 via email

@fritz-gerneth
Contributor Author

On further thought, I think it is also possible to delete/reset/... the projection before having acquired the lock for it. This can happen if the projection holding the lock (1) is sleeping, the status is set to RESETTING, and another instance of the projection (2) is started. Before projection 2 acquires the lock, it resets the projection and sets the status to IDLE. When projection 1 wakes up again, it checks the status and continues to run without knowing it has been reset.

Projection 1          CLI            Projection 2
  RUNNING
     |
   (sleep)           RESET
     |                                Start Projection
     |                                   |
     |                                Check remote status
     |                                   |
     |                                 Reset, set remote status to IDLE
     |                                   |
     |                              Try to acquire lock (and fail)
     |                                   |
    (wakeup)
     |
 Check remote status (IDLE)
     | 
  Continue to run with previous 
   stream positions

The problematic section is this block here, where we perform write operations before we have acquired a lock.
I suggest removing this block without replacement and moving the same logic to the top of the inner loop, so that resets/deletes requested after projection start are handled before we start applying events.
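The ordering proposed above could be sketched roughly as follows. This is only an illustration under assumptions: `LoopSketch`, its string statuses, and the `$log` array are hypothetical and do not mirror the library's actual run() method. The point is that the reset is only ever performed by the instance that already holds the lock.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch of the proposed ordering; not the library's code.
final class LoopSketch
{
    /** @var string[] order in which the steps ran */
    public array $log = [];

    private string $remoteStatus;

    private bool $lockHeld = false;

    public function __construct(string $remoteStatus)
    {
        $this->remoteStatus = $remoteStatus;
    }

    public function run(int $iterations): void
    {
        // No status handling (and so no writes) before the lock is held.
        $this->acquireLock();

        for ($i = 0; $i < $iterations; $i++) {
            // Top of the loop: handle a pending reset before applying events.
            if ($this->remoteStatus === 'resetting') {
                $this->reset();
            }

            $this->log[] = 'apply events';
        }
    }

    private function acquireLock(): void
    {
        $this->lockHeld = true;
        $this->log[] = 'acquire lock';
    }

    private function reset(): void
    {
        if (! $this->lockHeld) {
            throw new LogicException('Reset attempted without holding the lock');
        }

        $this->log[] = 'reset';
        $this->remoteStatus = 'idle';
    }
}

$sketch = new LoopSketch('resetting');
$sketch->run(1);
```

With this ordering, a second instance that fails to acquire the lock never gets far enough to reset the projection or change the remote status.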
