Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TASK: ChangeProjection should clean up orphaned change records #4998

Open
5 tasks
grebaldi opened this issue Apr 18, 2024 · 1 comment
Open
5 tasks

TASK: ChangeProjection should clean up orphaned change records #4998

grebaldi opened this issue Apr 18, 2024 · 1 comment
Labels

Comments

@grebaldi
Copy link
Contributor

Problem description

The ChangeProjection in Neos.Neos currently misses a lot of cases in which it should clean up after itself. This leads to many orphaned cr_*_p_neos_change records that do not serve any purpose after the workspace they're associated with has been published or discarded.

To reproduce this problem, @bwaidelich has offered a SQL query for counting changes grouped by contentstream state via cr_*_p_neos_change (I'll assume cr_default_p_neos_change from hereon out):

SELECT
  COUNT(*), cs.state
FROM
  cr_default_p_neos_change c
  JOIN cr_default_p_contentstream cs ON cs.contentStreamId = c.contentStreamId
GROUP BY cs.state;

Note

Any Neos 9 instance that is in use will already have orphaned change records. To start fresh, you will need to make sure that every workspace is up-to-date and no workspace contains any changes.

Then it'll be safe to run DELETE FROM cr_default_p_neos_change on your database, you start at a change count of 0. NEVER do this in a mission-critical environment. This is only safe in a development environment that tolerates data loss.

Now, here are the results of @bwaidelich's query for several subsequent operations (all issued by the same user):

  1. Creating a document node
+----------+---------------------+
| COUNT(*) | state               |
+----------+---------------------+
|        2 | IN_USE_BY_WORKSPACE |
+----------+---------------------+

We see 2 changes, one for the document node and one for its content collection. (This may vary obviously depending on the node type)

  1. Creating a content node in the content collection of that document node
+----------+---------------------+
| COUNT(*) | state               |
+----------+---------------------+
|        3 | IN_USE_BY_WORKSPACE |
+----------+---------------------+

We see 1 additional change for the creation of the content node. (Again, this may vary depending on the node type of the content node)

  1. Changing a property of that content node
+----------+---------------------+
| COUNT(*) | state               |
+----------+---------------------+
|        3 | IN_USE_BY_WORKSPACE |
+----------+---------------------+

Nothing has changed. The change record for the content node has been flagged with changed - that's all.

4a. Discarding all changes via the UI

+----------+------------------+
| COUNT(*) | state            |
+----------+------------------+
|        6 | NO_LONGER_IN_USE |
+----------+------------------+

Here it gets interesting: The change count has doubled, but all changes are now associated with content streams that are NO_LONGER_IN_USE.

4b. Discarding all changes via ./flow workspace:discard [workspace-name]

+----------+------------------+
| COUNT(*) | state            |
+----------+------------------+
|        3 | NO_LONGER_IN_USE |
+----------+------------------+

Something different happens here: Again all changes are associated with a content stream that is NO_LONGER_IN_USE. But the count remained at 3. This is because the two publishing mechanism use different commands under the hood. DiscardIndividualNodesFromWorkspace (which happens in 4a) creates an additional intermediate contentstream, which leads to the doubling of the change count.

4c. Publishing all changes via the UI

+----------+------------------+
| COUNT(*) | state            |
+----------+------------------+
|        3 | FORKED           |
|        6 | NO_LONGER_IN_USE |
+----------+------------------+

The same phenomenon as in 4a, but there are also 3 changes in a content stream that is FORKED. I'm at a loss as to why that happens, but I suppose it's a detail in the handling of PublishIndividualNodesFromWorkspace.

4d. Publishing all changes via ./flow workspace:publish [workspace-name]

+----------+------------------+
| COUNT(*) | state            |
+----------+------------------+
|        3 | NO_LONGER_IN_USE |
+----------+------------------+

Same as in 4b.

Expected behavior

  • When a user publishes individual changes, those changes are deleted from cr_*_p_neos_change
  • When a user publishes their workspace, all associated changes are deleted from cr_*_p_neos_change
  • When a user discards individual changes, those changes are deleted from cr_*_p_neos_change
  • When a user discards their workspace, all associated changes are deleted from cr_*_p_neos_change
  • When a user creates a node and then deletes that node without publishing in between, the associated change is deleted from cr_*_p_neos_change (see also: BUG: Publishing individual nodes is impossible when contents were created on a deleted document #4997)

Notes on solution

It will be easier to achieve the desired behavior, when the ChangeProjection is better able to track changes per Workspace rather than per ContentSteam.

This will require the changes proposed in #4996.

@grebaldi grebaldi added the 9.0 label Apr 18, 2024
@grebaldi
Copy link
Contributor Author

This is probably a duplicate of #4813

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: No status
Development

No branches or pull requests

1 participant