Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4790 Followup, rethink the optimisations #5011

Open
3 tasks
mhsdesign opened this issue Apr 24, 2024 · 6 comments
Open
3 tasks

4790 Followup, rethink the optimisations #5011

mhsdesign opened this issue Apr 24, 2024 · 6 comments
Labels

Comments

@mhsdesign
Copy link
Member

In #4790 some optimisations were done.

A part of the change caused a critical bug in the beta8 and slowed down performance enormous while replaying ContentStreamWasForked events.

This part will be reverted via #5009 as a hotfix.

But some questions are still open:

Cannot update node with copy on write since no anchor point could be resolved for node cd70f561-2952-40c8-9775-8250db07d66b in content stream d79cacd3-ee68-421a-bbca-f5e56a0348ce

  • if the primary key as proposed by christian is apparently wrong, what else is it if not a compound key of ['childnodeanchor', 'contentstreamid', 'dimensionspacepointhash', 'parentnodeanchor']
@kitsunet
Copy link
Member

kitsunet commented Apr 24, 2024

IMHO replay is not even a valid performance test case, because it's ultimately uninteresting if that takes a bit longer or not. I certainly didn't really test for that. It's more important that the projections are compact and read queries are fast (and workspace creation obvs)

if the primary key as proposed by christian is apparently wrong

Or do we have an error in the projection that just didn't manifest so far because we didn't have this key? Because I really do not know how this should be able to exist twice.

@kitsunet
Copy link
Member

To add some discussion points, as now I got started thinking about all of this, supposedly this is the query to get the parent to a specific child (identified by contentstream, dimensions and aggregate id)

        return $this->createQueryBuilder()
            ->select('pn.*, ch.name, ch.subtreetags')
            ->from($this->contentGraphTableNames->node(), 'pn')
            ->innerJoin('pn', $this->contentGraphTableNames->hierachyRelation(), 'ph', 'ph.parentnodeanchor = pn.relationanchorpoint')
            ->innerJoin('pn', $this->contentGraphTableNames->node(), 'cn', 'cn.relationanchorpoint = ph.childnodeanchor')
            ->innerJoin('pn', $this->contentGraphTableNames->hierachyRelation(), 'ch', 'ch.childnodeanchor = pn.relationanchorpoint')
            ->where('cn.nodeaggregateid = :childNodeAggregateId')->setParameter('childNodeAggregateId', $childNodeAggregateId->value)
            ->andWhere('ph.contentstreamid = :contentStreamId')->setParameter('contentStreamId', $contentStreamId->value)
            ->andWhere('ch.contentstreamid = :contentStreamId')
            ->andWhere('ph.dimensionspacepointhash = :dimensionSpacePointHash')->setParameter('dimensionSpacePointHash', $dimensionSpacePoint->hash)
            ->andWhere('ch.dimensionspacepointhash = :dimensionSpacePointHash');

And now I wonder, why do we join hierachy twice here? It should be the same edge just looked at from different sides, it cannot be two separate edges between parent and child, or do I miss something big here?

@kitsunet
Copy link
Member

kitsunet commented Apr 25, 2024

Another thing, I didn't see this before but this exists: \Neos\ContentGraph\DoctrineDbalAdapter\Domain\Projection\HierarchyRelation::getDatabaseId and uhhh, this suggests it shoudl be unique as well as it is used in place of an identifier within the class for update queries....?

@kitsunet
Copy link
Member

aaand this also assumes the combination is unique: \Neos\ContentGraph\DoctrineDbalAdapter\Domain\Repository\ProjectionContentGraph::findIngoingHierarchyRelationsForNode

@kitsunet
Copy link
Member

I think it would be helpful to figure out what query exactly fails with the primary key applied (or stop replay right after the problematic version without the key) so we can see what hierachy is created, I more an dmore suspect we might actually create invalid data there?

@nezaniel
Copy link
Member

nezaniel commented Apr 25, 2024

I'm a bit concerned as well that the replay didn't work with the imho proper index. To me that looks rather like a broken event stream than an issue with the projection. I'd really like to see the hierarchy relations to the failing node aggregate and how they came to be.

The hierarchy relation double-check was introduced years ago at a time when the graph projection was maybe less mature than now, we could have a look at that. Especially with all the new test cases implemented since then.

As for performance, I agree with @kitsunet; we optimize for read performance here. I'd be up for serious performance testing in the postgres adapter later on, maybe we then can backport some of our insights to the general dbal adapter in a later version.

@ahaeslich ahaeslich added the 9.0 label May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: No status
Development

No branches or pull requests

4 participants