New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix memory spike on repository delete #5049
base: main
Are you sure you want to change the base?
Conversation
pulpcore/app/tasks/repository.py
Outdated
@@ -236,3 +236,24 @@ def add_and_remove(repository_pk, add_content_units, remove_content_units, base_ | |||
with repository.new_version(base_version=base_version) as new_version: | |||
new_version.remove_content(models.Content.objects.filter(pk__in=remove_content_units)) | |||
new_version.add_content(models.Content.objects.filter(pk__in=add_content_units)) | |||
|
|||
|
|||
def custom_cascade_delete_task(instance_id, app_label, serializer_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a full review, just one thing i stumbled over:
This is a user visible task name. Should we call it generic_cascade_delete
instead? And also maybe move it to pulpcore.app.tasks
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's in pulpcore/app/tasks/
, do you mean a different file?
I see your point about it being user visible but there might be a bit of conflict between making it generic (for the user's benefit) and making it specific (for the developer's benefit). Do we actually want to use this for everything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this is more "generic" than "custom". And I think i like this method better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So long as we have to have line 259 and 270 I think "generic" is the wrong word to use. If we can find some way to move that out, then sure.
Until then I should probably take the name custom_repository_cascade_delete_task
from the first approach
I'm going to do some testing to see if the same effect can occur at a significant scale w/ deletion of a single publication or repository version, though, and if it does we probably should try to find some way to make it fully generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could still be just 'pulpcore.app.tasks.repository_delete'.
ee80486
to
40b3f1d
Compare
LOG.info( | ||
f"Level {level} Delete Cascade for {base_model.__name__}: " | ||
"Checking relations for {from_model.__name__}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably convert these to DEBUG prior to merging, but for review purposes it's nice to see exactly what's going on. I can switch them prior to merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to debug
OK, it seems I finally got this working. It required some tweaks beyond the original code, which did not it seems handle some edge cases that we trigger. On the bright side, not only does this reduce the memory spike to a tiny blip, but it also seems to make the delete complete much faster. |
229becd
to
2dececf
Compare
There's one catch, which is why the CI is failing If you look at the actual implementation of the https://github.com/pulp/pulpcore/blob/main/pulpcore/app/models/repository.py#L373 The problem is that the way this implementation works specifically deletes all of those child objects before we get around to triggering the delete on the parent object and therefore the invalidation hook. So it finds nothing and doesn't invalidate the cache. That can be fixed, but it will probably require a carveout for this specific edge case. If we're going down that road anyway, is it worth even going down this road in the first place, as opposed to just writing a manual cascade specific to I have mixed feelings. Of course it would be nice to be able to reuse this anywhere but I'm not personally sold on the idea that this code itself doesn't have a mistake or won't eventually break against some new Django version. A manually written cascade would be much simpler and more straightforwards. |
The reason I don't like the "write manual code" is that the list of things that need to be deleted, is under plugin control - so code that works "today", might fail to delete things tomorrow. Regarding cache-invalidation - shouldn't "invalidate the cache for Distribution X" be handled by the Distribution? Why does the repo-delete do it, instead of Distribution-delete? |
Because how would a distribution know if the repository it is serving was just updated/deleted? Deleting a repository doesn't delete the distribution serving it. Deleting a repository does delete the publication serving it. |
Well in this case the way I'd write it (as in this PR actually) is that we'd still be using the Django delete on the top level object, we'd just delete various bits and pieces first so that the cascade won't need to pick them up. So you can't fail to delete anything, only fail to achieve quite as much efficiency in deletion as desired if you miss anything significant.
Well, that happens too https://github.com/pulp/pulpcore/blob/main/pulpcore/app/models/publication.py#L703-L720 But to answer your question, the |
OK, I see now - repo-delete, does delete the Publications, but does not delete the Distributions because they're set-null on the Repository-FK. Looking at the distribution-code, I'd still expect the cache-invalidation to have happened as a result of https://github.com/pulp/pulpcore/blob/main/pulpcore/app/models/publication.py#L711 though. Note: Distributions are allowed to exist without-a-repo as a result of an actual user request/requirement - "I want to create all my Distributions, and have them returning 404s, before the repos they are meant to serve are sync'd/published". Odd, but it's this way for a reason. |
Apparently not given that the CI fails w/o a cache invalidation that's supposed to happen. Nice test @gerrod3 btw |
Since we're updating distribution.repo to None, directly in SQL, I "assume" that's why the distribution's update-hook isn't being called and doing the invalidaiton. Bah. |
a985d6c
to
98ed1b4
Compare
Django handles cascade deletes itself (as opposed to letting the database handle it) and in doing so it pulls a lot of data out, resulting in a memory spike on repository deletion. The only solution is to write a custom cascade delete that deletes the objects at the bottom of the cascade tree more incrementally. closes pulp#5048
|
||
if not model_relation.on_delete: | ||
pass | ||
elif model_relation.on_delete.__name__ == "SET_NULL": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it help to look for PROTECTED relations first and bail out fast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should look for PROTECTED relations, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also RESTRICT. As it stands, we do clean those up to, but I think only because it happens to be at the very top level, and because of line 152
pulp [None]: pulpcore.tasking._util:INFO: Writing task memory data to /var/tmp/pulp/018e0b90-d4ea-7d36-9a21-d4ca99fa1f68/memory.datum
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.tasking.tasks:INFO: Starting task 018e0b90-d4ea-7d36-9a21-d4ca99fa1f68
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 0 Delete Cascade for RpmRepository: Checking relations for RpmRepository
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.acs.AlternateContentSourcePath'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of AlternateContentSourcePath
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 1 Delete Cascade for RpmRepository: Checking relations for AlternateContentSourcePath
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 1: delete records from AlternateContentSourcePath
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from AlternateContentSourcePath
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryContent'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of RepositoryContent
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 1 Delete Cascade for RpmRepository: Checking relations for RepositoryContent
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 1: delete records from RepositoryContent
[pulp] | ('pulp [be07879d2708488ea6cde1ea189f18aa]: ::ffff:127.0.0.1 - admin [04/Mar/2024:22:23:45 +0000] "GET /pulp/api/v3/tasks/018e0b90-d4ea-7d36-9a21-d4ca99fa1f68/ HTTP/1.0" 200 799 "-" "Pulp-CLI/0.23.2"',)
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 167812 records from RepositoryContent
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryVersion'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of RepositoryVersion
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 1 Delete Cascade for RpmRepository: Checking relations for RepositoryVersion
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryContent'>: on_delete=<function RESTRICT at 0x7f648a33f160>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryContent'>: on_delete=<function RESTRICT at 0x7f648a33f160>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryVersion'>: on_delete=<function SET_NULL at 0x7f648a33f280>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Executing SET NULL constraint action on RepositoryVersion relation of RepositoryVersion
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Updated 0 records in RepositoryVersion
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryVersionContentDetails'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of RepositoryVersionContentDetails
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 2 Delete Cascade for RpmRepository: Checking relations for RepositoryVersionContentDetails
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 2: delete records from RepositoryVersionContentDetails
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 67 records from RepositoryVersionContentDetails
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.Publication'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of Publication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 2 Delete Cascade for RpmRepository: Checking relations for Publication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 3 Delete Cascade for RpmRepository: Checking relations for PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 3: delete records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 452480 records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedMetadata'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 3 Delete Cascade for RpmRepository: Checking relations for PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.content.ContentArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4 Delete Cascade for RpmRepository: Checking relations for ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.content.RemoteArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 5 Delete Cascade for RpmRepository: Checking relations for RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 5: delete records from RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 5 Delete Cascade for RpmRepository: Checking relations for PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 5: delete records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4: delete records from ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 27 records from ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.Repository'>: on_delete=None
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryContent'>: on_delete=<function PROTECT at 0x7f648a33f0d0>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 3: delete records from PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 27 records from PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.Distribution'>: on_delete=<function SET_NULL at 0x7f648a33f280>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Executing SET NULL constraint action on Distribution relation of Publication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Updated 0 records in Distribution
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulp_rpm.app.models.repository.RpmPublication'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of RpmPublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 3 Delete Cascade for RpmRepository: Checking relations for RpmPublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4 Delete Cascade for RpmRepository: Checking relations for PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4: delete records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedMetadata'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4 Delete Cascade for RpmRepository: Checking relations for PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.content.ContentArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 5 Delete Cascade for RpmRepository: Checking relations for ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.content.RemoteArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 6 Delete Cascade for RpmRepository: Checking relations for RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 6: delete records from RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 6 Delete Cascade for RpmRepository: Checking relations for PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 6: delete records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 5: delete records from ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.Repository'>: on_delete=None
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryContent'>: on_delete=<function PROTECT at 0x7f648a33f0d0>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4: delete records from PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.Distribution'>: on_delete=<function SET_NULL at 0x7f648a33f280>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Executing SET NULL constraint action on Distribution relation of RpmPublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Updated 0 records in Distribution
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 3: delete records from RpmPublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 4 records from RpmPublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulp_file.app.models.FilePublication'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of FilePublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 3 Delete Cascade for RpmRepository: Checking relations for FilePublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4 Delete Cascade for RpmRepository: Checking relations for PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4: delete records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedMetadata'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4 Delete Cascade for RpmRepository: Checking relations for PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.content.ContentArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 5 Delete Cascade for RpmRepository: Checking relations for ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.content.RemoteArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 6 Delete Cascade for RpmRepository: Checking relations for RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 6: delete records from RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from RemoteArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.PublishedArtifact'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 6 Delete Cascade for RpmRepository: Checking relations for PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 6: delete records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from PublishedArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 5: delete records from ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from ContentArtifact
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.Repository'>: on_delete=None
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.repository.RepositoryContent'>: on_delete=<function PROTECT at 0x7f648a33f0d0>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 4: delete records from PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from PublishedMetadata
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.Distribution'>: on_delete=<function SET_NULL at 0x7f648a33f280>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Executing SET NULL constraint action on Distribution relation of FilePublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Updated 0 records in Distribution
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 3: delete records from FilePublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from FilePublication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 2: delete records from Publication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 4 records from Publication
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.Distribution'>: on_delete=<function SET_NULL at 0x7f648a33f280>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Executing SET NULL constraint action on Distribution relation of RepositoryVersion
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Updated 0 records in Distribution
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 1: delete records from RepositoryVersion
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 5 records from RepositoryVersion
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.exporter.PulpExporter'>: on_delete=None
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.importer.PulpImporterRepository'>: on_delete=<function CASCADE at 0x7f648a3c3c10>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Cascading delete to relations of PulpImporterRepository
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 1 Delete Cascade for RpmRepository: Checking relations for PulpImporterRepository
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 1: delete records from PulpImporterRepository
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted 0 records from PulpImporterRepository
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulpcore.app.models.publication.Distribution'>: on_delete=<function SET_NULL at 0x7f648a33f280>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Executing SET NULL constraint action on Distribution relation of RpmRepository
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Updated 1 records in Distribution
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulp_rpm.app.models.distribution.Addon'>: on_delete=<function RESTRICT at 0x7f648a33f160>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Relation from <class 'pulp_rpm.app.models.repository.RpmRepository'> to <class 'pulp_rpm.app.models.distribution.Variant'>: on_delete=<function RESTRICT at 0x7f648a33f160>
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Level 0: delete records from RpmRepository
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.app.django_util:INFO: Deleted (3, {'rpm.RpmRepository': 1, 'core.Repository': 1, 'core.UserRole': 1})
pulp [be07879d2708488ea6cde1ea189f18aa]: pulpcore.tasking.tasks:INFO: Task completed 018e0b90-d4ea-7d36-9a21-d4ca99fa1f68
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking was that we could easily and fast determine that we would not delete a single thing before even starting the rest of the process. But maybe it's not worth the trouble.
pulpcore/app/tasks/repository.py
Outdated
@@ -236,3 +236,24 @@ def add_and_remove(repository_pk, add_content_units, remove_content_units, base_ | |||
with repository.new_version(base_version=base_version) as new_version: | |||
new_version.remove_content(models.Content.objects.filter(pk__in=remove_content_units)) | |||
new_version.add_content(models.Content.objects.filter(pk__in=add_content_units)) | |||
|
|||
|
|||
def custom_cascade_delete_task(instance_id, app_label, serializer_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this is more "generic" than "custom". And I think i like this method better.
LOG.info( | ||
f"Level {level} Delete Cascade for {base_model.__name__}: " | ||
"Checking relations for {from_model.__name__}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to debug
Django handles cascade deletes itself (as opposed to letting the database handle it) and in doing so it pulls a lot of data out, resulting in a memory spike on repository deletion. The only solution is to write a custom cascade delete that deletes the objects at the bottom of the cascade tree more incrementally. [noissue]
# rather than letting it be triggered as we delete the repository | ||
try: | ||
instance.invalidate_cache() | ||
except AttributeError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we OK with this?
- Should we rename
invalidate_cache()
so as to eliminate any possibility of name collision - Should we list specific models (e.g. Repository) that we know have this method and call it only on those
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -236,3 +236,40 @@ def add_and_remove(repository_pk, add_content_units, remove_content_units, base_ | |||
with repository.new_version(base_version=base_version) as new_version: | |||
new_version.remove_content(models.Content.objects.filter(pk__in=remove_content_units)) | |||
new_version.add_content(models.Content.objects.filter(pk__in=add_content_units)) | |||
|
|||
|
|||
def generic_cascade_delete_task(instance_id, app_label, serializer_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdellweg This is fine yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is the most generic now. 👍
related_pk_values, | ||
base_model=base_model, | ||
level=level + 1, | ||
skip_relations=skip_relations, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we using "skip_relations" anywhere? Does this translate nicely to the cascaded model?
Should we maybe pass it down like version__content
on Repository
--> content
on RepositoryVersion
?
app_label = instance._meta.app_label | ||
|
||
task = dispatch( | ||
tasks.repository.generic_cascade_delete_task, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe move it to task.base
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider toggling the definition behind a setting so that, should any problems be uncovered, there's an easy way to revert to the previous behavior without reverting or waiting for a new release?
I need to pick this back up and get it over the finish line. |
Django handles cascade deletes itself (as opposed to letting the database handle it) and in doing so it pulls a lot of data out, resulting in a memory spike on repository deletion.
The only solution is to write a custom cascade delete that deletes the objects at the bottom of the cascade tree more incrementally.
closes #2280