Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

5228-query-instead-of-signals #5469

Merged
merged 29 commits into from
May 14, 2024

Conversation

devincowan
Copy link
Contributor

@devincowan devincowan commented Apr 29, 2024

Partially resolves #5228 by directly querying resource file size at runtime instead of caching used storage size in the UserQuota model

This PR alters are quota system to remove reliance on the iRods microservices.
More info is available in
https://github.com/hydroshare/hydroshare/blob/5228-query-instead-of-signals/quota-info.md

Resolves #5228 in conjunction with #5229
Relies on #5443 in order to get accurate file sizes from irods storage
Resolves #5418 by entirely removing the need for irods microservice in the dataZone
Resolves #5235 because removing the irods microservice will vastly improve performance of filesystem operations
Resolves #5441 because quota will be refreshed before sending emails and we can be confident in correct values
Resolves #5384 because we don't need a microservice anymore
Resolves #5195 because we don't need a microservice anymore
Resolves #5286 because we don't need a microservice anymore

Pull Request Checklist:

  • Positive Test Case Written by Dev
  • Automated Testing
  • Sufficient User and Developer Documentation
  • Passing Jenkins Build
  • Peer Code review and approval

Positive Test Case

  1. Suggest that you open a private window and login as admin, then open the userQuotas model in the django admin panel
  2. Login as non-admin user in the non-private window
  3. With that user, conduct the following operations
  • uploading a file to a resource
  • creating a resource
  • copying a resource
  • versioning a resource
  • transferring ownership of a resource to "user"
  • ingesting a bag from the userZone
  • zipping a file without removing the original
  • unzipping a file
  • creating aggregations
  • moving files into subdirectories
  1. In all cases, verify that the user's quota (viewed in the private browser, admin panel) adjusts in the expected way, either increasing, decreasing, or remaining the same.

@devincowan devincowan force-pushed the 5228-query-instead-of-signals branch from a5abc34 to d9740a1 Compare April 29, 2024 16:38
Copy link
Member

@pkdash pkdash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Left few comments. The main one is the new model field quota_holder that needs change.

@@ -2178,6 +2178,9 @@ class MyResourceContentType(pages.Page, hs_core.AbstractResource):
# allow resource that contains spam_patterns to be discoverable/public
spam_allowlisted = models.BooleanField(default=False)

quota_holder = models.ForeignKey(User, on_delete=models.CASCADE, null=True, blank=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to delete all resources when a user is deleted (on_delete=models.CASCADE)? We probably want to use on_delete=models.SET_NULL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for this Pabitra! I'm thinking maybe we use SET()
https://docs.djangoproject.com/en/3.2/ref/models/fields/#django.db.models.SET
added that, let me know what you think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I realized that SET() doesn't seem to take any args so I will use SET_NULL as you suggested

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you make this function as an instance method of the AbstractResource class?.
81c5e6b#diff-3bc7182059367a2c6cf9e96428d08a0ce66f709baec5d00090abd19683582891R2112

Copy link
Contributor Author

@devincowan devincowan May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pkdash! I tried like this:

class AbstractResource(ResourcePermissionsMixin, ResourceIRODSMixin):
    quota_holder = models.ForeignKey(User, on_delete=models.SET(self.get_new_quota_holder), null=True, blank=True,
                                     related_name='quota_holder')

    def get_new_quota_holder(self):
        """Get a new quota holder for a resource."""
        if self.raccess.owners.count() > 0:
            return self.raccess.owners.first()
        else:
            return None

But results:

hydroshare | File "/hydroshare/hs_core/models.py", line 2181, in AbstractResource
hydroshare | quota_holder = models.ForeignKey(User, on_delete=models.SET(self.get_new_quota_holder), null=True, blank=True,
hydroshare | NameError: name 'self' is not defined

It looks like SET() needs a callable, and there is a good mention of the fact that the callable can't run any queries when models.py gets imported:
https://docs.djangoproject.com/en/3.2/ref/models/fields/#django.db.models.SET

Set the ForeignKey to the value passed to SET(), or if a callable is passed in, the result of calling it. In most cases, passing a callable will be necessary to avoid executing queries at the time your models.py is imported:

...maybe there is another way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I have never used models.SET() before. Seems like it has quite a few restrictions. I guess we can set it to None and for now and deal with it later to update it using management command. Do we ever delete users? What about a user being inactive? If we really want to set another user as quota holder upon user delete, maybe we can use pre delete signal for user?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh pre-delete signal is a good call. AFAIK we don't ever delete users, we make them inactive. I think it's okay for an inactive user to retain their quota.

I'll add a pre-delete signal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added signal in 0747594

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might not even matter because in my testing, deleting user also deletes all of their resources.
I don't see this cascade in the code...maybe it is
https://github.com/hydroshare/hydroshare/blob/master/hs_core/models.py#L2119-L2124

return uqDataZoneSize if uqDataZoneSize is not None else 0


def get_user_zone_usage(username):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not returning the quota size.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we didn't had to depend on the iRODS AVU to get this value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch thank you!
In this PR we get rid of iRods AVUs for the Datazone.
Similar to you, I wish we would get rid of the AVUs for the UserZone as well.
It would certainly be possible, but since the userzone is (:fingers-crossed) going away after our cloud migration, I decided not to invest that effort at this time.

hs_core/hydroshare/resource.py Outdated Show resolved Hide resolved
theme/models.py Outdated Show resolved Hide resolved
theme/models.py Outdated
from hs_core.hydroshare.resource import get_quota_usage

if refresh:
# TODO get_quota_usage will run the dz query, but then it gets run again below...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Planning to fix this TODO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that is resolved. Removed the TODO

@@ -98,7 +102,7 @@ def handle(self, *args, **options):
owned_resources = user.uaccess.owned_resources
total_size = 0
for res in owned_resources:
if res.get_quota_holder() == user:
if res.quota_holder == user:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since quota_holder is a model field now, we can filter on it - no need for the if check

Copy link
Contributor Author

@devincowan devincowan May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call thanks! added that filter

hs_core/tasks.py Outdated
@@ -509,6 +509,12 @@ def send_over_quota_emails():
if uq:
used_percent = uq.used_percent
if used_percent >= qmsg.soft_limit_percent:
# first update the quota just to be sure that we are using the latest figures
update_quota_usage(u.username)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function update_quota_usage() also sends email notifications. Would we end up sending duplicate emails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a param notify_user so that we can suppress messages to the users. Thanks!

hs_core/tests/test_update_quota_usage.py Show resolved Hide resolved
@devincowan devincowan force-pushed the 5228-query-instead-of-signals branch 3 times, most recently from 5f31c99 to e22ae70 Compare May 6, 2024 13:57
@devincowan devincowan force-pushed the 5228-hard-quota-enforcement branch from 1bbae82 to 8f54dc1 Compare May 9, 2024 20:38
@devincowan devincowan force-pushed the 5228-query-instead-of-signals branch from 48871d4 to 7c89614 Compare May 9, 2024 20:39
@devincowan devincowan linked an issue May 9, 2024 that may be closed by this pull request
38 tasks
@devincowan devincowan removed a link to an issue May 9, 2024
38 tasks
This commit adds a new management command `reset_quota.py` that allows for resetting the userzone quota by forcing the quota iRODS microservices to recalculate the quota for all users. This command is useful in scenarios where the quota needs to be recalculated due to changes in the system.
@devincowan devincowan force-pushed the 5228-hard-quota-enforcement branch from 87d37f9 to f4b85b8 Compare May 9, 2024 22:51
@devincowan devincowan force-pushed the 5228-query-instead-of-signals branch from 636297c to a21d785 Compare May 9, 2024 22:51
@pkdash
Copy link
Member

pkdash commented May 13, 2024

@devincowan Will try to finish reviewing by tomorrow.

hs_core/migrations/0082_merge_20240509_1646.py Outdated Show resolved Hide resolved
user = instance
for res in BaseResource.objects.filter(quota_holder=user):
other_owners = None
if hasattr(res, 'raccess') and res.raccess is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this check necessary?

Copy link
Member

@pkdash pkdash May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are not deleting user, I am not considering making suggestions to optimize queries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess maybe that check isn't necessary. I'm not remembering now why I added that... removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added it back in. See test failures here:
http://ci.hydroshare.org:8080/job/hydroshare-pull-requests/7611/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fine to have that check. However, the reason I think the tests failed when you removed that check is because of the delete order here - the ResourceAccess records are deleted before User records:
https://github.com/hydroshare/hydroshare/blob/develop/hs_access_control/tests/utilities.py#L35
https://github.com/hydroshare/hydroshare/blob/develop/hs_access_control/tests/utilities.py#L36

If you swap the delete order the tests are likely to pass. Again, I am not suggesting you need to make any changes for this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right you are! Thanks Pabitra!

The migrations `0080_baseresource_quota_holder.py`, `0080_resourcefile_filesize_cache_updated.py`, `0081_alter_baseresource_quota_holder.py`, and `0082_merge_20240509_1646.py` are no longer needed and have been deleted. These migrations were related to the `baseresource` model's `quota_holder` field and the `resourcefile` model's `filesize_cache_updated` field. The changes made in these migrations have been consolidated into a new migration `0080_baseresource_quota_holder_squashed_0082_merge_20240509_1646.py`, which adds both fields to their respective models.
@devincowan devincowan merged commit 22e9bc5 into 5228-hard-quota-enforcement May 14, 2024
1 check passed
@devincowan devincowan deleted the 5228-query-instead-of-signals branch May 14, 2024 12:46
@devincowan devincowan mentioned this pull request May 14, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants