Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added celery support #209

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

AndreasMilants
Copy link
Contributor

This PR adds celery support.

Getting a response after uploading a zip-file is way quicker. -> Less server time-outs

This is how the upload works when celery is used:

  1. Send the form to the server
  2. The form + zip_file are saved and a task to create the photos and their sizes is created
  3. A response is sent to the user
  4. One by one, photos will start appearing on the admin dashboard when they are saved

It can be used by setting PHOTOLOGUE_USE_CELERY = True.
By default, when using celery, the zip-file is saved on the local disk, since it doesn't need to be there for long and we want quick access.
This location can be changed by setting PHOTOLOGUE_TEMP_ZIP_STORAGE = SomeStorage().

67 tests from the master fail on my computer, and after this commit, the same tests fail.
It's probably best if you run the tests yourself.

I'm currently using a similar version of this in my own testing environment where I duplicated the code, and so far haven't found any issues.

Should I write tests for the celery update? This means that celery should be installed when running tests in the future. (To be honest I have no clue how to write tests for celery, but I can look it up)

@coveralls
Copy link

Coverage Status

Coverage remained the same at 67.331% when pulling b6c0267 on AndreasMilants:master into d48745c on richardbarran:master.

6.0.0 Outdated Show resolved Hide resolved
if self.cleaned_data['gallery']:
logger.debug('Using pre-existing gallery.')
gallery = self.cleaned_data['gallery']
if USE_CELERY:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it might be more consistent to always save to the ZipUploadModel model instance - and to avoid the different behaviour between synchronous and asynchronous modes. It could also help to make the code shorter.

photologue/forms.py Outdated Show resolved Hide resolved
USE_CELERY = getattr(settings, 'PHOTOLOGUE_USE_CELERY', False)

# When using celery, where do we want to temporarily store our zip-file after upload?
TEMP_ZIP_STORAGE = getattr(settings, 'PHOTOLOGUE_TEMP_ZIP_STORAGE', FileSystemStorage())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of FileSystemStorage will break the S3 compatibility. Well I think it's not working very well at the moment, but this will make it even more broken. We need a way to abstract away which file storage backend is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually use FileSystemStorage() on AWS with elastic beanstalk.
When a developer wants to use S3 instead for zip-file uploads (for example when he wants to allow uploads of very large files, or he uses lambda or something similar) he can just set TEMP_ZIP_STORAGE to his default storage. I think this would need to be made clear in the documentation though.
I decided to set it to FileSystemStorage by default, because the file doesn't need to exist very long and you need quick access to it. After thinking about it though, it might make more sense to set its default to default_storage, since you'd otherwise get problems in environments where the worker hasn't got access to the same filesystem as the webserver.

I also think S3-compatibility was fixed by a PR 2 weeks ago. (#211)
I haven't tested it, but this is the exact fix I used to get it working on S3 on my site (The fix was already mentioned in the comments of some old issue in this repo)

photologue/models.py Outdated Show resolved Hide resolved
if os.path.dirname(filename):
logger.warning('Ignoring file "{}" as it is in a subfolder; all images should be in the top '
'folder of the zip.'.format(filename))
if request:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At present we call the UploadZipForm from the admin, so there's always a request object passed as an argument. So if there's a problem, the user receives feedback on screen.
How do you propose to warn the user if there is an issue when we are using Celery? Is there a mechanism in Celery to provide exit values, that we could potentially read and display?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this as well and I thought about 3 solutions, which I think all aren't the greatest

  • Using channels to send error messages to the user. Two reasons why I don't really like this idea:

    1. It is unnecessarily complex and introduces a new dependency on django channels
    2. Messages are only received if the user is on the right page at the right time if a developer only implements the default pages. Although he could subscribe to the channel on every page of his application if he so chooses. In that case, this would bring the best end-user experience I think.
  • Saving errors to the database and then we've got 2 ways to show these errors to the user

    1. Writing some kind of custom middleware to read these from the database and show messages if there are any (not a great solution, this means two extra db-calls for every page-load of every user -> one for retrieving and one for deleting)
    2. Only showing these error-messages on some pages used in the admin-dashboard. And if the developer wants to show these messages on other pages as well, he'll have to add the messages on these pages himself. (we could define some function show_upload_errors_as_messages(request))

The last solution seems the best of the 3 solutions according to me. But it'd be great if someone had an even better idea.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. I agree that none of the solutions are great. It's somewhat complicated because we don't know what is installed on the Django project in which Photologue will be used. If the project already uses channels - great, use channels. Otherwise, it's not a good idea.
I like your suggestion of saving async feedback messages in a database table and showing them in the admin. This also gives the developer flexibility: if they want to use middleware to display feedback messages, it's easy to implement.
A suggestion: if the developer plans to use channels: would it be useful to put a hook in the Photologue messages code, so that a developer can add a custom output (e.g. channels) if they wish for their project?

@richardbarran
Copy link
Owner

Hi Andreas,
Thanks for the merge request. I've started reading it - it opens up new possibilities for Photologue, and for making processing more generic. I'd like to make this change as clean and as robust as possible, so I've made a lot of comments on your code. Please view them as constructive comments rather than as criticism. I look forward to reading your modified code.

@AndreasMilants
Copy link
Contributor Author

Just made a new commit based on your comments. All current tests pass (python 3.8)

_parse_zip(zip_upload_model.zip_file, zip_upload_model.gallery, zip_upload_model.title, zip_upload_model.description,
zip_upload_model.caption,
zip_upload_model.is_public, request=request)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an advantage to having handle_zip that just calls another function?

@richardbarran
Copy link
Owner

richardbarran commented Sep 26, 2020

Hi,
Thanks for the updated code - I have added my comments above. I have a thought: it will be really useful add in the documentation instructions on how to use this new feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants