-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added celery support #209
base: master
Are you sure you want to change the base?
Added celery support #209
Conversation
photologue/forms.py
Outdated
if self.cleaned_data['gallery']: | ||
logger.debug('Using pre-existing gallery.') | ||
gallery = self.cleaned_data['gallery'] | ||
if USE_CELERY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it might be more consistent to always save to the ZipUploadModel model instance - and to avoid the different behaviour between synchronous and asynchronous modes. It could also help to make the code shorter.
photologue/models.py
Outdated
USE_CELERY = getattr(settings, 'PHOTOLOGUE_USE_CELERY', False) | ||
|
||
# When using celery, where do we want to temporarily store our zip-file after upload? | ||
TEMP_ZIP_STORAGE = getattr(settings, 'PHOTOLOGUE_TEMP_ZIP_STORAGE', FileSystemStorage()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of FileSystemStorage will break the S3 compatibility. Well I think it's not working very well at the moment, but this will make it even more broken. We need a way to abstract away which file storage backend is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually use FileSystemStorage() on AWS with elastic beanstalk.
When a developer wants to use S3 instead for zip-file uploads (for example when he wants to allow uploads of very large files, or he uses lambda or something similar) he can just set TEMP_ZIP_STORAGE to his default storage. I think this would need to be made clear in the documentation though.
I decided to set it to FileSystemStorage by default, because the file doesn't need to exist very long and you need quick access to it. After thinking about it though, it might make more sense to set its default to default_storage, since you'd otherwise get problems in environments where the worker hasn't got access to the same filesystem as the webserver.
I also think S3-compatibility was fixed by a PR 2 weeks ago. (#211)
I haven't tested it, but this is the exact fix I used to get it working on S3 on my site (The fix was already mentioned in the comments of some old issue in this repo)
if os.path.dirname(filename): | ||
logger.warning('Ignoring file "{}" as it is in a subfolder; all images should be in the top ' | ||
'folder of the zip.'.format(filename)) | ||
if request: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At present we call the UploadZipForm from the admin, so there's always a request
object passed as an argument. So if there's a problem, the user receives feedback on screen.
How do you propose to warn the user if there is an issue when we are using Celery? Is there a mechanism in Celery to provide exit values, that we could potentially read and display?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this as well and I thought about 3 solutions, which I think all aren't the greatest
-
Using channels to send error messages to the user. Two reasons why I don't really like this idea:
- It is unnecessarily complex and introduces a new dependency on django channels
- Messages are only received if the user is on the right page at the right time if a developer only implements the default pages. Although he could subscribe to the channel on every page of his application if he so chooses. In that case, this would bring the best end-user experience I think.
-
Saving errors to the database and then we've got 2 ways to show these errors to the user
- Writing some kind of custom middleware to read these from the database and show messages if there are any (not a great solution, this means two extra db-calls for every page-load of every user -> one for retrieving and one for deleting)
- Only showing these error-messages on some pages used in the admin-dashboard. And if the developer wants to show these messages on other pages as well, he'll have to add the messages on these pages himself. (we could define some function show_upload_errors_as_messages(request))
The last solution seems the best of the 3 solutions according to me. But it'd be great if someone had an even better idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback. I agree that none of the solutions are great. It's somewhat complicated because we don't know what is installed on the Django project in which Photologue will be used. If the project already uses channels - great, use channels. Otherwise, it's not a good idea.
I like your suggestion of saving async feedback messages in a database table and showing them in the admin. This also gives the developer flexibility: if they want to use middleware to display feedback messages, it's easy to implement.
A suggestion: if the developer plans to use channels: would it be useful to put a hook
in the Photologue messages code, so that a developer can add a custom output (e.g. channels) if they wish for their project?
Hi Andreas, |
Just made a new commit based on your comments. All current tests pass (python 3.8) |
_parse_zip(zip_upload_model.zip_file, zip_upload_model.gallery, zip_upload_model.title, zip_upload_model.description, | ||
zip_upload_model.caption, | ||
zip_upload_model.is_public, request=request) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an advantage to having handle_zip
that just calls another function?
Hi, |
This PR adds celery support.
Getting a response after uploading a zip-file is way quicker. -> Less server time-outs
This is how the upload works when celery is used:
It can be used by setting PHOTOLOGUE_USE_CELERY = True.
By default, when using celery, the zip-file is saved on the local disk, since it doesn't need to be there for long and we want quick access.
This location can be changed by setting PHOTOLOGUE_TEMP_ZIP_STORAGE = SomeStorage().
67 tests from the master fail on my computer, and after this commit, the same tests fail.
It's probably best if you run the tests yourself.
I'm currently using a similar version of this in my own testing environment where I duplicated the code, and so far haven't found any issues.
Should I write tests for the celery update? This means that celery should be installed when running tests in the future. (To be honest I have no clue how to write tests for celery, but I can look it up)