-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use imageio.v3 when read or write images #1956
base: master
Are you sure you want to change the base?
Conversation
@xunmeibuyue for the next time, there is no need to open a new PR for each update after a review comment. Simply you can push new commit to the same remote branch you pushed. In that way the context of the review will keep in a single place. Regarding the PR itself, we mandatorily need all the CI builds passing. So, could you review what have broken and fix it? You can follow the CONTRIBUTING.md doc to install the dependencies and run the test locally before push it. I look forward to your updates |
OK, I opened a new PR just for a cleaning commit, but it seems that keeping the context of the review in a single place will be more helpful. I can't believe a change of just few lines arises so much errors 😭, I will review and fix it later. Thanks for the patient comment! |
Looks like the tests are no longer being collected properly because pytest tries to load images from a container that stores multiple, differently-shaped images. The image in question is This is one of the changes in ImageIO v2 vs ImageIO v3: when reading A couple of ways forward, depending on what the test tries to accomplish:
Also, for this PR it might be worthwhile to have a look at our migration guide: https://imageio.readthedocs.io/en/stable/reference/userapi.html#migrating-to-the-v3-api On a more high level, MoviePy seems to use ImageIO for reading/writing images and to then rely on FFMPEG (I assume via subprocesses?) for reading/writing of video. ImageIO has supported video reading/writing since forever by calling an FFMPEG executable via a subprocess. Recently, (about 1 year ago) we added a second backend exclusive to v3 that can call directly into the underlying FFMPEG API via pyAV, and - as a result - gives performance on par with or exceeding OpenCV. To me, this sounds like there is an overlap between both projects, so I was wondering if it is worthwhile to investigate if we can combine some of the code, reduce duplication, and share some of the maintenance burdens in the process. |
Sorry, I just try to commit a change (follow the guidance of @FirefoxMetzger ) on the fork of mine, but it was wrongly submitted to here. Hence there is no need to run the actions @mgaitan and I'll update it once I finished the local test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some quick comments - and suggestions on where new API functions might be more useful :)
moviepy/video/VideoClip.py
Outdated
@@ -1040,7 +1040,7 @@ def __init__( | |||
|
|||
if not isinstance(img, np.ndarray): | |||
# img is a string or path-like object, so read it in from disk | |||
img = imread(img) | |||
img = imread(img, index=0)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
img = imread(img, index=0)[0] | |
img = imread(img, index=0) |
If you add a slice to the call you will get the first row of the image. I suppose you meant to just get the first frame/image in the file?
moviepy/video/VideoClip.py
Outdated
@@ -11,7 +11,7 @@ | |||
|
|||
import numpy as np | |||
import proglog | |||
from imageio import imread, imsave | |||
from imageio.v3 import imread, imwrite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While not strictly necessary, it is recommended to import Imageio as import imageio.v3 as iio
. This gives you access to the full API and reads nicer than only importing the methods used.
You would then consume it as iio,imwrite
and iio.imread
respectively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out. I'll push a new commit
@@ -5,7 +5,7 @@ | |||
import os | |||
|
|||
import numpy as np | |||
from imageio import imread | |||
from imageio.v3 import imread |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from imageio.v3 import imread | |
import imageio.v3 as iio |
Same here, I would recommend to import the entire namespace. This will give you access to some niceties (see below).
@@ -62,7 +62,7 @@ def __init__( | |||
if isinstance(sequence, list): | |||
if isinstance(sequence[0], str): | |||
if load_images: | |||
sequence = [imread(file) for file in sequence] | |||
sequence = [imread(file, index=0)[0] for file in sequence] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sequence = [imread(file, index=0)[0] for file in sequence] | |
sequence = [img for img in iio.imiter(file)] |
We now have imiter
which can iterate over containers and give you images in the order in which they appear in the file.
This not only reads nicer, but is also more performant because (1) you only need to open the file once, and (2) you tell ImageIO "I want to read more than one image, just not all at once". This way we can optimize for this scenario and do things like buffering future frames/images, carry decoder state over to the next frame, select more optimized threadding models, etc. In a nutshell: we can optimize for reading and serving images sequentially.
That said, it seems odd to read all images and then store them in a list, because it can be memory hungry. Where is sequence
consumed? Perhaps it is better to move the generator there and load images one at a time.
@@ -78,14 +78,14 @@ def __init__( | |||
|
|||
# check that all the images are of the same size | |||
if isinstance(sequence[0], str): | |||
size = imread(sequence[0]).shape | |||
size = imread(sequence[0], index=0)[0].shape |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size = imread(sequence[0], index=0)[0].shape | |
size = iio.improps(sequence[0], index=0).shape |
Another advantage of using the full namespace is that you get improps
which gives you standardized metadata (including the shape of the resulting image) without decoding pixels (if possible).
This needs less memory and is much faster since we only need to read a small header instead of a potentially large image.
else: | ||
size = sequence[0].shape | ||
|
||
for image in sequence: | ||
image1 = image | ||
if isinstance(image, str): | ||
image1 = imread(image) | ||
image1 = imread(image, index=0)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
image1 = imread(image, index=0)[0] | |
image1 = imread(image, index=0) |
@@ -117,20 +117,20 @@ def make_frame(t): | |||
index = find_image_index(t) | |||
|
|||
if index != self.last_index: | |||
self.last_image = imread(self.sequence[index])[:, :, :3] | |||
self.last_image = imread(self.sequence[index], index=0)[0][:, :, :3] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.last_image = imread(self.sequence[index], index=0)[0][:, :, :3] | |
self.last_image = imread(self.sequence[index], index=0)[:, :, :3] |
This looks a bit like we can push down the entire operation to ImageIO, but I lack context of the class itself. Does last_image
mean that you wish to read the last image in the file?
self.last_index = index | ||
|
||
return self.last_image | ||
|
||
if with_mask and (imread(self.sequence[0]).shape[2] == 4): | ||
if with_mask and (imread(self.sequence[0], index=0)[0].shape[2] == 4): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if with_mask and (imread(self.sequence[0], index=0)[0].shape[2] == 4): | |
if with_mask and iio.improps(self.sequence[0], index=0).shape[2] == 4: |
self.mask = VideoClip(is_mask=True) | ||
self.mask.last_index = None | ||
self.mask.last_image = None | ||
|
||
def mask_make_frame(t): | ||
index = find_image_index(t) | ||
if index != self.mask.last_index: | ||
frame = imread(self.sequence[index])[:, :, 3] | ||
frame = imread(self.sequence[index], index=0)[0][:, :, 3] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
frame = imread(self.sequence[index], index=0)[0][:, :, 3] | |
frame = imread(self.sequence[index], index=0)[:, :, 3] |
@xunmeibuyue That's no problem at all. It's actually best practice to commit all your changes here even if the result is an intermediate or broken state. There is no judgement here (we all know that coding usually works in multiple steps) and having a more fine-grained history of commits makes it much easier to roll back to earlier work should the need ever arise. Also, GitHub will show us (reviewers) a summary of all your changes so when we come back for a review, we see the final result and not the steps in-between. (There are some repos like numba (JIT compiler for python code), where this practice is discouraged, but that is because they have a massive CI matrix of dozens of GPUs, CPUs, OSes, and python versions they need to test against. Repos like that will tell you in big BOLD letters that you shouldn't push everything, so unless you find something like that it's usually okay to develop against the PR branch ;) ) |
I just force-pushed a new branch before seeing your suggestion 😭, because I‘m a little bit neat freak on coding. I am really sorry. Thanks for the kind suggestion! |
The latest commit, which is based on the comments of @FirefoxMetzger, seems passing the pytest. |
@xunmeibuyue Looks like there were issues with the code formatting. |
Using
imageio.v3
instead ofimageio
when read or write images, which has the advantage thatimageio.v3
support more image format thanimageio
such as.webp
.tests/
cumented new or changed features in the documentation or in the docstrings
Fix #1894