
Choose either 'long' or 'short' options for the resize anchor edge if the size variable is scalar #8358

Open
sidijju opened this issue Mar 28, 2024 · 3 comments

@sidijju

sidijju commented Mar 28, 2024

🚀 The feature

Choose either 'long' or 'short' options for the resize anchor edge if the size variable is scalar

Motivation, pitch

torchvision.transforms.Resize() does not provide a clean interface for resizing images based on their longer edge.

Consider the following use case: a user wants to resize a set of images such that their dimensions are constrained by size, i.e. the longer edge is always equal to size. Take two images of sizes [1000, 500] and [500, 1000]. We want to resize both so that the maximum dimension is 500, i.e. resize the first image to [500, 250] and the second to [250, 500].

The naive approach would be to set size = 500. As noted in the docs,

If size is an int, smaller edge of the image will be matched to this number.

But in both our cases, the smaller edge of the image is already 500, so this essentially does nothing.

Setting max_size = 500 doesn't solve the issue either, since the current implementation explicitly disallows max_size == size. While we could pick a value for size that is less than max_size, there's no principled way to choose one that produces the desired result for every image.

Right now there's no clean way to resize images based solely on the length of the longer edge. Adding the ability to pick the resize anchor edge would allow this, as the sketch below illustrates.
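
For concreteness, a minimal sketch of the limitation (assuming channels-first tensor images):

```python
import torch
from torchvision.transforms import Resize

img_a = torch.rand(3, 1000, 500)  # longer edge is the height
img_b = torch.rand(3, 500, 1000)  # longer edge is the width

# size=500 matches the *shorter* edge, which is already 500,
# so both images come back unchanged.
resize = Resize(size=500)
print(resize(img_a).shape)  # torch.Size([3, 1000, 500])
print(resize(img_b).shape)  # torch.Size([3, 500, 1000])

# Resize(size=500, max_size=500) is rejected with a ValueError:
# max_size must be strictly greater than size.
```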

Alternatives

No response

Additional context

A similar comment was made in #2868, but it seems the discussion about the longer edge was lost in the final implementation.

@NicolasHug
Member

Thanks for the feature request @sidijju .

In order to get [500, 250] and [250, 500] from these specific input images, you could set size=499, max_size=500. But of course, this isn't a great UX, and it might not be possible to find a size value that would satisfy all input images.
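
As a sketch of that workaround on the two example images (channels-first tensors assumed):

```python
import torch
from torchvision.transforms import Resize

img_a = torch.rand(3, 1000, 500)
img_b = torch.rand(3, 500, 1000)

# The shorter edge is first mapped to 499; the resulting longer edge
# (998) then exceeds max_size, so the image is scaled back down until
# the longer edge is exactly 500.
resize = Resize(size=499, max_size=500)
print(resize(img_a).shape)  # torch.Size([3, 500, 250])
print(resize(img_b).shape)  # torch.Size([3, 250, 500])
```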

There's been discussion of adding an edge parameter in the past, but the parameters of resize are already fairly confusing. It seems that if we were to allow size=None, max_size=500, we could implement the behaviour you are looking for, and this should cover all of the potential use cases:

  • size=tuple -> resize to fixed size
  • size=int -> resize shorter edge to size while preserving aspect ratio
  • size=int, max_size=int -> try to resize shorter edge to size while preserving aspect ratio but if resulting longer edge exceeds max_size, then scale down. This corresponds to the resizing strategy of some detection models.
  • size=None, max_size=int -> resize longer edge to max_size while preserving aspect ratio.

The first three are already implemented; the last one isn't (a sketch of its semantics follows below). Any thoughts @pmeier @vfdev-5?
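
A minimal sketch of the proposed size=None semantics (not actual torchvision behaviour yet; the helper name below is hypothetical):

```python
def resize_longer_edge(h: int, w: int, max_size: int) -> tuple[int, int]:
    """Output size such that the longer edge equals max_size,
    preserving the aspect ratio (rounded to the nearest pixel)."""
    if h >= w:
        return max_size, round(w * max_size / h)
    return round(h * max_size / w), max_size

print(resize_longer_edge(1000, 500, 500))  # (500, 250)
print(resize_longer_edge(500, 1000, 500))  # (250, 500)
```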

@sidijju
Author

sidijju commented Apr 23, 2024

Thanks for the reply @NicolasHug! I agree that setting size=499, max_size=500 would work for this set of input images, but I'm not sure about the effect this would have on a more varied dataset. I also agree that it isn't the best UX since it's not very intuitive.

I think the proposal for a size=None option is a good stopgap for now. If others agree and I have some guidance from more experienced contributors, I can attempt to implement this feature.

@NicolasHug
Member

Thanks for your feedback @sidijju . I'm happy to review a PR from you if you'd like to try to submit one. Our contributing guide is here: https://github.com/pytorch/vision/blob/main/CONTRIBUTING.md

For this specific change you'd only need to update torchvision.transforms.v2 (transform class, PIL functional, and tensor functional). No need to change the "v1" transforms, i.e. the stuff in torchvision.transforms.
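
For reference, the end state might look like the following (hypothetical usage; size=None is not accepted by Resize as of this discussion):

```python
import torch
from torchvision.transforms import v2

# Hypothetical: once supported, this would resize so that the longer
# edge equals max_size, preserving the aspect ratio.
resize = v2.Resize(size=None, max_size=500)
print(resize(torch.rand(3, 1000, 500)).shape)  # expected: torch.Size([3, 500, 250])
```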
