[Feature Request] Implement "same" padding for convolution operations? #3867
This seems worth doing. What is the interface you are proposing? Something like `padding="same"`?
Note that if you are looking for the same behavior as TensorFlow, the implementation will not be that straightforward, because the number of pixels to add depends on the input size. See https://github.com/caffe2/caffe2/blob/master/caffe2/proto/caffe2_legacy.proto for reference.
Thank you for indicating the issue and the reference.
I think the biggest source of inefficiency will come from the fact that we will need to add an `F.pad` call before the convolution whenever the required padding is asymmetric. This could be made more efficient if we supported different paddings for each side of the convolution (like in Caffe2, so left, right, top, bottom), but cudnn still doesn't support that, so we would require the extra padding in those cases.

Also, if we add the `same` mode, what bothers me a bit is that users might expect the behavior of `same` to match TensorFlow's, where the padding depends on the input size.

What do you think?
Why would that be inefficient? Couldn't we just compute the padding at every forward step? The cost should be tiny, so there's no need to optimize that. Maybe I don't fully understand the semantics, but I can't see why it would be slow.
Making padding dependent on input size is quite bad. We just had an internal discussion about this, with @Yangqing outlining why this is a bad idea for a variety of serialization and efficiency reasons.
@fmassa, what I intended was to calculate the "constant" padding shape in `__init__`. EDIT: What I suggested should then be a function placed in, say, torch.nn.utils or torch.utils.
As a result, what I suggest is a simple utility function, like (pseudocode):

```python
def calc_pad_conv1d(width, padding='same', check_symmetric=True, ...):  # plus the params that conv1d has
    shape = <calculate padding>
    assert not check_symmetric or <shape is symmetric>, \
        'Calculated padding shape is asymmetric, which is not supported by conv1d. ' \
        'If you just want to get the value, consider using check_symmetric=False.'
    return shape

width = 100  # for example
padding = calc_pad_conv1d(width, ...)
m = nn.Conv1d(..., padding=padding)
```

Also, the function could be used with `F.conv1d`.
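For reference, a concrete version of that calculation might look like this. This is only a sketch, not part of the proposal above; it uses the standard output-size formula, and the result is asymmetric whenever the total padding is odd:

```python
def calc_pad_conv1d(width, kernel_size, stride=1, dilation=1, check_symmetric=True):
    # Total padding needed so that out_width == ceil(width / stride)
    out_width = (width + stride - 1) // stride
    total = max(0, (out_width - 1) * stride + dilation * (kernel_size - 1) + 1 - width)
    assert not check_symmetric or total % 2 == 0, (
        'Calculated padding shape is asymmetric, which is not supported by conv1d. '
        'If you just want to get the value, consider using check_symmetric=False.')
    # conv1d applies the same padding to both sides, so return the per-side value
    return total // 2

padding = calc_pad_conv1d(width=100, kernel_size=3)  # 1
```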
@qbx2 maybe I don't fully understand your proposal, but if we want to replicate TensorFlow behavior I don't think this is enough. Here is a snippet of what I think mimics TensorFlow:

```python
import torch.nn.functional as F

def conv2d_same_padding(input, weight, bias=None, stride=(1, 1), dilation=(1, 1), groups=1):
    input_rows = input.size(2)
    filter_rows = weight.size(2)
    effective_filter_size_rows = (filter_rows - 1) * dilation[0] + 1
    out_rows = (input_rows + stride[0] - 1) // stride[0]
    padding_rows = max(0, (out_rows - 1) * stride[0] +
                       effective_filter_size_rows - input_rows)
    rows_odd = (padding_rows % 2 != 0)

    # same computation for the columns
    input_cols = input.size(3)
    filter_cols = weight.size(3)
    effective_filter_size_cols = (filter_cols - 1) * dilation[1] + 1
    out_cols = (input_cols + stride[1] - 1) // stride[1]
    padding_cols = max(0, (out_cols - 1) * stride[1] +
                       effective_filter_size_cols - input_cols)
    cols_odd = (padding_cols % 2 != 0)

    if rows_odd or cols_odd:
        # TensorFlow puts the extra (odd) padding on the right/bottom
        input = F.pad(input, [0, int(cols_odd), 0, int(rows_odd)])

    return F.conv2d(input, weight, bias, stride,
                    padding=(padding_rows // 2, padding_cols // 2),
                    dilation=dilation, groups=groups)
```

It was mostly copy-pasted from TensorFlow code in here and here. As you can see, there are a lot of hidden things going on there, and that's why I think it might not be worth adding a `same` option directly to `Conv2d`. Thoughts?
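A quick usage check of the snippet above (the weights and sizes here are just for illustration):

```python
import torch

weight = torch.randn(8, 3, 3, 3)  # out_channels=8, in_channels=3, 3x3 kernel
x = torch.randn(1, 3, 28, 28)
out = conv2d_same_padding(x, weight, stride=(2, 2), dilation=(1, 1))
print(out.shape)  # torch.Size([1, 8, 14, 14]), i.e. ceil(28 / 2) per spatial dim
```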
@fmassa Yes, you're right. It may be inefficient to calculate the padding on every `forward()` call. However, my proposal is NOT to calculate the padding on every `forward()`. For example, consider the case where a researcher has images of 200x200, 300x300, and 400x400. Then he/she can calculate the paddings for the three cases in the initialization phase and just pass the images to `forward()`:

```python
>>> import torch
>>> import torch.nn as nn
>>> from torch.autograd import Variable
>>> m = nn.Conv2d(1, 1, 1)
>>> m(Variable(torch.randn(1, 1, 2, 2))).shape
torch.Size([1, 1, 2, 2])
>>> m.padding = (1, 1)
>>> m(Variable(torch.randn(1, 1, 2, 2))).shape
torch.Size([1, 1, 4, 4])
```

Yes, I just want to add the "padding calculating utility function" to PyTorch core. When the researcher wants padding that depends on each input image size, he/she can combine the function with `forward()`.
Is there any plan to implement a similar API in PyTorch in the near future? People coming from a TensorFlow/Keras background will certainly appreciate it.
So, a basic padding calculation strategy (which does not give the same results as TensorFlow, but produces similar shapes) is to have:

```python
def _get_padding(padding_type, kernel_size):
    assert padding_type in ['SAME', 'VALID']
    if padding_type == 'SAME':
        return tuple((k - 1) // 2 for k in kernel_size)
    return tuple(0 for _ in kernel_size)
```

Is that what you have in mind @im9uri?
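For example (assuming the kernel size is given as a tuple, one entry per spatial dimension):

```python
print(_get_padding('SAME', (3, 5)))   # (1, 2)
print(_get_padding('VALID', (3, 5)))  # (0, 0)
```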
It's similar to what I had in mind, but as you mentioned previously, the calculation gets complicated with stride and dilation. Also, having such an API in other convolution operations such as ConvTranspose2d would be great.
I think that "sliding-window operators" should all support asymmetric padding. About the "same" argument... I know it's not the cleanest solution, but the constraint sounds reasonable enough to me.
The conv2d documentation gives the explicit formulas for the output sizes. Equating e.g. Hout with Hin, one can solve for the padding:
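Filling in the formula from the `nn.Conv2d` docs (assuming stride 1, so the floor can be dropped):

$$H_{out} = \left\lfloor \frac{H_{in} + 2 \cdot \text{padding} - \text{dilation} \cdot (\text{kernel\_size} - 1) - 1}{\text{stride}} + 1 \right\rfloor$$

Setting $H_{out} = H_{in}$ and $\text{stride} = 1$ gives

$$\text{padding} = \frac{\text{dilation} \cdot (\text{kernel\_size} - 1)}{2},$$

which is an integer only when $\text{dilation} \cdot (\text{kernel\_size} - 1)$ is even; otherwise the required padding is asymmetric.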
Since same padding means padding = (kernel_size - stride) // 2, what if padding = "same" were introduced such that, when given, it automatically reads the kernel size and stride (both already known to nn.Conv2d) and applies the padding accordingly?
Here is a very simple Conv2d layer with `same` padding for reference:

```python
import torch

class Conv2dSame(torch.nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, bias=True,
                 padding_layer=torch.nn.ReflectionPad2d):
        super().__init__()
        ka = kernel_size // 2
        kb = ka - 1 if kernel_size % 2 == 0 else ka
        self.net = torch.nn.Sequential(
            padding_layer((ka, kb, ka, kb)),
            torch.nn.Conv2d(in_channels, out_channels, kernel_size, bias=bias)
        )

    def forward(self, x):
        return self.net(x)

c = Conv2dSame(1, 3, 5)
print(c(torch.rand((16, 1, 10, 10))).shape)
# torch.Size([16, 3, 10, 10])
```
If this is still being evaluated for being added to PyTorch, then regarding the tradeoffs between complexity/inefficiency vs. ease-of-use for developers: the road to 1.0 blog post describes usability as a central goal.

Anecdotally, I come from a background of using Keras as well as the original Torch. If the "central goal" really is focused on usability, then I'd argue that even if there's an efficiency hit to computing zero-padding on every forward pass (as mentioned above), the time saved in terms of developer efficiency and maintainability (e.g. not having to write custom code to compute zero padding) may be worth the tradeoff. Thoughts?
I would use this feature.
It doesn't make sense to me why an optional `padding='same'` API can't be added.
Yes, if someone can please add and approve this, it would be great.
Definitely add this, conner wants it.
Does PyTorch support it now? Can it use a same operation like the first conv in VGG, i.e. set padding = (kernel_size - 1) / 2?
Here is one example of calling a same-padding conv2d from deepfakes:
Summary: Pull Request resolved: #45667

First part of #3867 (Pooling operators still to do)

This adds a `padding='same'` mode to the interface of `conv{n}d` and `nn.Conv{n}d`. This should match the behaviour of `tensorflow`. I couldn't find it explicitly documented, but through experimentation I found `tensorflow` returns the shape `ceil(len/stride)` and always adds any extra asymmetric padding onto the right side of the input.

Since the `native_functions.yaml` schema doesn't seem to support strings or enums, I've moved the function interface into python and it now dispatches between the numerically padded `conv{n}d` and the `_conv{n}d_same` variant. Underscores because I couldn't see any way to avoid exporting a function into the `torch` namespace.

A note on asymmetric padding: the total padding required can be odd if both the kernel-length is even and the dilation is odd. mkldnn has native support for asymmetric padding, so there is no overhead there, but for other backends I resort to padding the input tensor by 1 on the right hand side to make the remaining padding symmetrical. In these cases, I use `TORCH_WARN_ONCE` to notify the user of the performance implications.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D27170744

Pulled By: jbschlosser

fbshipit-source-id: b3d8a0380e0787ae781f2e5d8ee365a7bfd49f22
It is so stupid that every time I use conv I have to calculate the padding size. It wastes almost half of the code-writing time. You can write a whole paragraph saying SAME is not appropriate in some cases, but it is common sense that you should give the user the option.
Same padding for un-strided convolutions will be in the next release, PyTorch 1.9. This works today on pytorch-nightly:

```python
>>> import torch
>>> module = torch.nn.Conv2d(1, 1, 5, padding='same')
>>> x = torch.randn(1, 1, 20, 20)
>>> module(x).shape
torch.Size([1, 1, 20, 20])
```
To get the conversation going, I propose the following variations for strided convolutions (i.e. stride > 1).
Removing high-pri since support is now in for un-strided convolutions. We can bump the priority back up if enough people want support for strided convolutions.
It could be useful to look at some recently released TensorFlow generative models and check whether they use strided same-convs. I think they do...
A slightly different formula is explained here.
Removing milestone tag for now |
SAME padding support was added to un-strided convolutions. Hopefully different stride sizes will eventually be supported?
So, I pulled out the same padding code here: https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/conv.py#L110-L123, tested it with a stride of 2 for an nn.Conv2d layer, and it appears to work properly.

This leads me to wonder why this error message exists if it's not accurate: https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/conv.py#L93-L94

Why is there an error message for same padding with a stride of 2, if it works correctly when the error message is removed?
Reinstating high-pri due to activity. |
Doesn't it seem that the transposed convolution with `padding='same'` should also be supported?
I only tested with the nn.Conv2d layer, and at the time I was not aware of why the error message was so broad despite the functionality appearing to exist. The issue stems from a difference in how TensorFlow implemented same padding vs. what would be more logical, and that's why the error message exists in its current form.
@JakobHavtorn I don't think the current implementation works for strides greater than 1:

```python
import collections
from itertools import repeat

import torch
from torch import nn
import torch.nn.functional as F


def _ntuple(n):
    """Copied from PyTorch since it's not importable as an internal function
    https://github.com/pytorch/pytorch/blob/v1.10.0/torch/nn/modules/utils.py#L6
    """
    def parse(x):
        if isinstance(x, collections.abc.Iterable):
            return tuple(x)
        return tuple(repeat(x, n))
    return parse


_pair = _ntuple(2)


class Conv2dSame(nn.Module):
    """Manual convolution with same padding

    Although PyTorch >= 1.10.0 supports ``padding='same'`` as a keyword argument,
    this does not export to CoreML as of coremltools 5.1.0, so we need to
    implement the internal torch logic manually. Currently the ``RuntimeError`` is

    "PyTorch convert function for op '_convolution_mode' not implemented"

    Also same padding is not supported for strided convolutions at the moment
    https://github.com/pytorch/pytorch/blob/v1.10.0/torch/nn/modules/conv.py#L93
    """

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1, **kwargs):
        """Wrap base convolution layer

        See official PyTorch documentation for parameter details
        https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
        """
        super().__init__()
        self.conv = nn.Conv2d(
            in_channels=in_channels,
            out_channels=out_channels,
            kernel_size=kernel_size,
            stride=stride,
            dilation=dilation,
            **kwargs)

        # Setup internal representations
        kernel_size_ = _pair(kernel_size)
        dilation_ = _pair(dilation)
        self._reversed_padding_repeated_twice = [0, 0] * len(kernel_size_)

        # Follow the logic from ``nn._ConvNd``
        # https://github.com/pytorch/pytorch/blob/v1.10.0/torch/nn/modules/conv.py#L116
        for d, k, i in zip(dilation_, kernel_size_, range(len(kernel_size_) - 1, -1, -1)):
            total_padding = d * (k - 1)
            left_pad = total_padding // 2
            self._reversed_padding_repeated_twice[2 * i] = left_pad
            self._reversed_padding_repeated_twice[2 * i + 1] = (
                total_padding - left_pad)

    def forward(self, imgs):
        """Setup padding so same spatial dimensions are returned

        All shapes (input/output) are ``(N, C, W, H)`` convention

        :param torch.Tensor imgs:
        :return torch.Tensor:
        """
        padded = F.pad(imgs, self._reversed_padding_repeated_twice)
        return self.conv(padded)


conv_same = Conv2dSame(3, 8, 3)
conv_same_stride = Conv2dSame(3, 8, 3, stride=2)
imgs = torch.randn(1, 3, 28, 28)
print(conv_same(imgs).shape)         # Correct [1, 8, 28, 28]
print(conv_same_stride(imgs).shape)  # Wrong [1, 8, 14, 14]
```
@addisonklinke Ah yes, I get your point. I believe this comes down to what you expect. Same padding is defined to keep the input and output shapes equal for unit strides. I often find it useful to be able to downsample an input some integer number of times in a "same padded" manner. By this I mean that the length of the output along the convolved dimension(s) is exactly ceil(L_in / s), where L_in is the input length and s is the stride. So in your example (ceil(28 / 2) = 14), I actually find that the output has exactly the shape I would expect from a same-padded convolution with non-unit stride.
@JakobHavtorn Ah interesting, I hadn't thought about it like that!
@addisonklinke: We just ran into this coremltools issue too, so I filed it at apple/coremltools#1363. But meanwhile, I'll try your workaround, which makes some sense. Thanks for sharing it!
@addisonklinke's code worked for me, and did it without using math.ceil or math.floor, which makes it compatible with the current TensorRT release. I was getting all sorts of errors working with other implementations that were using ceil and floor, where the padding wasn't being calculated the same way in TensorRT (or Torch-TensorRT) as it was in Torch/TorchScript. Awesome work!

I also implemented this for MaxPool2d, which was mostly a copy/paste, and seems to work:

```python
class MaxPool2dStaticSamePadding(nn.Module):
    """
    The real keras/tensorflow MaxPool2d with same padding
    """

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.pool = nn.MaxPool2d(*args, **kwargs)
        self.stride = self.pool.stride
        self.kernel_size = self.pool.kernel_size

        if isinstance(self.stride, int):
            self.stride = [self.stride] * 2
        elif len(self.stride) == 1:
            self.stride = [self.stride[0]] * 2

        if isinstance(self.kernel_size, int):
            self.kernel_size = [self.kernel_size] * 2
        elif len(self.kernel_size) == 1:
            self.kernel_size = [self.kernel_size[0]] * 2

        # Setup internal representations
        kernel_size_ = _pair(self.kernel_size)
        dilation_ = _pair(1)
        self._reversed_padding_repeated_twice = [0, 0] * len(kernel_size_)

        # Follow the logic from ``nn._ConvNd``
        # https://github.com/pytorch/pytorch/blob/v1.10.0/torch/nn/modules/conv.py#L116
        for d, k, i in zip(dilation_, kernel_size_, range(len(kernel_size_) - 1, -1, -1)):
            total_padding = d * (k - 1)
            left_pad = total_padding // 2
            self._reversed_padding_repeated_twice[2 * i] = left_pad
            self._reversed_padding_repeated_twice[2 * i + 1] = (
                total_padding - left_pad)

    def forward(self, x):
        """Setup padding so same spatial dimensions are returned

        All shapes (input/output) are ``(N, C, W, H)`` convention

        :param torch.Tensor x:
        :return torch.Tensor:
        """
        padded = F.pad(x, self._reversed_padding_repeated_twice)
        return self.pool(padded)
```
As of now, PyTorch's implementation of 'same' is different from SciPy/NumPy! I would like to point out that the convention used by torch (and tensorflow) when the kernel ('weight') is even-length is the opposite of what is used in scipy and numpy. Looking through the thread above, I don't see any discussion of why this might be. Here are some code examples demonstrating this:
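A minimal sketch of the difference, using an even-length kernel [1, 1] (symmetric, so the convolution vs. cross-correlation flip doesn't matter); the specific arrays are just for illustration:

```python
import numpy as np
import torch
import torch.nn.functional as F

x = [1., 2., 3., 4.]
k = [1., 1.]  # even-length kernel

# NumPy 'same' effectively puts the extra zero-pad on the LEFT
print(np.convolve(x, k, mode='same'))  # [1. 3. 5. 7.]

# PyTorch 'same' puts the extra zero-pad on the RIGHT (matching TensorFlow)
xt = torch.tensor(x).reshape(1, 1, -1)
kt = torch.tensor(k).reshape(1, 1, -1)
print(F.conv1d(xt, kt, padding='same').flatten())  # tensor([3., 5., 7., 4.])
```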
The implementation would be easy, but it could help the many people who suffer from the headache of calculating how much padding they need.
cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @albanD @mruberry @walterddr