Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bilinear interpolation behavior inconsistent with TF, CoreML and Caffe #10604

Open
libfun opened this issue Aug 17, 2018 · 7 comments
Open

Bilinear interpolation behavior inconsistent with TF, CoreML and Caffe #10604

libfun opened this issue Aug 17, 2018 · 7 comments
Labels
module: interpolation triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@libfun
Copy link

libfun commented Aug 17, 2018

Issue description

Trying to compare and transfer models between Caffe, TF and Pytorch found difference in output of bilinear interpolations between all. Caffe is using depthwise transposed convolutions instead of straightforward resize, so it's easy to reimplement both in TF and Pytorch.
However, there is difference between output for TF and Pytorch with align_corners=False, which is default for both.

Code example

img = cv2.resize(cv2.imread('./lenna.png')[:, :, ::-1], (256, 256))
img = img.reshape(1, 256, 256, 3).astype('float32') / 255.
img = tf.convert_to_tensor(img)
output_size = [512, 512]
output = tf.image.resize_bilinear(img, output_size, align_corners=True)
with tf.Session() as sess:
        values = sess.run([output])
out_tf = values[0].astype('float32')[0]

img = img.reshape(1, 256, 256, 3).transpose(0, 3, 1, 2).astype('float32') / 255.
out_pt = nn.functional.interpolate(torch.from_numpy(nimg), 
                                   scale_factor=2, 
                                   mode='bilinear', 
                                   align_corners=True)
out_pt = out_pt.data.numpy().transpose(0, 2, 3, 1)[0]

print(np.max(np.abs(out_pt - out_tf)))
# output 5.6624413e-06

But

img = cv2.resize(cv2.imread('./lenna.png')[:, :, ::-1], (256, 256))
img = img.reshape(1, 256, 256, 3).astype('float32') / 255.
img = tf.convert_to_tensor(img)
output_size = [512, 512]
output = tf.image.resize_bilinear(img, output_size, align_corners=False)
with tf.Session() as sess:
        values = sess.run([output])
out_tf = values[0].astype('float32')[0]

img = img.reshape(1, 256, 256, 3).transpose(0, 3, 1, 2).astype('float32') / 255.
out_pt = nn.functional.interpolate(torch.from_numpy(nimg), 
                                   scale_factor=2, 
                                   mode='bilinear', 
                                   align_corners=False)
out_pt = out_pt.data.numpy().transpose(0, 2, 3, 1)[0]

print(np.max(np.abs(out_pt - out_tf)))
# output 0.22745097

Output diff * 10:
image

Output of CoreML is consistent with TF, so it seems that there is a bug with implementation of bilinear interpolation with align_corners=False in Pytorch.

Diff is reproducible both on cpu and cuda with cudnn 7.1, cuda 9.1.

@libfun libfun changed the title Bilinear interpolation behavior inconsistent with TF and Caffe Bilinear interpolation behavior inconsistent with TF, CoreML and Caffe Aug 17, 2018
@ssnl
Copy link
Collaborator

ssnl commented Aug 17, 2018

Thanks for the report. We will look into this!

@libfun
Copy link
Author

libfun commented Aug 17, 2018

I've run a few experiments and it seems that Pytorch with align_corners=False is consistent with python-opencv, PIL and skimage with mode='edge'. I'll look into CoreML closer today.

@ailzhang
Copy link
Contributor

For pytorch implementation, here's the logic to compute src_idx <-> dst_idx mapping.
https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/linear_upsampling.h#L22
For TF implementation, scale is caculated in the same way. https://github.com/tensorflow/tensorflow/blob/f66daa493e7383052b2b44def2933f61faf196e0/tensorflow/core/kernels/image_resizer_state.h#L41
But the src_idx <-> dst_idx mapping is different. https://github.com/tensorflow/tensorflow/blob/6795a8c3a3678fb805b6a8ba806af77ddfe61628/tensorflow/core/kernels/resize_bilinear_op.cc#L85
I think Pytorch takes into account the pixel center when scaling, while TF doesn't.

@baowenbo
Copy link

For pytorch implementation, here's the logic to compute src_idx <-> dst_idx mapping.
https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/linear_upsampling.h#L22
For TF implementation, scale is caculated in the same way. https://github.com/tensorflow/tensorflow/blob/f66daa493e7383052b2b44def2933f61faf196e0/tensorflow/core/kernels/image_resizer_state.h#L41
But the src_idx <-> dst_idx mapping is different. https://github.com/tensorflow/tensorflow/blob/6795a8c3a3678fb805b6a8ba806af77ddfe61628/tensorflow/core/kernels/resize_bilinear_op.cc#L85
I think Pytorch takes into account the pixel center when scaling, while TF doesn't.

So do you think there is any workaround in Pytorch to make it's interpolation consistent with that of TensorFlow?
I thought it may be possible to try torch.nn.functional.grid_sample() of Pytorch.

@AmitGarage
Copy link

For pytorch implementation, here's the logic to compute src_idx <-> dst_idx mapping.
https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/linear_upsampling.h#L22
For TF implementation, scale is caculated in the same way. https://github.com/tensorflow/tensorflow/blob/f66daa493e7383052b2b44def2933f61faf196e0/tensorflow/core/kernels/image_resizer_state.h#L41
But the src_idx <-> dst_idx mapping is different. https://github.com/tensorflow/tensorflow/blob/6795a8c3a3678fb805b6a8ba806af77ddfe61628/tensorflow/core/kernels/resize_bilinear_op.cc#L85
I think Pytorch takes into account the pixel center when scaling, while TF doesn't.

So do you think there is any workaround in Pytorch to make it's interpolation consistent with that of TensorFlow? I thought it may be possible to try torch.nn.functional.grid_sample() of Pytorch.

Using half_pixel_centers = True along with align_corners = False in tf.image.resize_bilinear will work same as torch.nn.functional.interpolate with align_corners = False

@TheDevilWillBeBee
Copy link

TheDevilWillBeBee commented Aug 20, 2022

This is a PyTorch implementation of tf.compat.v1.image.resize_bilinear function.
The implemented function is consistent with tf.compat.v1.image.resize_bilinear function when half_pixel_centers = False and with align_corners = False.
The default implementation in PyTorch is consistent with the Tensorflow implementation when half_pixel_centers = True.

import torch
import numpy as np

def tf_consistent_bilinear_upsample(imgs, scale_factor=1.0):
    b, c, h, w = imgs.shape
    assert h == w
    
    N = int(h * scale_factor)
    delta = (1.0 / h)
    p = int(scale_factor) - 1
    
    xs = torch.linspace(-1.0 + delta, 1.0 - delta, N - p)
    ys = torch.linspace(-1.0 + delta, 1.0 - delta, N - p)
    grid = torch.meshgrid(xs, ys)
    gridy = grid[1]
    gridx = grid[0]
    gridx = torch.nn.functional.pad(gridx.unsqueeze(0), (0, p, 0, p), mode='replicate')[0]
    gridy = torch.nn.functional.pad(gridy.unsqueeze(0), (0, p, 0, p), mode='replicate')[0]
    grid = torch.stack([gridy, gridx], dim=-1).unsqueeze(0).repeat(b, 1, 1, 1)
    output = torch.nn.functional.grid_sample(imgs, grid, mode='bilinear', padding_mode='zeros')
    
    return output

The code is tested and the results are consistent with Tensorflow when the scale factor is an integer value.

@980202006
Copy link

This is the correct solution, I tested the result on 1 dimension, the loss is less than 0.01.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: interpolation triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

No branches or pull requests

8 participants