
Adds OUTPUT_PADDING to ConvTrans2D #890

Open
wants to merge 8 commits into main

Conversation

swfsql (Contributor) commented Nov 16, 2023

Closes #889.

  • The only change is essentially adding OUTPUT_PADDING to the output size calculation (i.e. to the Convolved const value, etc.); see the output-size sketch after this list.
  • Although I'm marking this as "ready for review", it is in a draft state because I can't say whether it is correct. It needs a reviewer who understands the conv inner workings.
    • Still, a quick and simple forward test gives the same result as pytorch.
      • Note: the Tensorflow result differs from both dfdx and pytorch when testing with output_padding=1.
    • AFAIK the backprop also works, but I haven't added an explicit test for it (a sketch for generating reference gradients follows the added test below).
  • I'm unsure where to add the generic parameter. In pytorch, both in the documentation and in the overall parameter ordering, output_padding appears right after padding, but in this PR I've added it as a new (last) parameter, i.e. after groups.
    • This should still break code that uses ConvTrans2D directly as a built model, because the generated structure already has the <E, D> generics as its last parameters, and OUTPUT_PADDING would come before them.
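
For reference, here is a minimal sketch of the output-size calculation this PR extends, using the formula pytorch documents for ConvTranspose2d; the helper name is hypothetical, and the mapping onto dfdx's Convolved const is just my reading of the change:

```python
def convtrans2d_out(in_size, kernel, stride, padding, dilation, output_padding):
    # pytorch's documented ConvTranspose2d output size; this PR adds the
    # output_padding term to the equivalent dfdx calculation
    return (in_size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# the two cases exercised by the added test below (2x2 input, k=3, s=2, p=1, d=1):
assert convtrans2d_out(2, 3, 2, 1, 1, 0) == 3  # OUTPUT_PADDING = 0 -> 3x3
assert convtrans2d_out(2, 3, 2, 1, 1, 1) == 4  # OUTPUT_PADDING = 1 -> 4x4
```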

Added test:

```rust
#[rustfmt::skip]
#[test]
fn test_forward_output_padding() {
    let dev: TestDevice = Default::default();
    let x = dev.tensor([[[[0.1, 0.7], [0.3, 0.4]]]]);
    let w = dev.tensor([[[[-0.1, -0.3, 0.7], [0.8, -0.2, 0.1], [0.3, 0.4, -0.5]]]]);

    // OUTPUT_PADDING = 0 -> 3x3 output
    let mut m = dev
        .build_module::<TestDtype>(<ConvTrans2DConstConfig<1, 1, 3, 2, 1, 1, 1, 0>>::default());
    m.weight = w.clone();
    let y: Tensor<Rank4<1, 1, 3, 3>, _, _, _> = m.forward(x.clone());
    assert_close_to_literal!(y, [[[[-0.02, 0.57, -0.14], [-0.05, 0.33, 0.16], [-0.06, 0.35000002, -0.08000001]]]]);

    // OUTPUT_PADDING = 1 -> 4x4 output
    let mut m = dev
        .build_module::<TestDtype>(<ConvTrans2DConstConfig<1, 1, 3, 2, 1, 1, 1, 1>>::default());
    m.weight = w.clone();
    let y: Tensor<Rank4<1, 1, 4, 4>, _, _, _> = m.forward(x.clone());
    assert_close_to_literal!(
        y,
        [[[
            [-0.0200, 0.5700, -0.1400, 0.0700],
            [-0.0500, 0.3300, 0.1600, -0.0700],
            [-0.0600, 0.3500, -0.0800, 0.0400],
            [0.1200, -0.0300, 0.1600, -0.2000],
        ]]]
    );
}
```
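
Since backprop lacks an explicit test, here is a hedged sketch of how reference gradients could be generated with pytorch for a future dfdx test; this is not part of the PR, just one possible approach:

```python
import torch

x = torch.tensor([[[[0.1, 0.7], [0.3, 0.4]]]], requires_grad=True)
m = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, padding=1, output_padding=1, bias=False)
with torch.no_grad():
    m.weight.copy_(torch.tensor([[[[-0.1, -0.3, 0.7], [0.8, -0.2, 0.1], [0.3, 0.4, -0.5]]]]))

# reduce to a scalar so .backward() is well-defined, then read the gradients
m(x).sum().backward()
print(x.grad)        # reference gradient w.r.t. the input
print(m.weight.grad) # reference gradient w.r.t. the kernel
```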

Reference pytorch test:

```python
import torch
import numpy as np

x = np.array([[[[0.1, 0.7], [0.3, 0.4]]]])
w = np.array([[[[-0.1, -0.3, 0.7], [0.8, -0.2, 0.1], [0.3, 0.4, -0.5]]]])

a = torch.nn.ConvTranspose2d(output_padding=0, in_channels=1, out_channels=1, kernel_size=3, stride=2, padding=1, bias=False)
b = torch.nn.ConvTranspose2d(output_padding=1, in_channels=1, out_channels=1, kernel_size=3, stride=2, padding=1, bias=False)

x = torch.from_numpy(x).float()
w0 = torch.from_numpy(w).float()

with torch.no_grad():
    a.weight = torch.nn.Parameter(w0)
    b.weight = torch.nn.Parameter(w0)

ya = a(x)
yb = b(x)

print(ya.size())  # torch.Size([1, 1, 3, 3])
print(yb.size())  # torch.Size([1, 1, 4, 4])

print(ya)
print(yb)
```

Reference tensorflow test (which differs):

```python
import tensorflow as tf
import numpy as np

x = np.array([[[[0.1, 0.7], [0.3, 0.4]]]])
# kernel (shape (3, 3, 1, 1)) plus a zero bias, in the layout set_weights expects
w = [
    np.array([
        [[[-0.1]], [[-0.3]], [[0.7]]],
        [[[0.8]], [[-0.2]], [[0.1]]],
        [[[0.3]], [[0.4]], [[-0.5]]],
    ]),
    np.array([0.0]),
]

print(x.shape)  # (1, 1, 2, 2)

a = tf.keras.layers.Conv2DTranspose(output_padding=0, filters=1, kernel_size=3, strides=2, padding='same', data_format='channels_first')
b = tf.keras.layers.Conv2DTranspose(output_padding=1, filters=1, kernel_size=3, strides=2, padding='same', data_format='channels_first')

# call the layers once so they get built, then overwrite the weights
ya = a(x).numpy()
yb = b(x).numpy()

a.set_weights(w)
b.set_weights(w)
ya = a(x).numpy()
yb = b(x).numpy()

print(ya.shape)  # (1, 1, 3, 3)
print(yb.shape)  # (1, 1, 4, 4)

print(ya)  # ya is the same for both torch and tf
print(yb)  # yb differs between torch and tf

# in torch, the top-left 3x3 block of yb matches ya
# in tf, the bottom-right 3x3 block of yb matches ya
```
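
To make those last two comments concrete, here is a small numpy check; the ya_torch/yb_torch and ya_tf/yb_tf names are hypothetical, assuming the four outputs from the scripts above were saved as numpy arrays:

```python
import numpy as np

# hypothetical names: ya_torch/yb_torch and ya_tf/yb_tf hold the outputs
# of the two reference scripts above, converted to numpy arrays

# torch appends the extra row/column at the bottom-right:
# yb's top-left 3x3 block equals ya
assert np.allclose(yb_torch[..., :3, :3], ya_torch, atol=1e-6)

# tf prepends the extra row/column at the top-left:
# yb's bottom-right 3x3 block equals ya
assert np.allclose(yb_tf[..., 1:, 1:], ya_tf, atol=1e-6)
```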

swfsql marked this pull request as ready for review on November 17, 2023 02:14
rainiwu and others added 8 commits January 26, 2024 00:29
- Draft state.
- Unsure if correct, but a very simple and quick test gives the same
  result as pytorch.
- Note: the Tensorflow result differs, both from dfdx and from pytorch.
