
Adds OUTPUT_PADDING to ConvTrans2D #890

Open
wants to merge 8 commits into main

Conversation

swfsql (Contributor) commented Nov 16, 2023

Closes #889.

  • The only change is essentially adding OUTPUT_PADDING to the output size calculation (i.e. to the Convolved const value, etc.); see the output-size sketch after this list.
  • Although I'm marking this as "ready for review", it is in a draft state because I can't say whether it is correct. It needs a reviewer who understands the conv inner workings.
    • Still, a quick and simple forward test gives the same result as pytorch.
      • Note: the Tensorflow result differs from both dfdx and pytorch when testing with output_padding=1.
    • AFAIK the backprop also works, but I haven't added an explicit test for it (a sketch for generating reference gradients follows the added test below).
  • I'm unsure where to add the generic parameter. In pytorch, both in the documentation and in the overall parameter ordering, output_padding appears right after padding, but in this PR I've added it as a new (last) parameter, i.e. after groups.
    • This should still break code that uses ConvTrans2D directly as a built model, because the generated structure already has the <E, D> generics as its last parameters, and OUTPUT_PADDING would come before them.
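
For reference, here is a minimal sketch of the output-size calculation this PR extends, using the formula pytorch documents for ConvTranspose2d; the helper name is hypothetical, and the mapping onto dfdx's Convolved const is just my reading of the change:

```python
def convtrans2d_out(in_size, kernel, stride, padding, dilation, output_padding):
    # pytorch's documented ConvTranspose2d output size; this PR adds the
    # output_padding term to the equivalent dfdx calculation
    return (in_size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# the two cases exercised by the added test below (2x2 input, k=3, s=2, p=1, d=1):
assert convtrans2d_out(2, 3, 2, 1, 1, 0) == 3  # OUTPUT_PADDING = 0 -> 3x3
assert convtrans2d_out(2, 3, 2, 1, 1, 1) == 4  # OUTPUT_PADDING = 1 -> 4x4
```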

Added test:

```rust
#[rustfmt::skip]
#[test]
fn test_forward_output_padding() {
    let dev: TestDevice = Default::default();
    let x = dev.tensor([[[[0.1, 0.7], [0.3, 0.4]]]]);
    let w = dev.tensor([[[[-0.1, -0.3, 0.7], [0.8, -0.2, 0.1], [0.3, 0.4, -0.5]]]]);

    // OUTPUT_PADDING = 0 -> 3x3 output
    let mut m = dev
        .build_module::<TestDtype>(<ConvTrans2DConstConfig<1, 1, 3, 2, 1, 1, 1, 0>>::default());
    m.weight = w.clone();
    let y: Tensor<Rank4<1, 1, 3, 3>, _, _, _> = m.forward(x.clone());
    assert_close_to_literal!(y, [[[[-0.02, 0.57, -0.14], [-0.05, 0.33, 0.16], [-0.06, 0.35000002, -0.08000001]]]]);

    // OUTPUT_PADDING = 1 -> 4x4 output
    let mut m = dev
        .build_module::<TestDtype>(<ConvTrans2DConstConfig<1, 1, 3, 2, 1, 1, 1, 1>>::default());
    m.weight = w.clone();
    let y: Tensor<Rank4<1, 1, 4, 4>, _, _, _> = m.forward(x.clone());
    assert_close_to_literal!(
        y,
        [[[
            [-0.0200, 0.5700, -0.1400, 0.0700],
            [-0.0500, 0.3300, 0.1600, -0.0700],
            [-0.0600, 0.3500, -0.0800, 0.0400],
            [0.1200, -0.0300, 0.1600, -0.2000],
        ]]]
    );
}
```
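
Since backprop lacks an explicit test, here is a hedged sketch of how reference gradients could be generated with pytorch for a future dfdx test; this is not part of the PR, just one possible approach:

```python
import torch

x = torch.tensor([[[[0.1, 0.7], [0.3, 0.4]]]], requires_grad=True)
m = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, padding=1, output_padding=1, bias=False)
with torch.no_grad():
    m.weight.copy_(torch.tensor([[[[-0.1, -0.3, 0.7], [0.8, -0.2, 0.1], [0.3, 0.4, -0.5]]]]))

# reduce to a scalar so .backward() is well-defined, then read the gradients
m(x).sum().backward()
print(x.grad)        # reference gradient w.r.t. the input
print(m.weight.grad) # reference gradient w.r.t. the kernel
```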

Reference pytorch test:

```python
import torch
import numpy as np

x = np.array([[[[0.1, 0.7], [0.3, 0.4]]]])
w = np.array([[[[-0.1, -0.3, 0.7], [0.8, -0.2, 0.1], [0.3, 0.4, -0.5]]]])

a = torch.nn.ConvTranspose2d(output_padding=0, in_channels=1, out_channels=1, kernel_size=3, stride=2, padding=1, bias=False)
b = torch.nn.ConvTranspose2d(output_padding=1, in_channels=1, out_channels=1, kernel_size=3, stride=2, padding=1, bias=False)

x = torch.from_numpy(x).float()
w0 = torch.from_numpy(w).float()

with torch.no_grad():
    a.weight = torch.nn.Parameter(w0)
    b.weight = torch.nn.Parameter(w0)

ya = a(x)
yb = b(x)

print(ya.size())  # torch.Size([1, 1, 3, 3])
print(yb.size())  # torch.Size([1, 1, 4, 4])

print(ya)
print(yb)
```

Reference tensorflow test (which differs):

```python
import tensorflow as tf
import numpy as np

x = np.array([[[[0.1, 0.7], [0.3, 0.4]]]])
# kernel (shape (3, 3, 1, 1)) plus a zero bias, in the layout set_weights expects
w = [
    np.array([
        [[[-0.1]], [[-0.3]], [[0.7]]],
        [[[0.8]], [[-0.2]], [[0.1]]],
        [[[0.3]], [[0.4]], [[-0.5]]],
    ]),
    np.array([0.0]),
]

print(x.shape)  # (1, 1, 2, 2)

a = tf.keras.layers.Conv2DTranspose(output_padding=0, filters=1, kernel_size=3, strides=2, padding='same', data_format='channels_first')
b = tf.keras.layers.Conv2DTranspose(output_padding=1, filters=1, kernel_size=3, strides=2, padding='same', data_format='channels_first')

# call the layers once so they get built, then overwrite the weights
ya = a(x).numpy()
yb = b(x).numpy()

a.set_weights(w)
b.set_weights(w)
ya = a(x).numpy()
yb = b(x).numpy()

print(ya.shape)  # (1, 1, 3, 3)
print(yb.shape)  # (1, 1, 4, 4)

print(ya)  # ya is the same for both torch and tf
print(yb)  # yb differs between torch and tf

# in torch, the top-left 3x3 block of yb matches ya
# in tf, the bottom-right 3x3 block of yb matches ya
```
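
To make those last two comments concrete, here is a small numpy check; the ya_torch/yb_torch and ya_tf/yb_tf names are hypothetical, assuming the four outputs from the scripts above were saved as numpy arrays:

```python
import numpy as np

# hypothetical names: ya_torch/yb_torch and ya_tf/yb_tf hold the outputs
# of the two reference scripts above, converted to numpy arrays

# torch appends the extra row/column at the bottom-right:
# yb's top-left 3x3 block equals ya
assert np.allclose(yb_torch[..., :3, :3], ya_torch, atol=1e-6)

# tf prepends the extra row/column at the top-left:
# yb's bottom-right 3x3 block equals ya
assert np.allclose(yb_tf[..., 1:, 1:], ya_tf, atol=1e-6)
```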

swfsql marked this pull request as ready for review on November 17, 2023 02:14
rainiwu and others added 8 commits January 26, 2024 00:29
- Draft state.
- Unsure if correct, but a very simple and quick test gives the same
  result as pytorch.
- Note: the Tensorflow result differs, both from dfdx and from pytorch.
