The program is stuck. #6

Open
suyanzhou626 opened this issue Dec 4, 2018 · 15 comments
@suyanzhou626

(pytorch-0.41) <phd-1@kbkb541-server pytorch-segmentation-toolbox>$CUDA_VISIBLE_DEVICES=0,1,2,3 sh ./run_local.sh /media/phd-1/syz/OCNet/dataset/cityscapes
Linux kbkb541-server 4.15.0-39-generic #42~16.04.1-Ubuntu SMP Wed Oct 24 17:09:54 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Tue Dec 4 17:25:46 CST 2018
ResNet(
(conv1): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu1): ReLU()
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu2): ReLU()
(conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn3): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu3): ReLU()
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=True)
(relu): ReLU()
(layer1): Sequential(
(0): Bottleneck(
(conv1): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
)
)
(1): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(2): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
)
(layer2): Sequential(
(0): Bottleneck(
(conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
)
)
(1): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(2): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(3): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
)
(layer3): Sequential(
(0): Bottleneck(
(conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
)
)
(1): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(2): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(3): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(4): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(5): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(6): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(7): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(8): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(9): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(10): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(11): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(12): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(13): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(14): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(15): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(16): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(17): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(18): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(19): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(20): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(21): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(22): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
)
(layer4): Sequential(
(0): Bottleneck(
(conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False)
(bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
)
)
(1): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False)
(bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(2): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False)
(bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
)
(head): Sequential(
(0): PSPModule(
(stages): ModuleList(
(0): Sequential(
(0): AdaptiveAvgPool2d(output_size=(1, 1))
(1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
)
(1): Sequential(
(0): AdaptiveAvgPool2d(output_size=(2, 2))
(1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
)
(2): Sequential(
(0): AdaptiveAvgPool2d(output_size=(3, 3))
(1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
)
(3): Sequential(
(0): AdaptiveAvgPool2d(output_size=(6, 6))
(1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
)
)
(bottleneck): Sequential(
(0): Conv2d(4096, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
(2): Dropout2d(p=0.1)
)
)
(1): Conv2d(512, 19, kernel_size=(1, 1), stride=(1, 1))
)
(dsn): Sequential(
(0): Conv2d(1024, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
(2): Dropout2d(p=0.1)
(3): Conv2d(512, 19, kernel_size=(1, 1), stride=(1, 1))
)
)
/home/phd-1/.conda/envs/pytorch-0.41/lib/python3.6/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='elementwise_mean' instead.
warnings.warn(warning.format(ret))
321300 images are loaded!

It doesn't continue past this point. Why? I think it may be because of InPlaceABNSync. How can I solve it?

@speedinghzl
Owner

Hi @suyanzhou626, I cannot identify the problem from this information. First, please make sure your data loader can access the images and labels.
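As a quick sanity check along these lines (a minimal sketch: the list-file path and the image/label column layout are assumptions, so adjust both to your local setup):

```sh
# Hypothetical paths: verify that the first few train-list entries resolve to real files.
# Assumes each list line holds an image path and a label path, relative to the data root.
DATA_ROOT=/media/phd-1/syz/OCNet/dataset/cityscapes
head -n 5 ./dataset/list/cityscapes/train.lst | while read img lbl; do
  ls -l "$DATA_ROOT/$img" "$DATA_ROOT/$lbl"
done
```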

@lxtGH

lxtGH commented Dec 9, 2018

@suyanzhou626 Hi, I ran into the same problem as you. Have you solved it?

@lzrobots

lzrobots commented Jan 7, 2019

Same problem here, but my program got stuck right after printing iteration 1, with no error output.
I am running on 4x 12 GB GPUs (three TITAN Xp, one TITAN V) with batch size 4x2=8.

If I reduce the batch to 4x1=4 with the default crop size of 769, or keep the default batch of 4x2=8 with a smaller crop size of 761, the program runs fine.

So it seems to be a memory problem, but there is no out-of-memory error; the program just hangs (one way to confirm the memory pressure is sketched below).
@speedinghzl Any thoughts?
Thanks.
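A minimal way to check is to watch per-GPU memory usage while the job runs:

```sh
# Refresh nvidia-smi every second; look for one card pinned at its memory limit
watch -n 1 nvidia-smi
```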

@speedinghzl
Owner

@lzrobots Yes, it is caused by running out of memory. Is the TITAN V in the first position in your server? If so, you can change the order of the GPU ids (e.g. 1,0,2,3) to solve this problem. Alternatively, you can run this repo with crop size 761; it does not affect the final result.
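For example, reusing the launch command from the first post, reordering the visible devices so the TITAN V is no longer device 0 would look like this:

```sh
# Same run_local.sh invocation, but GPU 1 is now mapped to device 0
CUDA_VISIBLE_DEVICES=1,0,2,3 sh ./run_local.sh /media/phd-1/syz/OCNet/dataset/cityscapes
```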

@lzrobots

lzrobots commented Jan 7, 2019

Yes, that solved it. Thanks!

@aasharma90

aasharma90 commented Jan 12, 2019

I am facing the same problem as @lzrobots. I tried BS=8 with INPUT_SIZE=[769, 769] or [761, 761]; training gets stuck after iteration 1. I have 4x 12 GB 1080 Ti GPUs.
With a smaller BS, say BS=4, the program runs well, but I'm afraid that may affect the final performance. Any suggestions, @speedinghzl?

[EDIT] It is still stuck even with a lower input size, [713, 713]. Only lowering BS seems to help. Is there any workaround?

@speedinghzl
Owner

The 1080 Ti only has 11 GB of memory; you can try lowering the batch size. But I think it will affect the performance (~77% rather than ~78%).

@aasharma90

Hi @speedinghzl, thanks for your swift response. Yes, the available memory is only around 11 GB. I think I can live with that much performance difference, so I will proceed with a lower BS.

Thanks for your help!

speedinghzl reopened this Jan 12, 2019
@aasharma90

Hi @speedinghzl,

Just for your information: with BS=4 and everything else kept as it was, I got an mIoU of ~75.8%.

[screenshot of evaluation results]

@speedinghzl
Owner

speedinghzl commented Jan 14, 2019

When you set BS=4, you should increase the number of iterations from 40K to 80K. You can then also increase the input size so that training uses the full ~11 GB of memory.
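As a sketch of what that change might look like (the flag names --batch-size, --num-steps, and --input-size are assumptions; check run_local.sh and train.py for the actual arguments):

```sh
# Hypothetical flag names; adjust to whatever train.py actually parses
python train.py --data-dir /media/phd-1/syz/OCNet/dataset/cityscapes \
                --batch-size 4 \
                --num-steps 80000 \
                --input-size 769,769
```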

@aasharma90

aasharma90 commented Jan 14, 2019

Hi @speedinghzl,

Your suggestions make sense. I'll try them out now and update you with the outcome. Thanks for your help!

@d-li14

d-li14 commented Jan 15, 2019

@speedinghzl Sorry, I have the same issue; my program gets stuck even on 4x TITAN Xp with the default settings.

@aasharma90

aasharma90 commented Jan 15, 2019

Hi @speedinghzl,
I get a score of 76.22% after changing STEPS from 40k to 80k (keeping BS=4 and everything else as it is).
[screenshot of evaluation results]
I did not try a larger input size, but that also seems an option worth trying, since there is still some memory left that could be used. Thanks for your help!

Hi @d-li14,
Could you try running with BS=4 and STEPS=80k and see whether that solves your problem? The performance numbers reported above are for your reference.

@d-li14

d-li14 commented Jan 15, 2019

@aasharma90 Thanks for your kind advice! Shrinking the batch size definitely makes the model fit into GPU memory with ease, but we have to sacrifice performance, as demonstrated by your experiments (the result is significantly lower than the reported 78.9%, so the original DeepLab result cannot even be reproduced).

I am curious: the author states that 4x 12 GB of VRAM should be enough to run the script successfully, but in my case it does not seem to work. Any helpful advice? @speedinghzl

@d-li14

d-li14 commented Jan 17, 2019

Actually, 4x 12 GB of VRAM is not enough. I ran run_local.sh on 4 Tesla M40 GPUs, and the memory usage of GPU 0 was over 12,800 MB. Without modifying any of the script's default settings, I got a final mIoU of 77.4%.
