The program is stuck. #6

Open
suyanzhou626 opened this issue Dec 4, 2018 · 15 comments
@suyanzhou626

(pytorch-0.41) <phd-1@kbkb541-server pytorch-segmentation-toolbox>$CUDA_VISIBLE_DEVICES=0,1,2,3 sh ./run_local.sh /media/phd-1/syz/OCNet/dataset/cityscapes
Linux kbkb541-server 4.15.0-39-generic #42~16.04.1-Ubuntu SMP Wed Oct 24 17:09:54 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Tue Dec 4 17:25:46 CST 2018
ResNet(
(conv1): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu1): ReLU()
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu2): ReLU()
(conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn3): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu3): ReLU()
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=True)
(relu): ReLU()
(layer1): Sequential(
(0): Bottleneck(
(conv1): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
)
)
(1): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(2): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
)
(layer2): Sequential(
(0): Bottleneck(
(conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
)
)
(1): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(2): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(3): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
)
(layer3): Sequential(
(0): Bottleneck(
(conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
)
)
(1): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(2): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(3): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(4): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(5): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(6): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(7): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(8): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(9): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(10): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(11): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(12): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(13): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(14): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(15): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(16): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(17): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(18): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(19): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(20): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(21): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(22): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
)
(layer4): Sequential(
(0): Bottleneck(
(conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False)
(bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
(downsample): Sequential(
(0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
)
)
(1): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False)
(bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
(2): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False)
(bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none)
(relu): ReLU()
(relu_inplace): ReLU(inplace)
)
)
(head): Sequential(
(0): PSPModule(
(stages): ModuleList(
(0): Sequential(
(0): AdaptiveAvgPool2d(output_size=(1, 1))
(1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
)
(1): Sequential(
(0): AdaptiveAvgPool2d(output_size=(2, 2))
(1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
)
(2): Sequential(
(0): AdaptiveAvgPool2d(output_size=(3, 3))
(1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
)
(3): Sequential(
(0): AdaptiveAvgPool2d(output_size=(6, 6))
(1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
)
)
(bottleneck): Sequential(
(0): Conv2d(4096, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
(2): Dropout2d(p=0.1)
)
)
(1): Conv2d(512, 19, kernel_size=(1, 1), stride=(1, 1))
)
(dsn): Sequential(
(0): Conv2d(1024, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01)
(2): Dropout2d(p=0.1)
(3): Conv2d(512, 19, kernel_size=(1, 1), stride=(1, 1))
)
)
/home/phd-1/.conda/envs/pytorch-0.41/lib/python3.6/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='elementwise_mean' instead.
warnings.warn(warning.format(ret))
321300 images are loaded!

It doesn't continue past this point. Why? I think it may be because of InPlaceABNSync. How can I solve it?

@speedinghzl
Owner

Hi @suyanzhou626, I cannot identify the problem from this information. First, please make sure your data loader can access the images and labels.
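As a quick sanity check along these lines (a minimal sketch: the list-file path and the image/label column layout are assumptions, so adjust both to your local setup):

```sh
# Hypothetical paths: verify that the first few train-list entries resolve to real files.
# Assumes each list line holds an image path and a label path, relative to the data root.
DATA_ROOT=/media/phd-1/syz/OCNet/dataset/cityscapes
head -n 5 ./dataset/list/cityscapes/train.lst | while read img lbl; do
  ls -l "$DATA_ROOT/$img" "$DATA_ROOT/$lbl"
done
```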

@lxtGH

lxtGH commented Dec 9, 2018

@suyanzhou626 Hi, I ran into the same problem as you. Have you solved it?

@lzrobots

lzrobots commented Jan 7, 2019

Same problem here, but my program got stuck right after printing iteration 1, with no error output.
I am running on 4x 12 GB GPUs (three TITAN Xp, one TITAN V) with batch size 4x2=8.

If I reduce the batch to 4x1=4 with the default crop size of 769, or keep the default batch of 4x2=8 with a smaller crop size of 761, the program runs fine.

So it seems to be a memory problem, but there is no out-of-memory error; the program just hangs (one way to confirm the memory pressure is sketched below).
@speedinghzl Any thoughts?
Thanks.
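A minimal way to check is to watch per-GPU memory usage while the job runs:

```sh
# Refresh nvidia-smi every second; look for one card pinned at its memory limit
watch -n 1 nvidia-smi
```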

@speedinghzl
Owner

@lzrobots Yes, it is caused by running out of memory. Is the TITAN V in the first position in your server? If so, you can change the order of the GPU ids (e.g. 1,0,2,3) to solve this problem. Alternatively, you can run this repo with crop size 761; it does not affect the final result.
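For example, reusing the launch command from the first post, reordering the visible devices so the TITAN V is no longer device 0 would look like this:

```sh
# Same run_local.sh invocation, but GPU 1 is now mapped to device 0
CUDA_VISIBLE_DEVICES=1,0,2,3 sh ./run_local.sh /media/phd-1/syz/OCNet/dataset/cityscapes
```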

@lzrobots

lzrobots commented Jan 7, 2019

Yes, that solved it. Thanks!

@aasharma90

aasharma90 commented Jan 12, 2019

I am facing the same problem as @lzrobots. I tried BS=8 with INPUT_SIZE=[769, 769] or [761, 761]; training gets stuck after iteration 1. I have 4x 12 GB 1080 Ti GPUs.
With a smaller BS, say BS=4, the program runs well, but I'm afraid that may affect the final performance. Any suggestions, @speedinghzl?

[EDIT] It is still stuck even with a lower input size, [713, 713]. Only lowering BS seems to help. Is there any workaround?

@speedinghzl
Owner

The 1080 Ti only has 11 GB of memory; you can try lowering the batch size. But I think it will affect the performance (~77% rather than ~78%).

@aasharma90

Hi @speedinghzl, thanks for your swift response. Yes, the available memory is only around 11 GB. I think I can live with that much performance difference, so I will proceed with a lower BS.

Thanks for your help!

speedinghzl reopened this Jan 12, 2019
@aasharma90

Hi @speedinghzl,

Just for your information: with BS=4 and everything else kept as it was, I got an mIoU of ~75.8%.

[screenshot of evaluation results]

@speedinghzl
Owner

speedinghzl commented Jan 14, 2019

When you set BS=4, you should increase the number of iterations from 40K to 80K. You can then also increase the input size so that training uses the full ~11 GB of memory.
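As a sketch of what that change might look like (the flag names --batch-size, --num-steps, and --input-size are assumptions; check run_local.sh and train.py for the actual arguments):

```sh
# Hypothetical flag names; adjust to whatever train.py actually parses
python train.py --data-dir /media/phd-1/syz/OCNet/dataset/cityscapes \
                --batch-size 4 \
                --num-steps 80000 \
                --input-size 769,769
```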

@aasharma90

aasharma90 commented Jan 14, 2019

Hi @speedinghzl,

Your suggestions make sense. I'll try them out now and update you with the outcome. Thanks for your help!

@d-li14

d-li14 commented Jan 15, 2019

@speedinghzl Sorry, I have the same issue; my program gets stuck even on 4x TITAN Xp with the default settings.

@aasharma90

aasharma90 commented Jan 15, 2019

Hi @speedinghzl,
I get a score of 76.22% after changing STEPS from 40k to 80k (keeping BS=4 and everything else as it is).
[screenshot of evaluation results]
I did not try a larger input size, but that also seems an option worth trying, since there is still some memory left that could be used. Thanks for your help!

Hi @d-li14,
Could you try running with BS=4 and STEPS=80k and see whether that solves your problem? The performance numbers reported above are for your reference.

@d-li14

d-li14 commented Jan 15, 2019

@aasharma90 Thanks for your kind advice! Shrinking the batch size definitely makes the model fit into GPU memory with ease, but we have to sacrifice performance, as demonstrated by your experiments (the result is significantly lower than the reported 78.9%, so the original DeepLab result cannot even be reproduced).

I am curious: the author states that 4x 12 GB of VRAM should be enough to run the script successfully, but in my case it does not seem to work. Any helpful advice? @speedinghzl

@d-li14

d-li14 commented Jan 17, 2019

Actually, 4x 12 GB of VRAM is not enough. I ran run_local.sh on 4 Tesla M40 GPUs, and the memory usage of GPU 0 was over 12,800 MB. Without modifying any of the script's default settings, I got a final mIoU of 77.4%.
