
Layer::Forward does not work when running faster-rcnn inference #90

Open
spacegrass opened this issue Oct 28, 2018 · 6 comments

@spacegrass

I found that the master branch has changed Layer::Forward to this:

inline void Layer::Forward(const vector<Blob*>& bottom,
                           const vector<Blob*>& top) {
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Forward_cpu(bottom, top);
    break;
  case Caffe::GPU:
    Forward_gpu(bottom, top);
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}

There is no Reshape() call before the layer actually does its forward pass.
This change breaks faster-rcnn, because that kind of network reshapes a layer's top blobs from its bottom blobs at runtime.
I think this case should be supported. Thanks.
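For reference, upstream BVLC Caffe calls Reshape at the top of Layer::Forward. A minimal sketch of what restoring that call here might look like (untested; everything except the added Reshape line is the dispatch shown above):

inline void Layer::Forward(const vector<Blob*>& bottom,
                           const vector<Blob*>& top) {
  // Recompute the top blob shapes from the current bottom shapes before
  // dispatching, as upstream Caffe's layer.hpp does.
  Reshape(bottom, top);
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Forward_cpu(bottom, top);
    break;
  case Caffe::GPU:
    Forward_gpu(bottom, top);
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}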

@luoyetx
Owner

luoyetx commented Oct 29, 2018

All the reshape work is done in PlaceMemory. When the shape of the input blob changes, every internal blob changes its shape and the memory buffers are reallocated.
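Roughly, the intended flow is: resize the input blob, then call Forward, and PlaceMemory re-plans every internal blob before any layer runs. A sketch, assuming the Net/Blob interface mirrors Caffe's (blob_by_name, Reshape, Forward); the file paths and sizes are placeholders:

caffe::Net net("faster_rcnn.prototxt");          // placeholder paths
net.CopyTrainedLayersFrom("faster_rcnn.caffemodel");
int new_height = 600, new_width = 800;           // placeholder input size
auto data = net.blob_by_name("data");
data->Reshape(1, 3, new_height, new_width);      // input shape changes here
net.Forward();                                   // internals are reshaped/realloc'd first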

@spacegrass
Author

spacegrass commented Nov 1, 2018

Yes, I have read the code. It reshapes every layer of the net BEFORE the net forward. A network like faster-rcnn changes downstream layer shapes at forward time, so I don't think PlaceMemory fixes it.
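To make it concrete: in faster-rcnn the proposal layer only learns how many boxes survive NMS inside Forward. A rough sketch (RunNms is a hypothetical helper, and the exact Blob::Reshape signature is assumed):

void ProposalLayer::Forward_cpu(const vector<Blob*>& bottom,
                                const vector<Blob*>& top) {
  // hypothetical helper: the number of proposals surviving NMS is only
  // known now, at forward time, not when the net was reshaped up front
  int num_kept = RunNms(bottom);
  top[0]->Reshape(num_kept, 5, 1, 1);  // top shape depends on this input's data
  // ... fill top[0] with the kept rois ...
}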

@luoyetx
Owner

luoyetx commented Nov 3, 2018

The layer itself has all the shape info about its input blobs, so it should be able to compute the shape of its output blobs when its Reshape function is called.
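For most layers that is straightforward. For example, an elementwise layer's Reshape is just (a sketch, assuming the Caffe-style Blob interface with ReshapeLike):

void SigmoidLayer::Reshape(const vector<Blob*>& bottom,
                           const vector<Blob*>& top) {
  // elementwise: the output shape equals the input shape,
  // known as soon as the bottom shape is known
  top[0]->ReshapeLike(*bottom[0]);
}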

@luoyetx
Owner

luoyetx commented Nov 3, 2018

For the proposal layer, check out the code here. We set a maximum shape for the output rois.
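The idea is to reserve the worst case in Reshape and let Forward fill only what NMS keeps. A sketch, with post_nms_top_n_ standing in for whatever the configured cap is actually named:

void ProposalLayer::Reshape(const vector<Blob*>& bottom,
                            const vector<Blob*>& top) {
  // reserve the maximum: at most post_nms_top_n_ rois, 5 values each
  // (batch index + 4 box coordinates); Forward only writes the kept ones
  top[0]->Reshape(post_nms_top_n_, 5, 1, 1);
}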

@yanhn

yanhn commented Dec 6, 2018

@luoyetx I ran into a strange problem when calling Net::CopyTrainedLayersFrom under different modes (caffe::GPU vs caffe::CPU). The error is:

C:\workspace\opensource\mini-caffe\src\net.cpp:277: Cannot copy param 0 weights from layer '221'; shape mismatch. Source param shape is 1 64 1 1 (64); target param shape is 64 1 1 1 (64). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

My last two layers have the following prototxt:

layer {
  name: "221"
  type: "Convolution"
  bottom: "220"
  top: "221"
  convolution_param {
    num_output: 1
    bias_term: true
    group: 1
    pad: 0
    kernel_size: 1
    stride: 1
    dilation: 1
  }
}
layer {
  name: "output"
  type: "Sigmoid"
  bottom: "221"
  top: "output"
}

1. Under CPU mode, everything is fine. The conv layer's target weight param has shape 1 x 64 x 1 x 1, with bias shape (1), and loads correctly from the caffemodel into caffe::Net.
2. Under GPU mode, the conv layer's target weight param has shape 64 x 1 x 1 x 1, with bias shape (64), so the params can't be loaded because of the mismatch between my caffemodel (which is correct) and the caffe::Net built from the prototxt (which is wrong).
Do you have any idea about this strange problem?

I tried adding a special case at line 259 of net.cpp:

if (source_layer_name == "221") {
  const bool kReshape = true;
  // force the target blob to adopt the source blob's shape
  target_blobs[j]->FromProto(source_layer.blobs(j), kReshape);
  printf("after copy proto blob no.%d: shape is %s\n", j,
         target_blobs[j]->shape_string().c_str());
  continue;
}

It helped with Net::CopyTrainedLayersFrom, but when I call net.Forward() the same shape mismatch occurs again. The only difference in my code is the mode (caffe::GPU vs caffe::CPU).

@yanhn

yanhn commented Dec 10, 2018

Recently I did some code reading and debugging. The results show that:
1. Net::Reshape() is called during Net::Forward, and that is where the shape of my Convolution changes to the mismatching one.
2. I put some log statements in BaseConvolutionLayer::Reshape and found that every Convolution layer prints them except my last Convolution with kernel_size=1 and num_output=1. (I only use this conv layer once, for feature dimension reduction.)
3. Testing other models on other machines shows the same problem: pycaffe (gpu & cpu) ok; c++ cpu ok; c++ gpu not ok.
Is there any advice to help me out? Thanks.
