Why use sigmoid cross entropy instead of softmax in RPN? #406

Closed
twmht opened this issue May 2, 2018 · 3 comments

Comments

twmht commented May 2, 2018

Hi,

In the original Faster R-CNN, you used a softmax loss when training the RPN (https://github.com/rbgirshick/py-faster-rcnn/blob/master/models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_rpn_train.pt#L447).

In FPN, you use sigmoid cross entropy for the RPN loss
(https://github.com/facebookresearch/Detectron/blob/master/lib/modeling/FPN.py#L459).

In my experiments, I found that RPN recall dropped about 4 points when using sigmoid cross entropy.

So why use sigmoid cross entropy in FPN? Have you tried the softmax loss?

Thanks!
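
For reference, here is a rough NumPy sketch of the two formulations I am comparing (shapes and names are illustrative, not Detectron's; real RPN training also ignores anchors labeled -1):

```python
import numpy as np

rng = np.random.default_rng(0)
A, H, W = 15, 4, 4                            # anchors per location, map size
labels = rng.integers(0, 2, size=(A, H, W))   # 1 = foreground, 0 = background

# Sigmoid cross entropy (FPN style): one logit per anchor.
logits_sig = rng.normal(size=(A, H, W))
p = 1.0 / (1.0 + np.exp(-logits_sig))
loss_sigmoid = -(labels * np.log(p) + (1 - labels) * np.log(1 - p)).mean()

# Softmax cross entropy (py-faster-rcnn style): a (bg, fg) logit pair per anchor.
logits_soft = rng.normal(size=(2, A, H, W))   # axis 0: class (bg, fg)
z = logits_soft - logits_soft.max(axis=0, keepdims=True)
log_probs = z - np.log(np.exp(z).sum(axis=0, keepdims=True))
loss_softmax = -np.take_along_axis(log_probs, labels[None], axis=0).mean()

print(loss_sigmoid, loss_softmax)
```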

@rbgirshick (Contributor)

Using softmax for binary classification is an over-parameterization and shouldn't be necessary. When porting from py-faster-rcnn to Detectron, I tried both softmax and sigmoid for RPN and obtained similar RPN recall. I have not revisited using softmax for RPN with FPN.
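
To make the over-parameterization concrete: a two-way softmax over logits (z0, z1) depends only on the difference z1 - z0, so it computes exactly the sigmoid of that difference. A minimal NumPy check (illustrative only, not from the Detectron code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())   # shift for numerical stability
    return e / e.sum()

# P(foreground) from a two-way softmax equals sigmoid(z1 - z0):
# e^z1 / (e^z0 + e^z1) = 1 / (1 + e^(z0 - z1))
z0, z1 = 0.3, 1.7
print(np.isclose(softmax(np.array([z0, z1]))[1], sigmoid(z1 - z0)))  # True
```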

twmht closed this as completed May 6, 2018

fw509 commented May 25, 2018

@rbgirshick Is it possible to support Softmax in "rpn_heads.py"?

I tried but didn't succeed. The problem is that when the shape of rpn_cls_logits is (1, 30, H, W) instead of (1, 15, H, W), the SpatialNarrowAs op can't be applied to rpn_labels_int32_wide, since its depth no longer matches that of rpn_cls_logits.
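
As a sketch of one possible workaround (pure NumPy, assuming py-faster-rcnn's channel layout where all A background scores come before the A foreground scores; not tested against Detectron's blob conventions): give the (bg, fg) pair its own axis via a reshape so the label depth A lines up again, and compute the softmax loss directly, masking the ignore labels instead of narrowing them:

```python
import numpy as np

A, H, W = 15, 8, 8
logits = np.random.randn(1, 2 * A, H, W)              # softmax head: 2A channels
labels = np.random.randint(-1, 2, size=(1, A, H, W))  # -1 = ignore this anchor

# Put the (bg, fg) pair on its own axis so the depth matches the labels (A).
paired = logits.reshape(1, 2, A, H, W)                # axis 1: class (bg, fg)

# Log-softmax over the class axis.
z = paired - paired.max(axis=1, keepdims=True)
log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))

# Cross entropy over the valid anchors only.
valid = labels >= 0
fg = np.clip(labels, 0, 1)                            # map -1 to a dummy index
picked = np.take_along_axis(log_probs, fg[:, None], axis=1)[:, 0]
loss = -(picked * valid).sum() / max(valid.sum(), 1)
print(loss)
```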

@Jing-Luo

I ran into the same issue, and it's great to see it raised here. I guess I'll try both loss functions and check which one works better.
