XavierFill correctness #2531

Tezirg-Wrld3D · 2018-06-26T15:01:52Z

I was looking at the implementation of the XavierFill operator: https://github.com/caffe2/caffe2/blob/0dd3284525079f3870df92f61bed3b94eb45ff53/caffe2/operators/filler_op.h#L434

But if we look at formula 16 in the original paper: http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf
we can see that the values SHOULD be uniformly sampled in

[
-sqrt(6.0 / (output->size() + output->dim32(0))),
sqrt(6.0 / (output->size() + output->dim32(0)))
]

For my particular use case, I get a difference of approximately a factor of 10^2. Can someone clarify why this XavierFill implementation doesn't look correct?
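To make the comparison concrete, here is a minimal sketch of the two bounds in question. It rests on assumptions that are not confirmed by the linked source: that fan_out is the tensor's first dimension (`output->dim32(0)`), that fan_in is `output->size() / output->dim32(0)`, and that the implementation being questioned uses a fan-in-only bound of `sqrt(3 / fan_in)`. `WeightShape` and the function names are hypothetical helpers, not part of Caffe2.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the two tensor properties used in the post:
// size mirrors output->size(), dim0 mirrors output->dim32(0).
struct WeightShape {
  std::int64_t size;  // total number of elements
  std::int64_t dim0;  // first dimension
};

// Assumption: fan_out is the first dimension.
inline double FanOut(const WeightShape& s) {
  return static_cast<double>(s.dim0);
}

// Assumption: fan_in is the product of all remaining dimensions.
inline double FanIn(const WeightShape& s) {
  assert(s.size > 0 && s.dim0 > 0);
  return static_cast<double>(s.size) / static_cast<double>(s.dim0);
}

// Bound from formula 16 of the paper: sqrt(6 / (fan_in + fan_out)).
inline double GlorotBound(const WeightShape& s) {
  return std::sqrt(6.0 / (FanIn(s) + FanOut(s)));
}

// Fan-in-only bound sqrt(3 / fan_in), assumed here for comparison.
inline double FanInOnlyBound(const WeightShape& s) {
  return std::sqrt(3.0 / FanIn(s));
}
```

Under these assumptions the two bounds coincide whenever fan_in equals fan_out, since sqrt(6 / (2n)) = sqrt(3 / n). For a 64 x 64 weight (`WeightShape{4096, 64}`) both are sqrt(3 / 64), about 0.2165. They diverge only as fan_in and fan_out move apart.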
