Whether SE3 needs pre-training #9

Open

zyk19981118 opened this issue Apr 13, 2021 · 2 comments

Comments

@zyk19981118

Thank you for your work. I used your SE3 Transformer reproduction as part of my model, but the current test results are not very good. I suspect this may be because I do not fully understand your model. Here are my questions:

  1. Does your model need pre-training?
  2. Can I train the SE3 Transformer together with the fully connected layer that comes after it?

Any advice is also welcome.
@MattMcPartlon

  1. I've found that pre-training helps (100 batches, linear weight scale from 1e-6 up to 1e-4; a rough sketch of that warm-up follows after this list). I've also found that smaller depth (2 or 3) works better than larger depth (>3).
  2. I'm not sure what you mean here. The fully connected layer that acts on type-1 features (i.e. 3d-coordinates) in the attention block? Or the linear projection that projects the final output from the d×3 to 1×3 (i.e. the projection from the hidden dimension to the output dimension)?
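
For reference, a minimal sketch of how that warm-up suggestion could look, reading "linear weight scale" as a learning-rate ramp from 1e-6 to 1e-4 over the first 100 batches. This is plain PyTorch; `model` and the training loop are placeholders, not code from this repo:

```python
import torch

model = torch.nn.Linear(8, 8)  # placeholder for the SE3 Transformer + head
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # target lr

warmup_batches = 100
start_lr, target_lr = 1e-6, 1e-4

def warmup_factor(step):
    # Multiplier on the base lr (1e-4): ramps linearly from 1e-6/1e-4
    # at step 0 up to 1.0 at step 100, then holds at 1.0.
    if step >= warmup_batches:
        return 1.0
    frac = step / warmup_batches
    return (start_lr + frac * (target_lr - start_lr)) / target_lr

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, warmup_factor)

# Inside the training loop, after each batch:
#   loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```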

@MattMcPartlon

Either way, both of these are equivariant operations, so you can train with or without them. I recommend keeping them as-is.
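
To illustrate that point, here is a small self-contained check (my own sketch, not code from this repo) that a channel-mixing linear projection on type-1 features, taking d×3 in and 1×3 out, commutes with a rotation of the coordinates, which is why it is safe to keep it during training:

```python
import torch

d = 16
proj = torch.nn.Linear(d, 1, bias=False)  # hidden dim d -> output dim 1
# (bias must stay off: adding a constant to vector features would break equivariance)

feats = torch.randn(10, d, 3)  # 10 nodes, d type-1 (vector) channels

# Random orthogonal matrix standing in for a rotation of the coordinates.
R, _ = torch.linalg.qr(torch.randn(3, 3))

def project(x):
    # Mix only the channel dimension (d -> 1), leave the 3 coordinates alone.
    return proj(x.transpose(-1, -2)).transpose(-1, -2)

project_then_rotate = project(feats) @ R.T
rotate_then_project = project(feats @ R.T)

print(torch.allclose(project_then_rotate, rotate_then_project, atol=1e-6))  # True
```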
