Whether SE3 needs pre-training #9

Open

zyk19981118 opened this issue Apr 13, 2021 · 2 comments

Comments

@zyk19981118

Thank you for your work. I used your SE3 Transformer reproduction as part of my model, but the current test results are not very good. I suspect this may be because I do not fully understand your model. Here are my questions:

  1. Does your model need pre-training?
  2. Can I train the SE3 Transformer together with the fully connected layer that comes after it?

Any advice is also welcome.
@MattMcPartlon

  1. I've found that pre-training helps (100 batches, linear weight scale from 1e-6 up to 1e-4; a rough sketch of that warm-up follows after this list). I've also found that smaller depth (2 or 3) works better than larger depth (>3).
  2. I'm not sure what you mean here. The fully connected layer that acts on type-1 features (i.e. 3d-coordinates) in the attention block? Or the linear projection that projects the final output from the d×3 to 1×3 (i.e. the projection from the hidden dimension to the output dimension)?
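
For reference, a minimal sketch of how that warm-up suggestion could look, reading "linear weight scale" as a learning-rate ramp from 1e-6 to 1e-4 over the first 100 batches. This is plain PyTorch; `model` and the training loop are placeholders, not code from this repo:

```python
import torch

model = torch.nn.Linear(8, 8)  # placeholder for the SE3 Transformer + head
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # target lr

warmup_batches = 100
start_lr, target_lr = 1e-6, 1e-4

def warmup_factor(step):
    # Multiplier on the base lr (1e-4): ramps linearly from 1e-6/1e-4
    # at step 0 up to 1.0 at step 100, then holds at 1.0.
    if step >= warmup_batches:
        return 1.0
    frac = step / warmup_batches
    return (start_lr + frac * (target_lr - start_lr)) / target_lr

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, warmup_factor)

# Inside the training loop, after each batch:
#   loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```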

@MattMcPartlon

Either way, both of these are equivariant operations, so you can train with or without them. I recommend keeping them as-is.
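
To illustrate that point, here is a small self-contained check (my own sketch, not code from this repo) that a channel-mixing linear projection on type-1 features, taking d×3 in and 1×3 out, commutes with a rotation of the coordinates, which is why it is safe to keep it during training:

```python
import torch

d = 16
proj = torch.nn.Linear(d, 1, bias=False)  # hidden dim d -> output dim 1
# (bias must stay off: adding a constant to vector features would break equivariance)

feats = torch.randn(10, d, 3)  # 10 nodes, d type-1 (vector) channels

# Random orthogonal matrix standing in for a rotation of the coordinates.
R, _ = torch.linalg.qr(torch.randn(3, 3))

def project(x):
    # Mix only the channel dimension (d -> 1), leave the 3 coordinates alone.
    return proj(x.transpose(-1, -2)).transpose(-1, -2)

project_then_rotate = project(feats) @ R.T
rotate_then_project = project(feats @ R.T)

print(torch.allclose(project_then_rotate, rotate_then_project, atol=1e-6))  # True
```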
