Non-trainable parameters? #921

Open
hovinen opened this issue Mar 1, 2024 · 1 comment

hovinen commented Mar 1, 2024

I would like to set up a network in which all of the parameters of one of the linear layers are hard-coded and do not change during training. In other libraries such as PyTorch, one can do this by clearing the requires_grad flag on the parameters one wishes to hold fixed. I can't find any equivalent in the dfdx documentation, nor any mention of "non-trainable" or similar terms.

Does dfdx support this at all? If so, how does one set this up?

swfsql (Contributor) commented Mar 1, 2024

I'm not entirely sure, but I believe you can create a wrapper structure that defines how the forward_mut method behaves (assuming you want to implement a Module). In that method, for the linear layers you don't intend to train, call their forward method instead of forward_mut. I'm not sure how you'd need to handle the Tapes on the input data, though; it might be possible to keep them the same.
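
A minimal sketch of that wrapper idea. The Module and ModuleMut traits below are simplified stand-ins with no tape or error handling, and the Scale layer is purely illustrative; neither matches dfdx's real trait definitions, whose signatures vary between releases. The point is only that the wrapper's forward_mut delegates to the inner layer's immutable forward.

```rust
// Stand-in traits mirroring the forward / forward_mut split discussed above.
// These are NOT dfdx's actual traits; the real ones carry extra associated
// types (error types, tapes) that are omitted here.
trait Module<Input> {
    type Output;
    fn forward(&self, input: Input) -> Self::Output;
}

trait ModuleMut<Input> {
    type Output;
    fn forward_mut(&mut self, input: Input) -> Self::Output;
}

/// Wrapper that "freezes" an inner layer: even on the mutable training path
/// it routes through the immutable `forward`, so the wrapped layer's
/// parameters never participate in gradient updates.
struct Frozen<M>(M);

impl<Input, M: Module<Input>> Module<Input> for Frozen<M> {
    type Output = M::Output;
    fn forward(&self, input: Input) -> Self::Output {
        self.0.forward(input)
    }
}

impl<Input, M: Module<Input>> ModuleMut<Input> for Frozen<M> {
    type Output = M::Output;
    fn forward_mut(&mut self, input: Input) -> Self::Output {
        // Deliberately call the immutable forward on the wrapped layer.
        self.0.forward(input)
    }
}

// Toy layer to show the wrapper in use (hypothetical, not a dfdx layer).
struct Scale(f32);

impl Module<f32> for Scale {
    type Output = f32;
    fn forward(&self, input: f32) -> f32 {
        input * self.0
    }
}

fn main() {
    let mut frozen = Frozen(Scale(2.0));
    // Training code would call forward_mut, but the inner Scale stays fixed.
    assert_eq!(frozen.forward_mut(3.0), 6.0);
}
```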
