
Refactor mygrad.nnet.layers #399

Draft · wants to merge 8 commits into master
Conversation

davidmascharka (Collaborator) commented Feb 6, 2022

Opening this as a draft so I can get some comments. This changes the overall structure of the mygrad.nnet.layers module. The changes are:

  • Everything that was in the top-level mygrad.nnet.layers module has moved to mygrad.nnet.layers.operations. This reflects the fact that these were all operator implementations of the layer functionality, but don't do any bookkeeping of parameters.
  • The top-level mygrad.nnet.layers module now contains helper Layer classes that perform the bookkeeping of parameters. This mimics the structure of mynn.layers (rough sketch after this list).
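
Roughly, a layer class would look like this (a sketch only; the class name, signatures, and the batchnorm keyword arguments are illustrative, not the final API):

import numpy as np
import mygrad as mg
from mygrad.nnet.layers import batchnorm  # would live under .operations after the move

class BatchNorm:
    """Owns the learnable tensors; defers the math to the functional op."""

    def __init__(self, num_features, eps=1e-5):
        # parameter bookkeeping lives in the layer
        self.gamma = mg.tensor(np.ones(num_features, dtype="float32"))
        self.beta = mg.tensor(np.zeros(num_features, dtype="float32"))
        self.eps = eps

    def __call__(self, x):
        # the computation itself stays in the functional operation
        return batchnorm(x, gamma=self.gamma, beta=self.beta, eps=self.eps)

    @property
    def parameters(self):
        return (self.gamma, self.beta)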

I went ahead and added a BatchNorm layer as an initial example, and will port over Conv, Dense (or Linear or whatever we want to name it), Dropout, and PReLU when we're happy with the design. Main questions:

  • Does this seem reasonable? It basically mimics how PyTorch has torch.nn, where the classes live, and torch.nn.functional, where the operations live (import sketch after these questions).
  • What kinds of tests are reasonable here? Obviously some that cover instance creation and basic input validation. Anything else we should try to cover?
  • How do we want to structure the updates? I'm thinking we could either (1) do it in one swoop and add all the layers here or (2) first just move mygrad.nnet.layers -> mygrad.nnet.layers.operations, then one-by-one add in layers, each as its own release.
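
To make the parallel concrete, the intended import surface would be something like (names illustrative):

from mygrad.nnet.layers import BatchNorm              # class: parameter bookkeeping
from mygrad.nnet.layers.operations import batchnorm   # function: the stateless operation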

rsokl (Owner) commented Feb 19, 2022

First off, thanks for working on this! It will be really nice to get mynn merged into mygrad 😄

Does this seem reasonable? Basically mimics how PyTorch has torch.nn where the classes live and torch.nn.functional where the operations live.

I am in favor of mygrad.nnet.layers -> mygrad.nnet.layers.operations, but this shouldn't be a compatibility-breaking change. So max_pool, conv_nd, etc. should all still be made available in the mygrad.nnet.layers namespace. Otherwise this could be a badly-breaking change for a lot of notebooks out there. This would also prevent you from needing to make any changes in mynn - it would "just work" even though the library will effectively be obsolete.
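
A minimal sketch of that compatibility shim, assuming the ops land in mygrad.nnet.layers.operations (path and submodule name illustrative):

# mygrad/nnet/layers/__init__.py
# keep the old import paths working, e.g. `from mygrad.nnet.layers import conv_nd`
from mygrad.nnet.layers.operations import batchnorm, conv_nd, gru, max_pool  # noqa: F401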

After this change, we should check to see if the docs will need to be updated as well. E.g. both the filename and contents of mygrad.nnet.layers.batchnorm.rst are path-dependent. But if we maintain the namespaces, as mentioned above, I think things will be OK.

What kinds of tests are reasonable here? Obviously some that cover instance creation and basic input validation. Anything else we should try to cover?

We don't need anything fancy here; I think the following is sufficient:

  • If the layer takes initializers, test that zeros and ones initializers, respectively, propagate as expected; this will also serve to test that .parameters() works as expected (see the sketch after the snippet below)
  • For __call__, no need for any numerical testing; just verify that the computational graph contains the proper operations. E.g. a conv w/ bias should give:
conv_layer = ConvNDLayer(..., bias=True)
out = conv_layer(data)  # `data` can be trivial - no need for hypothesis

assert isinstance(out.creator, Add)
conv_out, bias = out.creator.variables
assert bias is conv_layer.bias
assert isinstance(conv_out.creator, ConvND) # the operation-class, not the layer

x, w = conv_out.creator.variables
assert w is conv_layer.w
assert x is data
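
And for the initializer bullet, something along these lines (ConvNDLayer and the initializer kwargs are placeholders for whatever the final layer API ends up being):

import numpy as np

conv_layer = ConvNDLayer(..., weight_initializer=np.zeros, bias_initializer=np.ones)  # placeholder kwargs

# the chosen initializers should propagate into the layer's tensors...
assert np.all(conv_layer.w.data == 0.0)
assert np.all(conv_layer.bias.data == 1.0)

# ...and .parameters() should hand back exactly those tensors
params = tuple(conv_layer.parameters())
assert any(p is conv_layer.w for p in params)
assert any(p is conv_layer.bias for p in params)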

How do we want to structure the updates? I'm thinking we could either (1) do it in one swoop and add all the layers here or (2) first just move mygrad.nnet.layers -> mygrad.nnet.layers.operations, then one-by-one add in layers each as their own release.

I am in favor of (2)

Perhaps we should make a super-simple base class with just a __call__ and .parameters, so we have the benefits of a hierarchy (rough sketch below). In a later PR, I might think about adding functionality for saving/loading, but it's not a big deal.
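
E.g. (purely a sketch):

class Layer:
    """Base class: subclasses implement __call__ and expose trainable tensors via .parameters"""

    def __call__(self, *args, **kwargs):
        raise NotImplementedError

    @property
    def parameters(self):
        # subclasses return a tuple of their mygrad Tensors
        raise NotImplementedError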

davidmascharka (Collaborator, Author)

👍 This all makes sense. Thanks for reading through! I'll make these changes over the next few(?) days and get the ball rolling on this merge.

rsokl (Owner) commented Feb 27, 2022

It occurred to me that, for testing __call__ of modules, we should also check that backprop propagates gradients to the module's parameters. No need to check for numerical correctness (checking that the computational graph is as-expected is sufficient for that); we simply want to make sure that the parameters aren't initialized as constants (which could happen via constant=True, integer-valued biases, or initialization as a numpy array).

So basically:

conv_layer = ConvNDLayer(..., bias=True)
out = conv_layer(data)  # `data` can be trivial - no need for hypothesis

assert isinstance(out.creator, Add)
conv_out, bias = out.creator.variables
assert bias is conv_layer.bias
assert isinstance(conv_out.creator, ConvND) # the operation-class, not the layer

x, w = conv_out.creator.variables
assert w is conv_layer.w
assert x is data

assert w.grad is None
assert bias.grad is None

out.backward()

assert w.grad is not None
assert bias.grad is not None
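
For reference, the failure mode this guards against: a parameter that ends up constant silently opts out of backprop. E.g.:

import mygrad as mg

w = mg.tensor([1.0, 2.0], constant=True)  # e.g. a bias accidentally created as a constant
x = mg.tensor([3.0, 4.0])                 # a normal, trainable tensor
out = (x * w).sum()
out.backward()
assert x.grad is not None
assert w.grad is None  # the constant never receives a gradient, so the grad checks above would fail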
