
For ANN (C++): Remove templates for OutputLayerType and InitializationRuleType #3542

Closed
akropp opened this issue Oct 12, 2023 · 7 comments
akropp (Contributor) commented Oct 12, 2023

What is the desired addition or change?

Remove the OutputLayerType and InitializationRuleType template parameters from the FFN and RNN classes. Use inheritance/base classes instead.

What is the motivation for this feature?

The fact that OutputLayerType and InitializationRuleType are template parameters of FFN/RNN is very limiting when writing code to support dynamic construction of networks (e.g., building a network from JSON/YAML/command-line input, or potentially a CLI-type interface). There is no reason why these couldn't instead extend a common base class, the way layers do, and be passed to the network at construction time.

If applicable, describe how this feature would be implemented.

Pretty self-evident. Create two base classes, OutputLayerType and InitializationRuleType, which all such implementations extend. FFN and RNN would contain a pointer to an OutputLayerType instance and an InitializationRuleType instance, passed to the constructor, presumably via std::unique_ptr. Alternatively, the constructor could take template arguments allowing in-place construction of these members. A rough sketch follows below.
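
To make this concrete, here is a minimal sketch of what such base classes and the resulting FFN constructor might look like. Everything here is illustrative: the names, signatures, and member functions are assumptions, not existing mlpack API.

```cpp
// Hypothetical sketch only -- all names and signatures here are assumptions,
// not existing mlpack API.
#include <cstddef>
#include <memory>
#include <utility>

// Abstract base for output layers, mirroring the existing Layer<MatType>
// base class.
template<typename MatType>
class OutputLayer
{
 public:
  virtual ~OutputLayer() = default;

  // Compute the loss for the given predictions and targets.
  virtual double Forward(const MatType& prediction, const MatType& target) = 0;

  // Compute the gradient of the loss with respect to the predictions.
  virtual void Backward(const MatType& prediction, const MatType& target,
                        MatType& gradient) = 0;
};

// Abstract base for initialization rules.
template<typename MatType>
class InitializationRule
{
 public:
  virtual ~InitializationRule() = default;

  // Fill the given weight matrix with initial values.
  virtual void Initialize(MatType& weights,
                          const std::size_t rows,
                          const std::size_t cols) = 0;
};

// FFN would then hold base-class pointers instead of template-typed members,
// so the concrete types can be chosen at runtime.
template<typename MatType>
class FFN
{
 public:
  FFN(std::unique_ptr<OutputLayer<MatType>> outputLayer,
      std::unique_ptr<InitializationRule<MatType>> initRule) :
      outputLayer(std::move(outputLayer)),
      initRule(std::move(initRule))
  { }

 private:
  std::unique_ptr<OutputLayer<MatType>> outputLayer;
  std::unique_ptr<InitializationRule<MatType>> initRule;
};
```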

Additional information?

I am happy to implement if people think this is a good idea. I realize it would be a breaking change though. I guess an alternate version would be to create some sort of meta-network container to hold an arbitrarily-templatized FFN or RNN.

conradsnicta (Contributor) commented:

> I am happy to implement if people think this is a good idea. I realize it would be a breaking change though. I guess an alternate version would be to create some sort of meta-network container to hold an arbitrarily-templatized FFN or RNN.

Due to the breaking changes, it may be better to create an alternative/secondary ANN framework within mlpack which would contain the proposed changes. @zoq may have further ideas / comments about this.

rcurtin (Member) commented Oct 18, 2023

I understand the problem you're trying to solve, and I spent quite a while formulating an opinion on a minimally intrusive way to address it.

It's not possible to entirely remove template parameters from the FFN class and thereby ease dynamic creation of networks completely, because of the MatType parameter. That one has to stay, because there are lots of possible MatTypes a user might want to use (and because each layer depends on MatType as a template argument).

But the first two, OutputLayerType and InitializationRuleType, I do think could be removed reasonably. Given that layers already use virtual inheritance, it makes sense to apply the same concept to the output layer. And since the initialization rule is called only rarely, the overhead of virtual function calls is not a concern there either.

My suggestion for a way forward might be this:

  • Create a base OutputLayer class (it will need to take MatType as a template parameter), much like the Layer base class. All of the output layers in loss_functions/ can inherit from it.
  • The same can be done for a base InitializationRule class (or some similar name).
  • For the next major version of mlpack, we can drop those two template parameters.
  • For mlpack 4, and for your application, you can use a "shim": if you just use the base OutputLayer and InitializationRule classes as the template parameters for FFN, then you can have only one FFN<...> type that needs to be handled in your application (a sketch follows below).
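
A minimal sketch of how that shim could look in application code, assuming the hypothetical base-class names used earlier in this thread:

```cpp
// Hypothetical shim -- assumes base classes OutputLayer<MatType> and
// InitializationRule<MatType> exist, as sketched above.  Fixing the first two
// template parameters to the base types yields one concrete network type, no
// matter which derived output layer or initialization rule is supplied.
#include <memory>
#include <vector>

using AnyFFN = mlpack::FFN<OutputLayer<arma::mat>,
                           InitializationRule<arma::mat>,
                           arma::mat>;

// Heterogeneously-configured networks can now share one pointer type.
std::vector<std::unique_ptr<AnyFFN>> models;
```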

But, I do want to point out that this does not solve the problem of translating the user-specified output layer type, given as a string in JSON or YAML or on the command-line or whatever, into the actual class that you want to use. In fact, that problem isn't any easier with the proposed refactoring here---it's the same amount of tedium whether templates or virtual inheritance is used. So, it's worth thinking about whether the effort involved here would actually solve your issue.
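
To illustrate, the string-to-type mapping has to be enumerated by hand in either design; with virtual inheritance it would look something like this (the base and derived class names are hypothetical):

```cpp
// Hand-written string-to-type dispatch -- the tedium is the same whether the
// result is a template argument or a base-class pointer.  All class names
// here are hypothetical.
#include <memory>
#include <stdexcept>
#include <string>

std::unique_ptr<OutputLayer<arma::mat>> MakeOutputLayer(const std::string& name)
{
  if (name == "negative_log_likelihood")
    return std::make_unique<NegativeLogLikelihoodOutput>();
  else if (name == "mean_squared_error")
    return std::make_unique<MeanSquaredErrorOutput>();

  throw std::invalid_argument("unknown output layer type: " + name);
}
```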

I agree that @zoq's thoughts would be helpful here too, since he designed the original structure of the code. 👍

(Also: if you're making a command-line program or similar that can accept a network definition from a configuration file or string... we've been wanting to do that for a while---see #1254. Your needs may be different from what was hoped for there, but if there's enough overlap and interest, maybe whatever you're working on could be contributed directly. Just a thought :))

mlpack-bot (bot) commented Nov 17, 2023

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

mlpack-bot added the s: stale label Nov 17, 2023
akropp (Author) commented Nov 17, 2023

I totally agree that MatType still needs to be a template parameter, so this doesn't totally alleviate that issue. However, even in more abstract terms, it seems like it would be nice for the output layer to be added just like any other layer -- there's nothing particularly special about it. The same goes for the InitializationRuleType. I might want to take an already-constructed network and just try different initialization types or different output types -- those don't really seem like totally new networks.

My issue isn't so much that you need to specify them at construction; it's that different template parameters create different TYPES. So I can't have a "network holder class" that abstracts away my network with a pointer to the FFN object, without going through some contortions: I need to create intermediary holding classes that hide/remove that type information. That's obviously doable, but it seems unnecessary in the first place, given that the layers are already implemented using virtual inheritance.
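
The kind of intermediary holder this requires today might look roughly like this (a sketch with simplified signatures, not actual application code):

```cpp
// Rough sketch of the type-erasing holder that is currently necessary,
// because FFN instantiations with different OutputLayerType /
// InitializationRuleType arguments are unrelated C++ types.  Signatures are
// simplified.
#include <armadillo>
#include <memory>
#include <utility>

class NetworkHolder
{
 public:
  virtual ~NetworkHolder() = default;
  virtual void Train(const arma::mat& data, const arma::mat& responses) = 0;
  virtual void Predict(const arma::mat& data, arma::mat& predictions) = 0;
};

template<typename NetworkType>
class NetworkHolderImpl : public NetworkHolder
{
 public:
  explicit NetworkHolderImpl(NetworkType network) :
      network(std::move(network)) { }

  void Train(const arma::mat& data, const arma::mat& responses) override
  { network.Train(data, responses); }

  void Predict(const arma::mat& data, arma::mat& predictions) override
  { network.Predict(data, predictions); }

 private:
  NetworkType network;
};
```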

I did see #1254 when I was writing this -- I had that in mind as a side-benefit.

mlpack-bot removed the s: stale label Nov 17, 2023
rcurtin (Member) commented Nov 22, 2023

I think we agree. I don't want to remove the template parameters until the next major release of mlpack, though, so as not to break backward compatibility. I think that a "shim" type like I described above will work for that, though... did I overlook something?

akropp (Author) commented Nov 22, 2023

No, I think that is fine.

mlpack-bot (bot) commented Dec 22, 2023

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

mlpack-bot added the s: stale label Dec 22, 2023
mlpack-bot closed this as completed Dec 29, 2023