Import Tensorflow models #84

Open
MarioGK opened this issue Jun 6, 2018 · 5 comments

MarioGK commented Jun 6, 2018

Is it possible to import TensorFlow models into this library? Is it an existing feature? If not, is it planned? And if not again, xD, is there a way I can do that manually?

Sergio0694 (Owner) commented:

Hello, thanks for the suggestion. Right now there's no built-in way to do that.
While I don't plan to support importing TensorFlow models directly, there are two possible solutions:

  • You can fork the library and change the visibility of the layers you need (eg. NeuralNetworkNET.Networks.Layers.Cpu.FullyConnectedLayers), so that you can use the constructor that also takes pre-loaded weights and bias vectors. You might need to do some minor refactoring to enable this (changing the visibility of a layer might require changing the visibility of some public types used there, but nothing too time-consuming).
    Once you do that, you'll be able to load your pre-trained weights however you like and pass them to the constructor. Just make sure to respect the internal ordering of the weights in the library (eg. a fully connected layer stores its weights matrix as a row-first, i.e. row-major, vector); see the sketch after this list.
  • I'll also update the library in the future with a few APIs that enable this directly, so that no manual refactoring will be needed 😊
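
For reference, here's a minimal sketch of what the first option could look like. This is not the library's actual API: the FullyConnectedLayer constructor call, the TensorInfo.Linear/ActivationType names and the CSV file names in the comments are assumptions that depend on your fork and on how you export the weights from TensorFlow; the only point being illustrated is the row-major ("row-first") flattening of the exported weights matrix.

```csharp
using System.Globalization;
using System.IO;
using System.Linq;

static class PretrainedWeightsLoader
{
    // Loads a matrix saved as CSV (one row per line) and flattens it row by row,
    // matching the row-first layout mentioned above.
    public static float[] LoadRowMajor(string csvPath, out int rows, out int columns)
    {
        float[][] matrix = File.ReadLines(csvPath)
            .Select(line => line.Split(',')
                .Select(value => float.Parse(value, CultureInfo.InvariantCulture))
                .ToArray())
            .ToArray();
        rows = matrix.Length;
        columns = matrix[0].Length;
        return matrix.SelectMany(row => row).ToArray(); // [rows x columns], flattened row by row
    }
}

// Hypothetical usage after exposing the layer constructor in a fork (the exact
// constructor parameters depend on the library version and on your changes):
// float[] weights = PretrainedWeightsLoader.LoadRowMajor("dense_kernel.csv", out int inputs, out int outputs);
// float[] biases  = File.ReadLines("dense_bias.csv")
//                       .Select(v => float.Parse(v, CultureInfo.InvariantCulture)).ToArray();
// var layer = new FullyConnectedLayer(TensorInfo.Linear(inputs), outputs, weights, biases, ActivationType.Sigmoid);
```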

MarioGK (Author) commented Jun 6, 2018

Thanks for the reply, yeah, when I was reading the code I thought of doing that. By the way, I really liked your structure and the code, but some of the names use the wrong UpperCamelCase: if I'm not wrong, UpperCamelCase doesn't apply to acronyms like API, CPU or GPU. Would you mind if I renamed those and made a PR?

And another thing: would you mind if I separated the project into CPU-only and GPU-only versions?
The GPU version requires more dependencies; even if you are going to use the CPU only, you still need to import Alea.*.dll.

Sorry for using issues this way, but there is nowhere else to talk; maybe a Slack channel, Discord, IRC or something similar would be nice as well.

Sergio0694 (Owner) commented:

The library generally uses the correct UpperCamelCase format. The only cases where I went with all-caps words are the APIs and cuDNN/cpuDNN namespaces, out of personal preference: "APIs" because, being all-caps, it stands out more as the main public namespace, and "cuDNN"/"cpuDNN" to reflect the original cuDNN name of the NVIDIA library. I'd honestly like to keep those three namespaces as they are now.

As for having two separate libraries, the project was originally structured that way, but I decided to merge them because it was way easier to maintain and let me share more code between the different modules with less trouble. I do plan to refactor the library in the future to implement a better GPU memory management system (right now there's a bit of a memory bottleneck between GPU allocations and memory copy operations), so I might consider splitting it into two different libraries then.

Also, thank you for your kind words! I'm happy to hear you like the code style/structure of the library, and I hope it's not disorienting and is easy enough to follow 😄

Oceania2018 commented:

@Sergio0694 Consider SciSharp's TensorFlow.NET.

Sergio0694 (Owner) commented:

Hi @Oceania2018 - I just found out about that repo and it does look awesome!

I should clarify why I decided to write my own neural network library though, instead of just using TensorFlow or another existing framework/library.
This whole project started while I was taking my machine learning course at university (I'm currently finishing a master's degree in Computer Engineering), as a way to further understand exactly how a neural network (or, more generally, a computational graph) works. The plan was never to reach feature parity with TensorFlow, but just to challenge myself to build a reasonably capable library for building and training computational graphs from scratch.
In fact, most of the work went into studying the calculus needed to implement all the various operations (forward and backward passes, gradient calculations, and all the various optimizers) from scratch. You'll see this repo doesn't use any 3rd-party libraries: every single operation is built entirely from scratch (except for some GPU kernels, which use the cuDNN APIs).
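
Just to give an idea of the kind of math involved (this is not the library's actual code, only a minimal standalone example), here's the forward pass, the gradient computation and a plain gradient descent update for a single fully connected layer with a sigmoid activation and a squared error loss, with the weights stored row-major:

```csharp
using System;

static class FullyConnectedSample
{
    static float Sigmoid(float x) => 1f / (1f + MathF.Exp(-x));

    static void Main()
    {
        const int inputs = 3, outputs = 2;
        var rng = new Random(42);
        float[] w = new float[inputs * outputs]; // row-major [inputs x outputs] weights
        float[] b = new float[outputs];          // biases
        for (int i = 0; i < w.Length; i++) w[i] = (float)(rng.NextDouble() - 0.5);

        float[] x = { 0.2f, -0.4f, 0.7f };       // sample input
        float[] t = { 1f, 0f };                  // target output
        const float lr = 0.1f;                   // learning rate

        // Forward pass: y = sigmoid(x * W + b)
        float[] z = new float[outputs], y = new float[outputs];
        for (int j = 0; j < outputs; j++)
        {
            z[j] = b[j];
            for (int i = 0; i < inputs; i++) z[j] += x[i] * w[i * outputs + j];
            y[j] = Sigmoid(z[j]);
        }

        // Backward pass for a squared error loss: delta[j] = (y[j] - t[j]) * sigmoid'(z[j])
        float[] delta = new float[outputs];
        for (int j = 0; j < outputs; j++) delta[j] = (y[j] - t[j]) * y[j] * (1f - y[j]);

        // Plain gradient descent update: dL/dW[i,j] = x[i] * delta[j], dL/db[j] = delta[j]
        for (int j = 0; j < outputs; j++)
        {
            b[j] -= lr * delta[j];
            for (int i = 0; i < inputs; i++) w[i * outputs + j] -= lr * x[i] * delta[j];
        }

        Console.WriteLine($"Output before the update: [{string.Join(", ", y)}]");
    }
}
```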

The reason I'm still working on this library (I'm currently in the middle of a rewrite that will also separate the CPU and GPU versions, as @MarioGK asked) is that I plan to add new APIs for reinforcement learning, which will act as another project for me for yet another university exam.

This is all to say: I made this repo public because I thought others might be interested in it, and because it might be useful for other fellow students/engineers/programmers to see how a neural network is actually built from scratch (libraries like TF are often pretty difficult to dig into if you want to see how a given operation is actually implemented and executed), not because I actually wanted to compete with the top players in this field, like TensorFlow itself or your .NET port/bindings. 😄

I mean, whenever I need to do some work with more complex networks, I'm actually the first to jump back to Python (sigh) and use TensorFlow directly. That said, I'll definitely give TensorFlow.NET a try in the future, it does look pretty awesome! You guys did great work on that! 😊
