Dnn2 #2663
Replies: 9 comments 14 replies
-
Notice, layers will be defined left to right, not right to left! This seems more sensible; a sketch of what that could look like follows.
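A minimal sketch of left-to-right definition, where the `dnn2::` namespace and the variadic `network(...)` builder are assumptions made for illustration, not code from the actual branch:

```cpp
// Hypothetical dnn2-style definition (illustrative names only): layers are
// listed left to right, in the order data flows through the network.
auto net = dnn2::network(
    dnn2::input_rgb_image(),
    dnn2::con(32, 5, 5),
    dnn2::relu(),
    dnn2::max_pool(2, 2, 2, 2),
    dnn2::fc(10)
);
```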
-
A repeated layer will be used like so:
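For example, assuming a runtime `dnn2::repeat(count, block)` helper analogous to dlib's existing `repeat<N, BLOCK, SUBNET>` template (the names are placeholders, not the actual API):

```cpp
// Hypothetical sketch: repeat a block a runtime-chosen number of times.
// dnn2::repeat and dnn2::network are placeholder names.
auto block = [] {
    return dnn2::network(dnn2::con(64, 3, 3), dnn2::relu());
};

auto net = dnn2::network(
    dnn2::input_rgb_image(),
    dnn2::repeat(8, block),   // stack the block 8 times
    dnn2::fc(10)
);
```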
-
For reusable blocks, I expect to be able to do this:
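A hedged sketch of the idea, where a reusable block is just a lambda returning a sub-network, with all `dnn2::*` names assumed for illustration:

```cpp
// Hypothetical sketch: a reusable block is an ordinary lambda that returns a
// sub-network, so it can be parameterized and dropped into larger networks.
auto conv_block = [](long filters) {
    return dnn2::network(dnn2::con(filters, 3, 3), dnn2::bn_con(), dnn2::relu());
};

auto net = dnn2::network(
    dnn2::input_rgb_image(),
    conv_block(32),
    conv_block(64),
    dnn2::fc(10)
);
```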
Something like that. It's definitely doable.
-
The main reason for doing this is that compilers are insanely good at compiling lambdas; it's ridiculous. So using them would be a win. The lambdas would then collapse into something that wraps the runtime layers, and if we're clever we could get it to reduce down to almost nothing.
-
Example usage of tags for resnet blocks:
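A hedged sketch of what that could look like, modeled on dlib's existing `tag1`/`add_prev1` layers but using assumed runtime `dnn2::tag`/`dnn2::add_prev` names:

```cpp
// Hypothetical sketch of a residual block built with tags: remember the
// block's input, run it through two convolutions, then add it back in.
auto res_block = [](long filters) {
    return dnn2::network(
        dnn2::tag(1),                // tag the block's input
        dnn2::con(filters, 3, 3),
        dnn2::bn_con(),
        dnn2::relu(),
        dnn2::con(filters, 3, 3),
        dnn2::bn_con(),
        dnn2::add_prev(1),           // skip connection: add the tagged input
        dnn2::relu()
    );
};
```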
Something like that.
-
Automatic serialization is trivially done, so no worries there.
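Presumably along the lines of dlib's existing stream-based `serialize`/`deserialize` helpers (whether dnn2 keeps exactly this interface is an assumption):

```cpp
// Saving and loading a network with dlib's existing serialization helpers.
dlib::serialize("net.dat") << net;
dlib::deserialize("net.dat") >> net;
```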
-
Hi @pfeatherstone, I just saw this episode of C++ Weekly about type erasure. I was wondering whether having some kind of type-erased layer along those lines could be useful here. Maybe it's already what you're doing; I didn't have time to go through your PR.
-
Been a while since I gave an update. I stopped working on this weeks ago. I've really got into transformers lately (they are awesome!). I don't think dlib can support transformers without substantial changes to the core. So I lost interest in this, since I don't think I'll ever use dlib for neural nets again unless it's a super simple classifier.
-
Hi, are there any updates, or is there any pilot code we could use to try to finish the dnn2 rework?
-
So still working on dnn2. Here is my proposal to really bring down compile time.

- `relu_`, `con_` etc. will not be templated. All parameters will be defined at runtime, so constructors will take all the parameters.
- Instead of `add_layer` in the normal dnn module, I will have a `buffered_layer` which will have all the common state. Rather than being templated on the layer details (like `add_layer` is), it will erase the layer using type erasure. I will have a small buffer optimization so most layers are stored on the stack; if they overrun the SBO, they will be stored on the heap. This means that `buffered_layer` is not a template anymore.
- Once evaluated, it will just return something that wraps `std::vector<buffered_layer>`, and you will construct it like so (see the sketch below).
- There will be no need for `affine`; layers will simply define `.eval()` and `.train()` like in torch. This will only affect layers like `bn_` and `dropout_`.

I think this will substantially reduce compile times without having to do too much of a re-write (I don't need a torch-like tensor class with autograd or anything like that; most of the layer details are copy-pasted).
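A minimal sketch of the `buffered_layer` idea described above: classic type erasure (an abstract concept plus a templated model) combined with a small buffer optimization, so small layer objects live in-place and only large ones hit the heap. The buffer size, the `forward()` interface, and the `tensor` stand-in are assumptions; a real version would also need move support so it can live in a `std::vector<buffered_layer>`.

```cpp
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

struct tensor;  // stand-in for dlib::tensor

// Type-erased layer holder with a small buffer optimization (SBO):
// small layers are constructed in-place, large ones fall back to the heap.
class buffered_layer
{
public:
    template <typename Layer,
              typename = std::enable_if_t<!std::is_same_v<std::decay_t<Layer>, buffered_layer>>>
    explicit buffered_layer(Layer layer)
    {
        using L = std::decay_t<Layer>;
        if constexpr (sizeof(L) <= buffer_size && alignof(L) <= alignof(std::max_align_t))
        {
            ptr     = new (&buffer) model<L>(std::move(layer));  // lives in the internal buffer
            on_heap = false;
        }
        else
        {
            ptr     = new model<L>(std::move(layer));            // overruns the SBO
            on_heap = true;
        }
    }

    buffered_layer(const buffered_layer&) = delete;              // kept minimal for the sketch
    buffered_layer& operator=(const buffered_layer&) = delete;

    ~buffered_layer()
    {
        if (on_heap) delete ptr;
        else         ptr->~concept_t();  // destroy the object living in the buffer
    }

    void forward(const tensor& in, tensor& out) { ptr->forward(in, out); }

private:
    // Classic type-erasure pair: abstract interface + templated model.
    struct concept_t
    {
        virtual ~concept_t() = default;
        virtual void forward(const tensor& in, tensor& out) = 0;
    };

    template <typename Layer>
    struct model final : concept_t
    {
        explicit model(Layer l) : layer(std::move(l)) {}
        void forward(const tensor& in, tensor& out) override { layer.forward(in, out); }
        Layer layer;
    };

    static constexpr std::size_t buffer_size = 64;
    alignas(std::max_align_t) unsigned char buffer[buffer_size];
    concept_t* ptr;
    bool on_heap;
};
```

Since `buffered_layer` itself is not a template, code that builds networks only instantiates the small `model<Layer>` wrappers, which is where the compile-time saving would come from.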
The API will look very similar. Indeed, defining a network in the dnn2 style is not too different from the existing template-based definition; a rough comparison follows.
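A rough side-by-side, with the existing dlib template form on top and an assumed dnn2-style runtime construction below (the `dnn2::*` names are placeholders, not the actual API):

```cpp
#include <dlib/dnn.h>

// Existing dlib API: the whole network is one nested template type,
// read right to left.
using old_net = dlib::loss_multiclass_log<
                dlib::fc<10,
                dlib::relu<
                dlib::con<32, 5, 5, 1, 1,
                dlib::input_rgb_image>>>>;

// Hypothetical dnn2 equivalent: ordinary runtime objects, listed left to
// right, which end up stored in a std::vector<buffered_layer>.
auto new_net = dnn2::network(
    dnn2::input_rgb_image(),
    dnn2::con(32, 5, 5),
    dnn2::relu(),
    dnn2::fc(10),
    dnn2::loss_multiclass_log()
);
```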