[onert] Apply BackPropAccumulator layer #12976
Conversation
After this change, training of the branching graph becomes possible.

```
$ ./Product/x86_64-linux.release/out/bin/onert_train mnist_branched.circle --load_expected:raw out/train.output.1000.bin --load_input:raw out/train.input.1000.bin --loss 1 --loss_reduction_type 1 --optimizer 1 --learning_rate 0.001 --batch_size 1
Model Expected Filename out/train.output.1000.bin
Model Input Filename out/train.input.1000.bin
Model Filename mnist_branched.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 1
- loss_info = {loss = mean squared error, reduction = sum over batch size}
- optimizer = sgd
========================
Epoch 1/5 - time: 0.196ms/step - loss: [0] 0.0340
Epoch 2/5 - time: 0.175ms/step - loss: [0] 0.0316
Epoch 3/5 - time: 0.186ms/step - loss: [0] 0.0304
Epoch 4/5 - time: 0.197ms/step - loss: [0] 0.0296
Epoch 5/5 - time: 0.180ms/step - loss: [0] 0.0289
===================================
MODEL_LOAD takes 0.3950 ms
PREPARE takes 2.3390 ms
EXECUTE takes 950.4810 ms
- Epoch 1 takes 195.6460 ms
- Epoch 2 takes 175.2290 ms
- Epoch 3 takes 185.9180 ms
- Epoch 4 takes 196.5610 ms
- Epoch 5 takes 179.5790 ms
===================================
```
This commit applies BackPropAccumulator layer to train backend.
- Add registering and planning disposable tensors for back-propagation
- Make layers use disposable tensors instead of original back-prop tensors
- Apply BackPropAccumulator layer to each BackPropTensor

ONE-DCO-1.0-Signed-off-by: ragmani <ragmani0216@gmail.com>
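For context, a minimal hedged sketch of what the accumulation step does conceptually; the function and parameter names below (accumulateBackProp, partial_grads) are illustrative assumptions, not onert's actual API:

```cpp
#include <cstddef>
#include <vector>

// When a forward tensor fans out to several consumers (a branching graph),
// each consumer produces its own gradient during backwarding. An accumulator
// sums those partial gradients into the single back-prop tensor of the operand.
void accumulateBackProp(const std::vector<const float *> &partial_grads, // one per consumer
                        float *back_prop, std::size_t size)
{
  for (std::size_t i = 0; i < size; ++i)
  {
    float sum = 0.0f;
    for (const float *grad : partial_grads)
      sum += grad[i]; // the gradient at a fan-out point is the sum of incoming gradients
    back_prop[i] = sum;
  }
}
```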
Force-pushed from e6b2c32 to e54b48d
LGTM
```diff
-    ir::OperandInfo backend_info{obj.shape(), obj.typeInfo(), obj.info().memAllocType(),
-                                 obj.isConstant()};
-    tensor_builder->registerBackwardTensorInfo(ind, backend_info, ir::Layout::NHWC);
+    tensor_builder->registerBackwardTensorInfo(ind, createBackwardTensorInfo(obj),
```
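Judging only from the removed lines above, the new helper presumably factors out the OperandInfo construction. A hedged guess at its body, assuming onert's ir::Operand and ir::OperandInfo types (the real signature may differ):

```cpp
// Hedged reconstruction based on the removed lines of the diff, not the
// actual onert source.
ir::OperandInfo createBackwardTensorInfo(const ir::Operand &obj)
{
  return ir::OperandInfo{obj.shape(), obj.typeInfo(), obj.info().memAllocType(),
                         obj.isConstant()};
}
```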
Could you give me an example of BackwardTensor? In my understanding:
- DisposableBackwardTensor: a tensor for backpropagation; once propagation is done, it can be freed.
- BackwardTensor: used for backpropagation but cannot be freed..? I couldn't think of a concrete example of this tensor.
This code just registers the information of tensors that are only used in backwarding; it does not directly affect the memory planning of any tensors. Memory planning of DisposableBackPropTensors will be done in planDisposableBackPropTensors() below. The other tensors for backwarding have no memory planning yet because they lack def/use information.
If you are curious about how memory planning is possible with the function below, I will be happy to provide additional explanation online or offline.
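To illustrate why def/use information is what makes planning possible, here is a standalone toy sketch; the Interval struct and the step numbers are invented for illustration and are not onert's planner API:

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
  // Each disposable back-prop tensor has a def step (the backward op that
  // writes the partial gradient) and a last-use step (e.g. the accumulator
  // that reads it). Knowing both lets a planner claim and release buffers,
  // bounding the peak footprint instead of keeping every tensor alive.
  struct Interval { int def; int last_use; uint32_t bytes; };
  std::vector<Interval> tensors = {{0, 1, 64}, {1, 2, 128}, {2, 3, 64}};

  uint32_t live = 0, peak = 0;
  for (int step = 0; step <= 3; ++step)
  {
    for (const auto &t : tensors)
      if (t.def == step) live += t.bytes;      // claim at definition
    peak = std::max(peak, live);
    for (const auto &t : tensors)
      if (t.last_use == step) live -= t.bytes; // release after last use
  }
  std::cout << "peak: " << peak << " bytes (vs " << 64 + 128 + 64
            << " bytes without reuse)\n"; // prints 192 vs 256
}
```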
LGTM 👍
```diff
@@ -103,11 +127,52 @@ backend::train::ITensorRegistry *BackendContext::genTrainingTensors()
     tensor_builder->notifyBackwardFirstUse(ind);
   });

+  for (const auto &op_index : tgraph.btopolSortOperations())
```
I have another little question.
Is it possible to replace tgraph.btopolSortOperations() with an operation vector (backward_order) to which truncateBackwardOrder has been applied?
ONE/runtime/onert/core/src/compiler/ExecutorFactory.cc
Lines 750 to 753 in 63e81d0
```cpp
// linearize for backwarding
auto backward_order = lowered_graph->trainable_graph().btopolSortOperations();
// get rid of all nodes not reachable from a node with trainable parameters
backward_order = lowered_graph->trainable_graph().truncateBackwardOrder(backward_order);
```
I thought that if backwarding is not going to run for a node, there is no need to allocate the tensors related to it.
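As a hedged illustration of the idea (the graph representation and names below are invented for the sketch, not onert's API): an op's backward pass matters only if a trainable parameter lies somewhere in its forward input subgraph, since the gradient has to pass through the op to reach that parameter.

```cpp
#include <vector>

// Sketch only: backward_order lists ops consumers-first; inputs_of[i] lists
// op i's forward producers; has_trainable_param marks ops that own weights.
std::vector<int> truncateBackwardOrderSketch(const std::vector<int> &backward_order,
                                             const std::vector<std::vector<int>> &inputs_of,
                                             const std::vector<bool> &has_trainable_param)
{
  std::vector<bool> needed(inputs_of.size(), false);
  // Walk in forward topological order (reverse of the backward order) so each
  // op's producers are resolved before the op itself.
  for (auto it = backward_order.rbegin(); it != backward_order.rend(); ++it)
  {
    needed[*it] = has_trainable_param[*it];
    for (int producer : inputs_of[*it])
      if (needed[producer]) needed[*it] = true;
  }
  // Keep only ops whose gradients can still reach a trainable parameter.
  std::vector<int> truncated;
  for (int op : backward_order)
    if (needed[op]) truncated.push_back(op);
  return truncated;
}
```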
BTW, this is a bit out of context, so please proceed without considering this question.
Thanks for your information. I missed truncateBackwardOrder.
@Aeren1564 Are you going to apply truncateBackwardOrder everywhere btopolSortOperations() is being used? If so, could you apply it here as well?