[DRAFT][onert-micro] PoC_V1: Training for onert-micro #12892

Open: wants to merge 21 commits into base: master
Conversation

BalyshevArtem (Contributor):

This is a draft of the first version of training for onert-micro, similar to the proposal in #12873 (comment).
Note: in this draft I added the gradient-related operations into the main circle schema, not into circle+.
This draft also adds the necessary infrastructure to create a back-propagation graph: the circle-weight-divider and circle_training_configure tools (note: in the #12873 (comment) proposal these two tools are merged into one TrainingConfigureTool).

@BalyshevArtem added the PR/NO TEST, PR/NO MERGE, and DRAFT labels on Apr 22, 2024.
    reinterpret_cast<float *>(cur_train_target_data),
    target_size);
#endif
train_interpreter.backward();
Contributor:

@BalyshevArtem Does this approach (backward for each sample in a batch) produce the same result as the backward mean/sum-over-batch approach?

BalyshevArtem (Contributor Author):

I'm not sure I understood the question correctly. In this approach, for each data example we calculate the gradient and accumulate (sum) it over all examples in the current batch. Then, using the updateWeights method, we update the weights from the accumulated gradients according to the chosen optimization technique (SGD, Adam). Thus the gradients calculated over the entire batch_size sample are taken into account.
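For illustration, a minimal self-contained sketch of the accumulate-then-update scheme described above (hypothetical names, not the actual onert-micro API). Since the SGD step is linear in the gradient, summing per-sample gradients and then doing one update is equivalent to one backward pass over the summed batch loss:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical standalone sketch, not the onert-micro API:
// accumulate per-sample gradients, then apply one SGD step per batch.
void trainOneBatch(std::vector<float> &weights,
                   const std::vector<std::vector<float>> &per_sample_grads,
                   float learning_rate)
{
  std::vector<float> grad(weights.size(), 0.0f);

  // "backward" per sample: sum the gradients over the whole batch
  for (const auto &g : per_sample_grads)
    for (std::size_t j = 0; j < grad.size(); ++j)
      grad[j] += g[j];

  // "updateWeights": one SGD step with the accumulated gradient
  // (divide by the batch size here if a mean loss is wanted
  // instead of a sum)
  for (std::size_t j = 0; j < weights.size(); ++j)
    weights[j] -= learning_rate * grad[j];
}
```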

Contributor:

> we calculate the gradient and accumulate (sum) it over all examples in the current batch.

Oh, I will look into it more. Could you please point to where we sum up the loss?

BalyshevArtem (Contributor Author):

if (_training_storage.getOptimizationStrategy() == SGD)
{
  for (uint32_t j = 0; j < output_size; ++j)
  {
    grad_data_f[j] += calculated_data_f[j];
  }
} else

- here the calculated gradients are summed over the batch (SGD).

void OMTrainingRuntimeModule::updateSGDWeights(uint8_t *dest, uint8_t *src, size_t size)
{
  assert(dest != nullptr); // Check caller

- here the weights are updated.
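For readers following along, a minimal sketch of what such an SGD update could look like (an assumption for illustration; the actual onert-micro implementation may scale or reset the buffers differently):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: apply one SGD step, reading the accumulated
// gradients from src and updating the weights in dest in place.
void updateSGDWeightsSketch(uint8_t *dest, uint8_t *src, size_t size,
                            float learning_rate, uint32_t batch_size)
{
  assert(dest != nullptr); // Check caller
  assert(src != nullptr);  // Check caller

  float *weights = reinterpret_cast<float *>(dest);
  float *grads = reinterpret_cast<float *>(src);
  const size_t count = size / sizeof(float);

  for (size_t i = 0; i < count; ++i)
  {
    // Average the summed gradients over the batch, then step.
    weights[i] -= learning_rate * (grads[i] / batch_size);
    grads[i] = 0.0f; // reset the accumulator for the next batch
  }
}
```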

printf("MSE_ERROR TRAIN = %f\n", mse_result);
for (uint32_t e = 0; e < training_epochs; ++e)
{
  train_interpreter.set_training_mode(true);
Contributor:

@BalyshevArtem

Does this code support transfer learning?

BalyshevArtem (Contributor Author):

Do you mean by transfer learning that only certain last layers are trained? If so, then in this draft (proposal_1 from #12873 (comment)) the TrainingConfigure tool is responsible for how many and which specific layers are trained; from TrainingDriver we cannot change or choose which layers will be trained in this variant.
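To illustrate this division of responsibility (hypothetical structures, not the actual tool output): the offline configure step can be thought of as baking a per-operator trainable mask into the model artifact, which the runtime's backward pass then honors without any on-device choice of layers:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-operator metadata emitted offline by the
// configure tooling and stored alongside the backprop graph.
struct OpTrainInfo
{
  uint32_t op_index;
  bool trainable; // false => weights stay frozen (transfer learning)
};

// Hypothetical runtime use: walk the graph backwards, always
// propagating activation gradients, but only accumulating weight
// gradients for operators the offline tool marked as trainable.
void backwardSketch(const std::vector<OpTrainInfo> &plan)
{
  for (auto it = plan.rbegin(); it != plan.rend(); ++it)
  {
    // propagateInputGradients(it->op_index); // needed by upstream ops
    if (it->trainable)
    {
      // accumulateWeightGradients(it->op_index);
    }
  }
}
```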

Contributor:

I understand now that the backprop graph encodes which layers will be trained, and that TrainingDriver cannot change the training-layer information other than turning training on and off. I'm trying to add a trainable property to the circle+ file format, so I was wondering how you handle the transfer learning feature. Thank you for your explanation.
