[DRAFT] Support add training #12417
base: master
Conversation
force-pushed from 12ccdba to 14612d8
force-pushed from 978e12a to 3dc789c
force-pushed from a14d419 to fd56419
force-pushed from a8a00ea to 1a16092
```cpp
case ArithmeticType::kSub:
case ArithmeticType::kMul:
case ArithmeticType::kDiv:
```
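For context, these cases dispatch the backward pass for the binary arithmetic ops. The standard elementwise gradient rules they correspond to, as a NumPy sketch (illustrative names, not this PR's actual code):

```python
import numpy as np

def arithmetic_backward(op, g, lhs, rhs):
    """Gradients of z = lhs <op> rhs w.r.t. both inputs, given the
    incoming gradient g = dL/dz (all arrays of the same shape)."""
    if op == "sub":   # z = lhs - rhs
        return g, -g
    if op == "mul":   # z = lhs * rhs
        return g * rhs, g * lhs
    if op == "div":   # z = lhs / rhs
        return g / rhs, -g * lhs / (rhs * rhs)
    raise ValueError(f"unsupported op: {op}")
```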
Are other operations always required to broadcast?
I was planning to work with broadcast data (which, to my understanding, is just copying) for the other ops.
Is there a better alternative?
Well, if there were a way to calculate both the broadcast and the gradient at the same time, that would be great. But it's not related to this PR, so let's think about it later. :)
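For reference, the usual pattern is exactly what's described here: materialize the broadcast in the forward pass (the "simple copying"), then sum the incoming gradient over the broadcast axes in the backward pass so it shrinks back to each input's original shape. A minimal NumPy sketch of that reduction (illustrative, not this PR's implementation):

```python
import numpy as np

def reduce_grad_to_shape(grad, shape):
    """Sum an incoming gradient over the axes that were broadcast,
    so it matches the original (pre-broadcast) input shape."""
    # Sum over leading axes added by broadcasting.
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Sum over axes where the input had size 1 but the output didn't.
    for axis, size in enumerate(shape):
        if size == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

# Example: lhs (3, 4) + rhs (4,) -> out (3, 4)
g = np.ones((3, 4))
print(reduce_grad_to_shape(g, (4,)))  # -> [3. 3. 3. 3.]
```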
force-pushed from 8ecfd6b to ca069c0
Model with Subtract

```python
import numpy as np
import tensorflow as tf

input_lhs = tf.keras.layers.Input(shape=(10))
input_rhs = tf.keras.layers.Input(shape=(10))
lhs = tf.keras.layers.Dense(10)(input_lhs)
rhs = tf.keras.layers.Dense(10)(input_rhs)
res_sub = tf.keras.layers.Subtract()([lhs, rhs])
output = tf.keras.layers.Dense(10)(res_sub)
model = tf.keras.models.Model(inputs=[input_lhs, input_rhs], outputs=output, name="subtract_training")
```

Data

```python
np.random.seed(123)
data_lhs = np.random.rand(3000, 10).astype(np.float32) * 100
data_rhs = np.random.rand(3000, 10).astype(np.float32) * 100
coef_lhs, coef_rhs = np.random.rand(10, 10).astype(np.float32), np.random.rand(10, 10).astype(np.float32)
data_res = np.array([(np.matmul(coef_lhs, x[0]) + np.matmul(coef_rhs, x[1])) for x in zip(data_lhs, data_rhs)], dtype=np.float32)
```
TensorFlow

```
Epoch 1/5
150/150 [==============================] - 0s 578us/step - loss: 647.6946 - mae: 20.2884
Epoch 2/5
150/150 [==============================] - 0s 585us/step - loss: 484.7763 - mae: 17.5722
Epoch 3/5
150/150 [==============================] - 0s 581us/step - loss: 382.2013 - mae: 15.5862
Epoch 4/5
150/150 [==============================] - 0s 592us/step - loss: 305.6406 - mae: 13.9046
Epoch 5/5
150/150 [==============================] - 0s 629us/step - loss: 241.3343 - mae: 12.2882
```

ONERT-Train

```
/home/aeren/Repos/ONE/Product/x86_64-linux.debug/out/bin/onert_train --modelfile /home/aeren/Repos/Scripts/_Product/circle+/result/model20240126_1244/model.circle --load_input:raw /home/aeren/Repos/Scripts/_Product/circle+/data/input.bin --load_expected:raw /home/aeren/Repos/Scripts/_Product/circle+/data/res.bin --epoch 5 --batch_size 20 --learning_rate 0.001 --loss 1 --loss_reduction_type 1 --optimizer 2
```
```
Model Expected Filename /home/aeren/Repos/Scripts/_Product/circle+/data/res.bin
Model Input Filename /home/aeren/Repos/Scripts/_Product/circle+/data/input.bin
Model Filename /home/aeren/Repos/Scripts/_Product/circle+/result/model20240126_1244/model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 20
- loss_info = {loss = mean squared error, reduction = sum over batch size}
- optimizer = adam
========================
Epoch 1/5 - time: 0.327ms/step - loss: [0] 647.6941
Epoch 2/5 - time: 0.318ms/step - loss: [0] 484.7753
Epoch 3/5 - time: 0.302ms/step - loss: [0] 382.2000
Epoch 4/5 - time: 0.317ms/step - loss: [0] 305.6393
Epoch 5/5 - time: 0.303ms/step - loss: [0] 241.3329
```
Model with Multiply

```python
input_lhs = tf.keras.layers.Input(shape=(10))
input_rhs = tf.keras.layers.Input(shape=(10))
lhs = tf.keras.layers.Dense(10)(input_lhs)
rhs = tf.keras.layers.Dense(10)(input_rhs)
res_mul = tf.keras.layers.Multiply()([lhs, rhs])
output = tf.keras.layers.Dense(10)(res_mul)
model = tf.keras.models.Model(inputs=[input_lhs, input_rhs], outputs=output, name="multiply_training")
```

Data

```python
np.random.seed(123)
data_lhs = np.random.rand(3000, 10).astype(np.float32) * 100
data_rhs = np.random.rand(3000, 10).astype(np.float32) * 100
coef_lhs, coef_rhs = np.random.rand(10, 10).astype(np.float32), np.random.rand(10, 10).astype(np.float32)
data_res = np.array([(np.matmul(coef_lhs, x[0]) + np.matmul(coef_rhs, x[1])) for x in zip(data_lhs, data_rhs)], dtype=np.float32)
```
TensorFlow

```
Epoch 1/5
150/150 [==============================] - 0s 688us/step - loss: 6594.2104 - mae: 64.8041
Epoch 2/5
150/150 [==============================] - 0s 676us/step - loss: 5605.8013 - mae: 59.9134
Epoch 3/5
150/150 [==============================] - 0s 573us/step - loss: 5306.5811 - mae: 58.3457
Epoch 4/5
150/150 [==============================] - 0s 548us/step - loss: 5146.8296 - mae: 57.5273
Epoch 5/5
150/150 [==============================] - 0s 547us/step - loss: 5031.8623 - mae: 56.9231
```

ONERT-Train

```
/home/aeren/Repos/ONE/Product/x86_64-linux.debug/out/bin/onert_train --modelfile /home/aeren/Repos/Scripts/_Product/circle+/result/model20240126_1329/model.circle --load_input:raw /home/aeren/Repos/Scripts/_Product/circle+/data/input.bin --load_expected:raw /home/aeren/Repos/Scripts/_Product/circle+/data/res.bin --epoch 5 --batch_size 20 --learning_rate 0.001 --loss 1 --loss_reduction_type 1 --optimizer 2
```
```
Model Expected Filename /home/aeren/Repos/Scripts/_Product/circle+/data/res.bin
Model Input Filename /home/aeren/Repos/Scripts/_Product/circle+/data/input.bin
Model Filename /home/aeren/Repos/Scripts/_Product/circle+/result/model20240126_1329/model.circle
== training parameter ==
- learning_rate = 0.001
- batch_size = 20
- loss_info = {loss = mean squared error, reduction = sum over batch size}
- optimizer = adam
========================
Epoch 1/5 - time: 0.300ms/step - loss: [0] 6594.2061
Epoch 2/5 - time: 0.288ms/step - loss: [0] 5605.7979
Epoch 3/5 - time: 0.289ms/step - loss: [0] 5306.5791
Epoch 4/5 - time: 0.291ms/step - loss: [0] 5146.8281
Epoch 5/5 - time: 0.291ms/step - loss: [0] 5031.8633
```
force-pushed from 3225baf to 4606d2c
```cpp
lhs_grad_map = in_map.array() / rhs_map.array();
rhs_grad_map = in_map.array() * -lhs_map.array() / rhs_map.array() / rhs_map.array();
```
I'm seeing weird outputs :/
Is there something wrong with the computation above?
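For what it's worth, the two lines quoted above do match the calculus for z = lhs / rhs: ∂z/∂lhs = 1/rhs and ∂z/∂rhs = -lhs/rhs². A quick standalone way to sanity-check them against TensorFlow's own gradients (a sketch for debugging, not part of the PR):

```python
import numpy as np
import tensorflow as tf

lhs = tf.constant(np.random.rand(4).astype(np.float32) + 1.0)
rhs = tf.constant(np.random.rand(4).astype(np.float32) + 1.0)  # keep away from zero
g = tf.constant(np.random.rand(4).astype(np.float32))          # incoming gradient

with tf.GradientTape() as tape:
    tape.watch([lhs, rhs])
    out = lhs / rhs
lhs_grad, rhs_grad = tape.gradient(out, [lhs, rhs], output_gradients=g)

# Compare against the analytic formulas from the snippet above.
np.testing.assert_allclose(lhs_grad, g / rhs, rtol=1e-5)
np.testing.assert_allclose(rhs_grad, g * -lhs / rhs / rhs, rtol=1e-5)
```

If this check passes, the formulas are fine and the weird outputs likely come from elsewhere (e.g., buffer shapes or the order the gradients are written).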
@nnfw-bot test tizen-gbs
force-pushed from 90993b0 to df237cc
force-pushed from 846cbba to 1c0a425
@nnfw-bot test onert-cross-debug
@nnfw-bot test onert-cross-release
TODO: try other optimizers for division
Signed-off-by: YongHyun An <yonghyunz.an@samsung.com>
I've tried the following optimizers for division, but all of them showed loss values differing from TensorFlow's :/
@nnfw-bot test onert-cross-debug
@nnfw-bot test onert-cross-release
Draft PR for supporting Add training