Training does not match Matterport #3

Open
bileki opened this issue Jan 16, 2020 · 0 comments

bileki commented Jan 16, 2020

I'm training a resnet50 (custom backbone/configs/maskrcnn.cfg) on a grocery products dataset.
Even after several attempts, the model stagnates at a val_loss close to 3.5, whereas the same training with the same parameters in Matterport behaves very differently (better).

In addition, the weight files differ in size: 172MB with Nearthlab versus 180MB with Matterport.
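
As a rough sanity check on that size gap, the difference can be converted into a parameter count. This is only a sketch: it assumes both files store nothing but float32 weights (4 bytes each) and that "MB" means 10^6 bytes. Neither assumption is confirmed here, and saved optimizer state or file metadata would change the picture.

```python
# Rough estimate only: assumes float32 weights and MB = 10**6 bytes (both assumptions).
BYTES_PER_FLOAT32 = 4

def params_from_size_gap(size_a_mb: float, size_b_mb: float) -> int:
    """Return the approximate number of float32 values that account for the gap."""
    gap_bytes = abs(size_a_mb - size_b_mb) * 10**6
    return round(gap_bytes / BYTES_PER_FLOAT32)

gap = params_from_size_gap(172, 180)
print(f"~{gap:,} float32 values")  # ~2,000,000 float32 values
```

Under those assumptions, the 8MB gap corresponds to roughly two million values, which may point to extra or missing layers rather than renamed ones.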

Given that the only difference between the two frameworks for resnet50 is the layer names, what could be causing this problem?

Is there any optimization during training that modifies the weights?

I ask because I noticed that Matterport's resnet50 doesn't work on a Jetson Nano, whereas Nearthlab's does.

Am I doing something wrong? Is there something else I should set to make it work?

Logs:

Nearthlab :

Epoch 1/100
100/100 [==============================] - 113s 1s/step - loss: 9.2259 - val_loss: 6.9288
Epoch 2/100
100/100 [==============================] - 58s 580ms/step - loss: 7.7932 - val_loss: 5.3660
Epoch 3/100
100/100 [==============================] - 59s 590ms/step - loss: 6.9681 - val_loss: 4.7230
Epoch 4/100
100/100 [==============================] - 59s 593ms/step - loss: 6.2129 - val_loss: 4.0161
Epoch 5/100
100/100 [==============================] - 59s 587ms/step - loss: 5.5432 - val_loss: 3.9557
Epoch 6/100
100/100 [==============================] - 59s 586ms/step - loss: 5.0773 - val_loss: 3.4220
Epoch 7/100
100/100 [==============================] - 58s 582ms/step - loss: 4.5639 - val_loss: 3.5953
Epoch 8/100
100/100 [==============================] - 59s 590ms/step - loss: 4.0430 - val_loss: 3.5199
Epoch 9/100
100/100 [==============================] - 58s 582ms/step - loss: 3.8537 - val_loss: 3.5656
Epoch 10/100
100/100 [==============================] - 57s 574ms/step - loss: 3.8129 - val_loss: 3.4385
..... val_loss stagnates
Epoch 15/100
100/100 [==============================] - 58s 576ms/step - loss: 3.1963 - val_loss: 3.9125
.....
Epoch 38/100
100/100 [==============================] - 61s 610ms/step - loss: 2.8903 - val_loss: 3.3682
.....
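
To make the stagnation easier to see, the gap between val_loss and loss can be tabulated from a few of the Nearthlab epochs above. This is a minimal sketch using values copied from the log. The helper name `val_gap` is made up for illustration, and the result only describes the curves, not the cause.

```python
# (epoch, loss, val_loss) copied from the Nearthlab log above.
nearthlab = [
    (1, 9.2259, 6.9288),
    (6, 5.0773, 3.4220),
    (10, 3.8129, 3.4385),
    (15, 3.1963, 3.9125),
    (38, 2.8903, 3.3682),
]

def val_gap(rows):
    """Return (epoch, val_loss - loss) pairs for each logged epoch."""
    return [(epoch, round(val - train, 4)) for epoch, train, val in rows]

for epoch, gap in val_gap(nearthlab):
    print(f"epoch {epoch:>2}: val_loss - loss = {gap:+.4f}")
```

The gap is negative in the early epochs and turns positive by epoch 15: the training loss keeps falling while val_loss stays around 3.4 to 3.9.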

Matterport:

Epoch 1/100
100/100 [==============================] - 112s 1s/step - loss: 3.8189 - rpn_class_loss: 0.3061 - rpn_bbox_loss: 0.8670 - mrcnn_class_loss: 1.1758 - mrcnn_bbox_loss: 0.8026 - mrcnn_mask_loss: 0.6674 - val_loss: 3.4789 - val_rpn_class_loss: 0.1830 - val_rpn_bbox_loss: 0.6418 - val_mrcnn_class_loss: 1.2534 - val_mrcnn_bbox_loss: 0.7054 - val_mrcnn_mask_loss: 0.6953
Epoch 2/100
100/100 [==============================] - 95s 945ms/step - loss: 3.3036 - rpn_class_loss: 0.1090 - rpn_bbox_loss: 0.5533 - mrcnn_class_loss: 1.2661 - mrcnn_bbox_loss: 0.6808 - mrcnn_mask_loss: 0.6943 - val_loss: 3.3427 - val_rpn_class_loss: 0.0787 - val_rpn_bbox_loss: 0.5562 - val_mrcnn_class_loss: 1.3297 - val_mrcnn_bbox_loss: 0.6842 - val_mrcnn_mask_loss: 0.6938
Epoch 3/100
100/100 [==============================] - 94s 936ms/step - loss: 3.1970 - rpn_class_loss: 0.0686 - rpn_bbox_loss: 0.4841 - mrcnn_class_loss: 1.2985 - mrcnn_bbox_loss: 0.6518 - mrcnn_mask_loss: 0.6940 - val_loss: 3.0237 - val_rpn_class_loss: 0.0571 - val_rpn_bbox_loss: 0.4428 - val_mrcnn_class_loss: 1.1865 - val_mrcnn_bbox_loss: 0.6436 - val_mrcnn_mask_loss: 0.6938
Epoch 4/100
100/100 [==============================] - 95s 947ms/step - loss: 3.1986 - rpn_class_loss: 0.0728 - rpn_bbox_loss: 0.4853 - mrcnn_class_loss: 1.3238 - mrcnn_bbox_loss: 0.6261 - mrcnn_mask_loss: 0.6905 - val_loss: 3.0983 - val_rpn_class_loss: 0.0509 - val_rpn_bbox_loss: 0.4753 - val_mrcnn_class_loss: 1.2831 - val_mrcnn_bbox_loss: 0.5961 - val_mrcnn_mask_loss: 0.6928
Epoch 5/100
100/100 [==============================] - 91s 914ms/step - loss: 3.0763 - rpn_class_loss: 0.0421 - rpn_bbox_loss: 0.4181 - mrcnn_class_loss: 1.3182 - mrcnn_bbox_loss: 0.6055 - mrcnn_mask_loss: 0.6924 - val_loss: 3.0088 - val_rpn_class_loss: 0.0330 - val_rpn_bbox_loss: 0.4035 - val_mrcnn_class_loss: 1.2702 - val_mrcnn_bbox_loss: 0.6103 - val_mrcnn_mask_loss: 0.6918
Epoch 6/100
100/100 [==============================] - 90s 902ms/step - loss: 3.0606 - rpn_class_loss: 0.0542 - rpn_bbox_loss: 0.4186 - mrcnn_class_loss: 1.3160 - mrcnn_bbox_loss: 0.5800 - mrcnn_mask_loss: 0.6919 - val_loss: 2.8770 - val_rpn_class_loss: 0.0678 - val_rpn_bbox_loss: 0.4394 - val_mrcnn_class_loss: 1.1240 - val_mrcnn_bbox_loss: 0.5563 - val_mrcnn_mask_loss: 0.6894
Epoch 7/100
100/100 [==============================] - 94s 940ms/step - loss: 2.9551 - rpn_class_loss: 0.0400 - rpn_bbox_loss: 0.4016 - mrcnn_class_loss: 1.2583 - mrcnn_bbox_loss: 0.5634 - mrcnn_mask_loss: 0.6917 - val_loss: 2.9014 - val_rpn_class_loss: 0.0453 - val_rpn_bbox_loss: 0.3620 - val_mrcnn_class_loss: 1.2578 - val_mrcnn_bbox_loss: 0.5445 - val_mrcnn_mask_loss: 0.6919
Epoch 8/100
100/100 [==============================] - 91s 908ms/step - loss: 2.9892 - rpn_class_loss: 0.0621 - rpn_bbox_loss: 0.4153 - mrcnn_class_loss: 1.2762 - mrcnn_bbox_loss: 0.5460 - mrcnn_mask_loss: 0.6895 - val_loss: 2.7237 - val_rpn_class_loss: 0.0407 - val_rpn_bbox_loss: 0.3425 - val_mrcnn_class_loss: 1.1253 - val_mrcnn_bbox_loss: 0.5290 - val_mrcnn_mask_loss: 0.6863
Epoch 9/100
100/100 [==============================] - 92s 920ms/step - loss: 2.7157 - rpn_class_loss: 0.0341 - rpn_bbox_loss: 0.3368 - mrcnn_class_loss: 1.1446 - mrcnn_bbox_loss: 0.5110 - mrcnn_mask_loss: 0.6891 - val_loss: 2.8300 - val_rpn_class_loss: 0.0302 - val_rpn_bbox_loss: 0.3824 - val_mrcnn_class_loss: 1.1554 - val_mrcnn_bbox_loss: 0.5717 - val_mrcnn_mask_loss: 0.6903
Epoch 10/100
100/100 [==============================] - 90s 903ms/step - loss: 2.8210 - rpn_class_loss: 0.0414 - rpn_bbox_loss: 0.3935 - mrcnn_class_loss: 1.1850 - mrcnn_bbox_loss: 0.5123 - mrcnn_mask_loss: 0.6889 - val_loss: 2.5782 - val_rpn_class_loss: 0.0272 - val_rpn_bbox_loss: 0.3071 - val_mrcnn_class_loss: 1.0568 - val_mrcnn_bbox_loss: 0.4993 - val_mrcnn_mask_loss: 0.6878
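
Since the Matterport log reports each component loss, one thing that can be checked is whether the components add up to the reported total. The sketch below parses epoch 1 of the Matterport log with a regular expression. The helper `parse_metrics` is hypothetical and not part of either framework.

```python
import re

# Matches "name: 1.2345" pairs in a Keras progress line.
METRIC_RE = re.compile(r"(\w+): (\d+\.\d+)")

def parse_metrics(line: str) -> dict:
    """Extract metric names and float values from one Keras progress line."""
    return {name: float(value) for name, value in METRIC_RE.findall(line)}

# Epoch 1 training metrics from the Matterport log above (progress bar omitted).
line = ("loss: 3.8189 - rpn_class_loss: 0.3061 - rpn_bbox_loss: 0.8670 - "
        "mrcnn_class_loss: 1.1758 - mrcnn_bbox_loss: 0.8026 - mrcnn_mask_loss: 0.6674")

metrics = parse_metrics(line)
components = sum(v for k, v in metrics.items() if k != "loss")
print(f"total={metrics['loss']:.4f} components={components:.4f}")
```

For this epoch the five components sum to the reported total of 3.8189. The Nearthlab log only shows the total, so this check cannot tell whether its much higher starting loss (9.2259) comes from additional terms or different loss weights. Logging the component losses there as well would be needed to compare the two runs.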
