
What should I do after train_ofa_net #57

Open
detectRecog opened this issue Jul 20, 2021 · 5 comments


detectRecog commented Jul 20, 2021

I ran train_ofa_net.py and there are three folders under 'exp/': 'kernel2kernel_depth', 'kernel_depth2kernel_depth_width', 'normal2kernel'. What should I do next? After training, each exp subfolder contains 'checkpoint', 'logs', 'net.config', 'net_info.txt' and 'run.config'. Does anybody know how I should deal with these?

I cannot find any relation between the training results under 'exp/' and 'eval_ofa_net.py'. Please help this poor kid. \doge


Bixiii commented Jul 22, 2021

As far as I can tell, the folders are the different stages of the progressive shrinking algorithm; kernel2kernel_depth is the training step that goes from elastic kernel to elastic kernel + elastic depth. In the checkpoint folder you can find the trained models; model_best.pth.tar should be the final model for that step. When you want to evaluate a model you trained yourself, you have to load it in the eval_ofa_net.py script.
For that you can just replace

ofa_network = ofa_net(args.net, pretrained=True)

with something that loads your own network. Maybe something like this would work:

# import paths as in the public once-for-all repo (may differ by version)
import torch
from ofa.imagenet_classification.elastic_nn.networks import OFAMobileNetV3

# the same supernet configuration that was used for training
ofa_network = OFAMobileNetV3(
    ks_list=[3, 5, 7],
    expand_ratio_list=[3, 4, 6],
    depth_list=[2, 3, 4],
)
# weights from the last progressive-shrinking stage
ckpt = 'exp/kernel_depth2kernel_depth_width/phase2/checkpoint/model_best.pth.tar'
init = torch.load(ckpt, map_location='cpu')['state_dict']
ofa_network.load_state_dict(init)
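
Once the weights are loaded, eval_ofa_net.py evaluates one specific sub-network at a time. Roughly (assuming the set_active_subnet / get_active_subnet API of the public once-for-all repo), that looks like:

# choose one sub-network configuration (the values must come from the
# ks_list / expand_ratio_list / depth_list used during training)
ofa_network.set_active_subnet(ks=7, e=6, d=4)

# extract it as a standalone PyTorch module and run the usual
# validation loop from eval_ofa_net.py on it
subnet = ofa_network.get_active_subnet(preserve_weight=True)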


detectRecog commented Jul 23, 2021


You're so kind. Thank you very much for your reply; I've been waiting every day for someone to save me. Does this mean I should train the different stages sequentially, resuming from the best checkpoint of the previous stage? Currently I train the stages in parallel, which is why I struggled to find the relation between the checkpoints of the different stages.

@Bixiii

@Jon-drugstore


Do you have any idea about the details of the latency predictor model, i.e. how to build that network? Thanks for your reply!


pyjhzwh commented Sep 7, 2021


In my understanding, once-for-all/ofa/nas/efficiency_predictor/latency_lookup_table.py describes how they estimate the latency. For ResNet50, they just count FLOPs as a stand-in for latency.
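
Just to illustrate the FLOPs-as-latency idea (a toy sketch, not the repo's actual lookup-table code), counting multiply-accumulates per conv layer looks roughly like this:

# toy MAC counter for a 2D convolution, used as a stand-in latency proxy
def conv2d_macs(in_ch, out_ch, kernel_size, out_h, out_w, groups=1):
    # each output element needs (in_ch / groups) * kernel_size^2 MACs
    return out_h * out_w * out_ch * (in_ch // groups) * kernel_size ** 2

# example: a 3x3 conv, 64 -> 128 channels, 56x56 output feature map
print(conv2d_macs(64, 128, 3, 56, 56) / 1e6, 'MMACs')  # ~231.2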


pyjhzwh commented Sep 7, 2021


I guess so, the stages should be run sequentially. From task 'kernel' to 'depth', the depth_list gets more choices, and from 'depth' to 'expand', the expand_ratio_list gets more choices. So I guess we should run task 'kernel' first, then 'depth', and finally 'expand'.
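
For reference, my understanding of the stage order and the checkpoint handoff, based only on the folder names above (a rough sketch; the phase subfolders and exact resume paths are assumptions):

# progressive shrinking stages, in the order they build on each other
STAGES = [
    ('normal2kernel', 'full network / pretrained teacher'),
    ('kernel2kernel_depth', 'exp/normal2kernel/checkpoint/model_best.pth.tar'),
    ('kernel_depth2kernel_depth_width',
     'exp/kernel2kernel_depth/phase2/checkpoint/model_best.pth.tar'),
]
for stage, resume_from in STAGES:
    print(stage, '<- resumes from:', resume_from)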
