@vysarge has run a number of benchmark and profiling experiments on Merlin Models using synthetic data. In particular, she compared DLRM with the JoC DLRM TF implementation; the experiment results can be found in this spreadsheet (Nvidia internal only).
She noticed that:

- With the dropout layer enabled (rate 0.08), each MM iteration takes 10.1 ms longer.
- The majority of this extra time is attributable to additional calls to `Mul_GPU_DT_FLOAT_DT_FLOAT_kernel`.
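The extra `Mul` kernels are consistent with how inverted dropout is typically implemented: during training, each activation is multiplied by a random keep mask and rescaled by `1 / (1 - rate)`, both of which are elementwise multiplies. A minimal stdlib-only Python sketch of that idea (an illustration of the general technique, not Merlin's or TF's actual code):

```python
import random

def inverted_dropout(x, rate=0.08, training=True, seed=None):
    """Inverted dropout: zero each element with probability `rate` and
    scale the survivors by 1/(1-rate) so the expected value is unchanged.

    On a GPU, the mask-and-scale step is where the additional elementwise
    Mul kernel launches come from during training; at inference time
    (training=False) dropout is a no-op and those kernels disappear.
    """
    if not training or rate == 0.0:
        return list(x)
    rng = random.Random(seed)
    keep = 1.0 - rate
    # Elementwise: x * mask * (1/keep) -- the extra multiplies.
    return [(xi / keep if rng.random() >= rate else 0.0) for xi in x]
```

This is why the 10.1 ms gap should only appear in training-mode iterations; a quick sanity check is to profile the same model with `training=False` (or dropout rate 0) and confirm the extra `Mul` calls vanish.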