@vysarge has run a number of benchmark and profiling experiments on Merlin Models using synthetic data. In particular, she compared DLRM with the JoC DLRM TF implementation; the experiment results can be found in this spreadsheet (Nvidia internal only).
She noticed that:

- With the dropout layer enabled (rate 0.08), each MM iteration takes 10.1 ms longer.
- The majority of this extra time is attributable to additional calls to `Mul_GPU_DT_FLOAT_DT_FLOAT_kernel`.
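The extra `Mul` kernels are consistent with how inverted dropout is typically implemented: during training, each activation is multiplied by a random keep mask and rescaled by `1 / (1 - rate)`, both of which are elementwise multiplies. A minimal stdlib-only Python sketch of that idea (an illustration of the general technique, not Merlin's or TF's actual code):

```python
import random

def inverted_dropout(x, rate=0.08, training=True, seed=None):
    """Inverted dropout: zero each element with probability `rate` and
    scale the survivors by 1/(1-rate) so the expected value is unchanged.

    On a GPU, the mask-and-scale step is where the additional elementwise
    Mul kernel launches come from during training; at inference time
    (training=False) dropout is a no-op and those kernels disappear.
    """
    if not training or rate == 0.0:
        return list(x)
    rng = random.Random(seed)
    keep = 1.0 - rate
    # Elementwise: x * mask * (1/keep) -- the extra multiplies.
    return [(xi / keep if rng.random() >= rate else 0.0) for xi in x]
```

This is why the 10.1 ms gap should only appear in training-mode iterations; a quick sanity check is to profile the same model with `training=False` (or dropout rate 0) and confirm the extra `Mul` calls vanish.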