@vysarge has run a number of benchmark and profiling experiments on Merlin Models using synthetic data. In particular, she compared the Merlin Models DLRM with the JoC DLRM TF implementation; the experiment results can be found in this spreadsheet (Nvidia internal only).
She noticed in particular that the MM implementation uses the TF embedding API functions, while JoC uses a custom joint embedding that fuses the embedding tables together and performs all embedding lookups jointly in a single call [code link], which is faster.
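To illustrate the idea of a joint embedding, here is a minimal NumPy sketch. It is not the JoC code. It assumes every table has the same embedding dimension and that each feature contributes exactly one id per example; the names `build_joint_table` and `joint_lookup` are hypothetical. The tables are stacked row-wise into one array, each feature's ids are shifted by that table's row offset, and a single gather replaces the per-table lookups.

```python
import numpy as np


def build_joint_table(tables):
    """Stack per-feature tables (same embedding dim assumed) into one array.

    Returns the stacked table and the starting row of each original table.
    """
    sizes = [t.shape[0] for t in tables]
    offsets = np.concatenate(([0], np.cumsum(sizes)[:-1]))
    return np.concatenate(tables, axis=0), offsets


def joint_lookup(joint_table, offsets, ids):
    """Look up all features with one gather.

    ids has shape (batch, num_features), one id per feature.
    The result has shape (batch, num_features, dim).
    """
    return joint_table[ids + offsets]


# Three hypothetical categorical features with different cardinalities.
rng = np.random.default_rng(0)
tables = [rng.normal(size=(n, 4)) for n in (10, 7, 5)]
ids = np.array([[3, 0, 4], [9, 6, 1]])

joint_table, offsets = build_joint_table(tables)
fused = joint_lookup(joint_table, offsets, ids)

# Reference: one lookup per table, as with separate embedding layers.
separate = np.stack([t[ids[:, i]] for i, t in enumerate(tables)], axis=1)
```

Both paths return the same values; the potential gain comes from issuing one lookup instead of one per table, which matters more as the number of categorical features grows.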