I have encountered numerous 'record_param_comms' nodes in Chakra ET, which appear as child nodes of collective communication nodes. I presume that these nodes are intended to log communication information, such as the communication domain for collective communications, the peer in point-to-point communications, the communication volume, and other parameters. However, this is only speculation, as I have not been able to find where these functions are invoked within PyTorch. How is this information used within Chakra?
As you have mentioned, record_param_comms appears alongside collective communication nodes; in the trace it is the parent node of the actual collective communication node. Chakra does not use record_param_comms directly. Its presence makes communication nodes easy to identify, and it serves as a bridge between the actual collective communication node and that node's own parent. We plan to remove record_param_comms from traces by updating the PyTorch profiler.
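To illustrate the "easy to identify" point, here is a minimal sketch of how one might locate communication nodes through their record_param_comms parent. It is not Chakra's code: the function `find_comm_children`, the dictionary keys `id`, `name`, and `parent`, and the node names in the sample trace are all assumptions made for the example, and a real trace may use a different schema.

```python
from collections import defaultdict


def find_comm_children(nodes, marker="record_param_comms"):
    """Map each marker node's id to the names of its direct children.

    `nodes` is a list of dicts with hypothetical keys "id", "name",
    and "parent" (a simplified stand-in for a real trace schema).
    """
    # Index every node by the id of its parent.
    children = defaultdict(list)
    for node in nodes:
        children[node["parent"]].append(node)

    # For each marker node, collect the names of the nodes directly below it.
    result = {}
    for node in nodes:
        if node["name"] == marker:
            result[node["id"]] = [c["name"] for c in children[node["id"]]]
    return result


# Illustrative trace: an operator, the bridge node, and the communication node.
trace = [
    {"id": 1, "name": "c10d::allreduce_", "parent": 0},
    {"id": 2, "name": "record_param_comms", "parent": 1},
    {"id": 3, "name": "nccl:all_reduce", "parent": 2},
    {"id": 4, "name": "aten::add", "parent": 0},
]

comm = find_comm_children(trace)
print(comm)
```

In this toy trace, node 2 is the bridge: its parent is the calling operator and its only child is the communication node, which is the relationship described above.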