
rfc: propose split op for oneDNN Graph API #1628

Open

xiang1guo wants to merge 1 commit into rfcs from xiang/rfcs/graph-split-op
Conversation

xiang1guo
Contributor

This RFC is to add a new Split operation to oneDNN graph API opset.

The rendered version: link

@xiang1guo
Contributor Author

Can you help to review this RFC? Thanks! @igorsafo @mgouicem @dzarukin

igorsafo added the RFC (A design document) label on Apr 24, 2023
@chunyuan-w

  1. PyTorch split OP has a split_size_or_sections input:
  • When it is a list, it is aligned with the size_splits attr here.
  • When it is an int, it seems that there's no corresponding attr:
    • PyTorch int split_size_or_sections:

      • It is the size of each chunk and denotes that the input will be split into equally sized chunks (if possible).
      • The last chunk will be smaller if the tensor size along the given dimension dim is not divisible by split_size.
    • num_splits in oneDNN graph:

      • It is the number of chunks.
      • The dimension of the input tensor's data shape along axis must be evenly divisible by the num_splits attribute.

In static shape cases, if the input shape is known, the integration could try to convert an int split_size_or_sections to num_splits, provided the dimension is evenly divisible by num_splits. In dynamic shape cases, I feel it will be hard for the integration to map the OP to oneDNN graph.

  2. For PyTorch, each output chunk is a view of the original tensor. If that is not the case in oneDNN graph, may I know if this will be mentioned anywhere in the document?
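As an illustration of the static-shape conversion described in point 1, here is a minimal sketch (the helper name is hypothetical, not existing bridge code):

```python
def split_size_to_num_splits(dim_along_axis: int, split_size: int) -> int:
    """Map PyTorch's int split_size_or_sections to oneDNN Graph's num_splits (static shapes only)."""
    if dim_along_axis % split_size != 0:
        # The last chunk would be smaller, which num_splits cannot express.
        raise ValueError("dimension not evenly divisible; cannot map to num_splits")
    return dim_along_axis // split_size

print(split_size_to_num_splits(12, 4))  # 3 chunks of size 4
```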

@sanchitintel

@xiang1guo, we plan to emulate dynamic shape support in IPEX with multiple compiled partitions for various static shapes.
Since the num_splits attribute is to be provided at the time of graph creation, IPEX would provide its value for the first input but then it might change for a later input! :(

cc @chunyuan-w

@xiang1guo
Contributor Author

xiang1guo commented Apr 26, 2023

> 1. PyTorch split OP has a split_size_or_sections input:
>    • When it is a list, it is aligned with the size_splits attr here.
>    • When it is an int, it seems that there's no corresponding attr:
>      • PyTorch int split_size_or_sections: it is the size of each chunk, denoting that the input will be split into equally sized chunks (if possible); the last chunk will be smaller if the tensor size along the given dimension dim is not divisible by split_size.
>      • num_splits in oneDNN graph: it is the number of chunks; the dimension of the input tensor's data shape along axis must be evenly divisible by the num_splits attribute.
>
> In static shape cases, if the input shape is known, the integration could try to convert an int split_size_or_sections to num_splits, provided the dimension is evenly divisible by num_splits. In dynamic shape cases, I feel it will be hard for the integration to map the OP to oneDNN graph.
>
> 2. For PyTorch, each output chunk is a view of the original tensor. If that is not the case in oneDNN graph, may I know if this will be mentioned anywhere in the document?

For 1, as you mentioned in the meeting, we can add another attribute to align with PyTorch's meaning. It helps with mapping, but it still can't handle the case where the axis dimension of the input tensor is dynamic. @ZhennanQin @yifeizh2 Do you have any comments?

For 2, yes, the graph API doesn't guarantee view or copy semantics. I will add more description of this. Thanks for the reminder!

@ZhennanQin
Contributor

> For 1, as you mentioned in the meeting, we can add another attribute to align with PyTorch's meaning. It helps with mapping, but it still can't handle the case where the axis dimension of the input tensor is dynamic.

What do you mean by "the axis dimension of the input tensor is dynamic"? Can you provide an example for better understanding?

@xiang1guo
Contributor Author

> For 1, as you mentioned in the meeting, we can add another attribute to align with PyTorch's meaning. It helps with mapping, but it still can't handle the case where the axis dimension of the input tensor is dynamic.
>
> What do you mean by "the axis dimension of the input tensor is dynamic"? Can you provide an example for better understanding?

For dynamic input data [1, 28, ?] (? means this dimension is only known at the execution stage) and axis=2.

@ZhennanQin
Contributor

> For 1, as you mentioned in the meeting, we can add another attribute to align with PyTorch's meaning. It helps with mapping, but it still can't handle the case where the axis dimension of the input tensor is dynamic.
>
> What do you mean by "the axis dimension of the input tensor is dynamic"? Can you provide an example for better understanding?
>
> For dynamic input data [1, 28, ?] (? means this dimension is only known at the execution stage) and axis=2.

I see. Thanks for the example. In this case, the number of outputs from split is dynamic; how is this represented in the framework graph?

xiang1guo closed this on Apr 26, 2023
xiang1guo reopened this on Apr 26, 2023
@xiang1guo
Contributor Author

> For 1, as you mentioned in the meeting, we can add another attribute to align with PyTorch's meaning. It helps with mapping, but it still can't handle the case where the axis dimension of the input tensor is dynamic.
>
> What do you mean by "the axis dimension of the input tensor is dynamic"? Can you provide an example for better understanding?
>
> For dynamic input data [1, 28, ?] (? means this dimension is only known at the execution stage) and axis=2.
>
> I see. Thanks for the example. In this case, the number of outputs from split is dynamic; how is this represented in the framework graph?

It seems we don't have any real case for this yet. @chunyuan-w @sanchitintel Do you have any real case about it?

@wuxun-zhang
Contributor

> @xiang1guo, we plan to emulate dynamic shape support in IPEX with multiple compiled partitions for various static shapes. Since the num_splits attribute is to be provided at the time of graph creation, IPEX would provide its value for the first input but then it might change for a later input! :(
>
> cc @chunyuan-w

Hi @sanchitintel, by saying "then it might change for a later input", do you mean that num_splits will no longer be an attribute? How is it handled in the bridge?

@sanchitintel

> by saying "then it might change for a later input", do you mean that num_splits will no longer be an attribute? How is it handled in the bridge?

Hi @wuxun-zhang, if the input size changes, then num_splits may change as well, but since the framework would have passed a value of num_splits corresponding to the first input encountered at graph creation time, I think we would run into a shape-mismatch issue at runtime.

So, it'd be great if we could pass this value at compile time instead. IIRC, something similar is done for DynamicReshape as well.

@sanchitintel

sanchitintel commented Apr 27, 2023

> In this case, the number of outputs from split is dynamic; how is this represented in the framework graph?

@ZhennanQin, currently, the IPEX/PyTorch JIT framework graph is created by tracing with a sample input, so the framework graph's nodes are specific to that input; for the case you mentioned, another input with a different shape would have generated a different graph.

So, if we were to lower such an op (one that would generate a different number of tensors) to oneDNN Graph, we would encounter a runtime error with an input tensor of a different shape than the one profiled. (However, if we didn't lower such an op to oneDNN Graph, those ops would be handled by PyTorch/IPEX eager mode.)

@frost-intel @chunyuan-w, do you know how Dynamo handles such cases? Thanks!

@chunyuan-w

chunyuan-w commented Apr 27, 2023

> In this case, the number of outputs from split is dynamic; how is this represented in the framework graph?
>
> @ZhennanQin, currently, the IPEX/PyTorch JIT framework graph is created by tracing with a sample input, so the framework graph's nodes are specific to that input; for the case you mentioned, another input with a different shape would have generated a different graph.
>
> So, if we were to lower such an op (one that would generate a different number of tensors) to oneDNN Graph, we would encounter a runtime error with an input tensor of a different shape than the one profiled. (However, if we didn't lower such an op to oneDNN Graph, those ops would be handled by PyTorch/IPEX eager mode.)
>
> @frost-intel @chunyuan-w, do you know how Dynamo handles such cases? Thanks!

I would suggest adding an attribute split_size to match the int type split_size_or_sections in PyTorch.

  • When the TorchScript IR indicates that this int split_size_or_sections is a Constant, the bridge will set the split_size attr of the Split OP to the Constant value, which is known at compilation time.
  • If it is not a Constant according to the TorchScript IR, we're not able to map it to the static attr. Another runtime representation of split_size will be needed, a bit similar to the discussion on the exponent of the Pow OP.
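A minimal sketch of the bridge-side decision described above; the Value type and map_split helper are hypothetical stand-ins, not existing TorchScript or bridge APIs:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Value:
    """Stand-in for a TorchScript IR value; `constant` is set when it is a prim::Constant."""
    constant: Optional[int] = None

def map_split(split_size_or_sections: Value) -> dict:
    """Return the attribute the bridge would set on the proposed oneDNN Graph Split op."""
    if split_size_or_sections.constant is not None:
        # Known at graph build time: map directly to the proposed `split_size` attr.
        return {"split_size": split_size_or_sections.constant}
    # Not a Constant in the TorchScript IR: a static attribute cannot express it,
    # so a runtime representation would be needed (cf. the Pow exponent discussion).
    raise NotImplementedError("runtime split_size is not representable as a static attr")

print(map_split(Value(constant=64)))  # {'split_size': 64}
```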

@frost-intel

> @frost-intel @chunyuan-w, do you know how Dynamo handles such cases? Thanks!

Dynamo can either use a constant input size similar to PT JIT, or it can have a dynamic size marked as a SymInt object with undetermined size. We haven't explored dynamic sizes in our Dynamo/oneDNN Graph bridge, but an attribute or flag marking a dimension as Dynamic would likely be required.

@sanchitintel

> • If it is not a Constant according to the TorchScript IR, we're not able to map it to the static attr. Another runtime representation of split_size will be needed, a bit similar to the discussion on the exponent of the Pow OP.

Thanks, @chunyuan-w! If split_size_or_sections varies for different input-sizes, then it'd not be constant in TorchScript IR, but would be generated with a combination of ops including aten::size. So, I agree with your proposal, except that in the multiple-compiled-partitions-for-different-static-input-shapes implementation, split_size would also be available at compile time (as opposed to run-time).

Thanks for your inputs, @frost-intel!

xiang1guo force-pushed the xiang/rfcs/graph-split-op branch 2 times, most recently from a8d4b09 to 4d27c53 on May 6, 2023 06:54
@xiang1guo
Contributor Author

xiang1guo commented May 6, 2023

@chunyuan-w @sanchitintel @retonym @ZhennanQin @yifeizh2 I just updated the RFC and added an option 2 per the PyTorch team's request. Please help review option 2 again; if confirmed, our final implementation will be based on this option.

| Kind | Name | Description | Required or optional | Data types |
|:--|:--|:--|:--|:--|
| input | `src` | Input data. | Required | f32, f16, bf16* |
| attribute | `axis` | Specifies which dimension to split along. | Required | s64 |
| attribute | `num_splits` | Number of outputs into which the input tensor data will be split along the axis dimension. | Optional | s64 |
| attribute | `size_split` | Size of a single output tensor along the axis dimension. | Optional | s64 |
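As an illustration (not part of the RFC text), a small NumPy sketch of how the two optional attributes would determine the outputs, assuming size_split follows PyTorch's int split_size behavior where the last chunk may be smaller:

```python
import numpy as np

x = np.arange(12).reshape(1, 1, 12)
axis = 2

# num_splits = 3: the axis dimension (12) must be evenly divisible, giving 3 outputs of size 4.
outs_by_num = np.split(x, 3, axis=axis)

# size_split = 5: chunks of size 5 along the axis, with a smaller last chunk (sizes 5, 5, 2).
split_points = np.arange(5, x.shape[axis], 5)
outs_by_size = np.split(x, split_points, axis=axis)

print([o.shape[axis] for o in outs_by_num])   # [4, 4, 4]
print([o.shape[axis] for o in outs_by_size])  # [5, 5, 2]
```
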
Contributor

During the graph build stage, what if size_split is specified while the input shape is unknown? How do we determine the number of outputs?

@xiang1guo
Contributor Author
May 8, 2023

The case you mentioned is a dynamic-shape case that is not supported and has not been encountered yet. Once the dynamic shape feature lands in a future release, we can take it into consideration. cc @wuxun-zhang.
Option 2 here is for the convenience of integration on the IPEX side, which can simply map split_size_or_sections into size_split during the build stage.

Contributor

I think it's not a dynamic shape case. The framework won't provide concrete shapes during the graph build stage, even for models with static shapes.

Contributor

> During the graph build stage, what if size_split is specified while the input shape is unknown? How do we determine the number of outputs?

Why do we need to determine the number of outputs at the graph build stage?

Contributor

> Why do we need to determine the number of outputs at the graph build stage?

If you don't know the number of outputs, how can you build the graph after this op? A simple example is split+concat: how do you collect all the outputs from split and feed them into the concat op?
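To make the split+concat concern concrete, here is a toy sketch (hypothetical names, not the oneDNN Graph API) showing that a downstream concat can only be wired up once the number of split outputs is known, assuming PyTorch-style size_split where the last chunk may be smaller:

```python
import math

def wire_split_concat(dim_along_axis: int, size_split: int) -> list:
    """Enumerate split outputs so a downstream concat can consume them."""
    num_outputs = math.ceil(dim_along_axis / size_split)
    split_outputs = [f"split_out_{i}" for i in range(num_outputs)]
    return split_outputs  # the concat needs this full, fixed list of inputs

print(wire_split_concat(12, 5))  # ['split_out_0', 'split_out_1', 'split_out_2']
# If the axis dimension is unknown ('?') at graph build time, num_outputs cannot be
# computed, so the concat's inputs cannot be enumerated while building the graph.
```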

Contributor

I see, it's related to how the bridge passes the framework's graph to oneDNN Graph. We may need the framework team to help answer. cc @chunyuan-w @sanchitintel @retonym.


Do I understand correctly that this is a case where the graph structure depends on the input shapes? In the current integration, the graph is only built once; I feel like that no longer works in such cases. Let me check with @jgong5 & @sanchitintel on it.


@chunyuan-w, I agree that this case would encounter a runtime error with our multiple compiled partitions per partition approach. I think we can't solve this problem without re-creating the graph.


#### operations used in Frameworks and toolkit

| Tensorflow | Argument Name | Description |


As TF workloads move to the XLA compiler path, and PyTorch moves to TorchDynamo/TorchInductor, framework operations are decomposed into XLA HLO ops or PyTorch PrimTorch ops. These operations usually have simpler semantics. The oneDNN Graph operation design should first consider which basic operations we need to support; those operation sets usually give better insight into which operations are popular (unpopular ones may not be lowered to these simple ops) and what the basic operations look like.

Could you please also include how these two sets of operations support split?

@xiang1guo
Contributor Author
May 10, 2023

Hi @Jianhui-Li, I just checked XLA and PrimTorch; these two operation sets don't directly support Split.

  • TensorFlow's Split will be lowered to the XLA HLO Slice op. I have updated this section with the definition of XLA HLO Slice.
  • For PyTorch/PrimTorch, when Torch IR is lowered to ATen IR/Prim IR, torch.split calls aten.split and no decomposition to PrimTorch ops happens during this stage. Inductor then lowers aten.split to slice in its implementation.
    cc @retonym @chunyuan-w
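For reference, a small PyTorch snippet (illustrative, not from the RFC) showing that an int-size split is equivalent to the slice-based lowering mentioned above:

```python
import torch

x = torch.randn(2, 6)
# torch.split with an int split_size: chunks of size 2 along dim 1.
chunks = torch.split(x, 2, dim=1)
# The slice-based lowering that Inductor effectively performs.
slices = [x[:, i:i + 2] for i in range(0, x.size(1), 2)]
assert all(torch.equal(a, b) for a, b in zip(chunks, slices))
```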
