[DRAFT][one-optimize] Optimize part of the transformer's attention-head #12918

BalyshevArtem · 2024-04-24T15:13:42Z

This draft introduces two new passes to optimize part of the transformer's attention-head: FuseMulWithFullyConnected and FuseStridedSlicesNegAsMul.
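For readers unfamiliar with this kind of fusion, here is a minimal sketch of the algebra that makes folding a Mul into a FullyConnected possible. It assumes the Mul multiplies the FullyConnected output by a constant per-output-channel vector; the helper names `fully_connected` and `fuse_mul_into_fc` are hypothetical and are not taken from the pass in this PR.

```python
def fully_connected(x, weights, bias):
    """Compute y[o] = sum_i weights[o][i] * x[i] + bias[o]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def fuse_mul_into_fc(weights, bias, scale):
    """Fold a per-output-channel constant multiplier into FC weights and bias.

    Assumption: the Mul follows the FC and its other operand is constant.
    """
    fused_w = [[s * w for w in row] for row, s in zip(weights, scale)]
    fused_b = [s * b for b, s in zip(bias, scale)]
    return fused_w, fused_b


x = [1.0, 2.0, 3.0]
weights = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
bias = [0.5, -1.0]
scale = [2.0, 4.0]

# Original graph: FullyConnected followed by an element-wise Mul.
unfused = [s * y for s, y in zip(scale, fully_connected(x, weights, bias))]

# Fused graph: a single FullyConnected with rescaled weights and bias.
fused_w, fused_b = fuse_mul_into_fc(weights, bias, scale)
fused = fully_connected(x, fused_w, fused_b)
```

Both graphs give the same result because `s * (W x + b) = (s W) x + s b` when `s` scales each output row, which is why the Mul node can be removed once the constants are rewritten.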

for issue: #12917

ONE-DCO-1.0-Signed-off-by: Artem Balyshev <a.balyshev@samsung.com>

seanshpark · 2024-04-24T22:41:51Z

compiler/luci/pass/src/FuseMulWithFullyConnectedPass.cpp:77:21: error: unused variable ‘fc_input’ [-Werror=unused-variable]

seanshpark · 2024-04-24T22:44:38Z

some comments;

please add some test models in res/TensorFlowLiteRecipes
and add to circle2circle-dredd-recipe-test
also to luci-pass-value-py-test

BalyshevArtem · 2024-05-02T13:02:01Z

> please add some test models in res/TensorFlowLiteRecipes
> and add to circle2circle-dredd-recipe-test
> also to luci-pass-value-py-test

Done

BalyshevArtem · 2024-05-07T11:24:27Z

@seanshpark, can I split this draft into PRs?

seanshpark · 2024-05-07T21:31:02Z

> can I split this draft into PRs?

Yes, please do :)
Actually, I don't do detailed reviews of draft PRs.

jinevening · 2024-05-08T02:38:29Z

res/TensorFlowLiteRecipes/Net_StridedSlices_Neg_000/test.recipe

+  input: "output_1"
+  input: "output_neg"


In the target model, the inputs are swapped. PTAL #12917 (comment)
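To illustrate why the operand order matters here, the sketch below assumes the two tensors in the recipe (`output_1` and `output_neg`) feed a concatenation, and that they come from two StridedSlices that split the last axis into halves, with the second half negated. Those assumptions, and the helper names `slices_neg_concat` and `as_mul`, are hypothetical and not taken from the PR.

```python
def slices_neg_concat(x, swap=False):
    """Slice x into halves, negate the second half, then concatenate.

    swap=False gives concat(first, -second), the order shown in the recipe.
    swap=True gives concat(-second, first), i.e. the inputs swapped.
    """
    half = len(x) // 2
    first = x[:half]
    neg_second = [-v for v in x[half:]]
    return neg_second + first if swap else first + neg_second


def as_mul(x):
    """Element-wise Mul by a constant mask of +1 (first half) and -1 (second half)."""
    half = len(x) // 2
    mask = [1.0] * half + [-1.0] * (len(x) - half)
    return [v * m for v, m in zip(x, mask)]


x = [1.0, 2.0, 3.0, 4.0]

recipe_order = slices_neg_concat(x)              # [1.0, 2.0, -3.0, -4.0]
swapped_order = slices_neg_concat(x, swap=True)  # [-3.0, -4.0, 1.0, 2.0]
mul_result = as_mul(x)                           # [1.0, 2.0, -3.0, -4.0]
```

Under these assumptions only the recipe order matches a single Mul by a constant mask; with the inputs swapped, the halves also change position, so a test recipe has to mirror the target model's order to exercise the same pattern.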

BalyshevArtem added the DRAFT label Apr 24, 2024

BalyshevArtem marked this pull request as draft April 24, 2024 15:56

[DRAFT][one-optimize] Optimize part of the transformer's attention-head

8974450

BalyshevArtem force-pushed the opt_trans_1 branch from 75b73de to 8974450 Compare April 24, 2024 15:58

add recipes and tests

fd5b254

BalyshevArtem force-pushed the opt_trans_1 branch from faf419e to fd5b254 Compare April 25, 2024 11:28

Artem Balyshev added 3 commits May 2, 2024 12:28

fix name

a342946

fix fuse fc mul pass

8821352

fix assert

7af6c41

jinevening reviewed May 8, 2024
