The mat_mul_kernel_s16 function needs __PKHTB & __PKHBT to re-order the value #44

Open
CristXu opened this issue Mar 8, 2023 · 1 comment
Labels
improvement Performance or general improvement

Comments


CristXu commented Mar 8, 2023

Hi,
I noticed the following line:

ip_a0 = read_and_pad(ip_a0, &a01, &a02);

read_and_pad expands the weights from q7_t to q15_t, and it also uses a pair of __PKHxx instructions (__PKHTB and __PKHBT) to reorder the values from (a0, a2, a1, a3) to (a0, a1, a2, a3).
My question is why these two __PKHxx operations are needed. I think we could keep the (a0, a2, a1, a3) order as long as the input is processed in the same way (the 1x1 conv2d does something similar, without __PKHxx). That would save two instructions per call and reduce inference time.
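
The arithmetic behind this can be checked with a small portable sketch. It does not use the real CMSIS-NN code or intrinsics: smlad_model and dot4 are hypothetical helpers that only model a dual 16-bit multiply-accumulate, and the test values are made up. It assumes the input can be read in the same interleaved order as the weights, which is the condition stated above.

```c
#include <stdint.h>

/* Hypothetical model of one dual multiply-accumulate step:
 * acc + x[0]*y[0] + x[1]*y[1] on two 16-bit lanes. */
static int32_t smlad_model(const int16_t x[2], const int16_t y[2], int32_t acc)
{
    return acc + (int32_t)x[0] * y[0] + (int32_t)x[1] * y[1];
}

/* Dot product of four weight/input pairs in two dual-MAC steps.
 * idx selects the lane order; weights and input use the SAME order. */
static int32_t dot4(const int8_t w[4], const int16_t in[4], const int idx[4])
{
    /* Widen the 8-bit weights to 16 bits, as the q7 to q15 expansion does. */
    const int16_t w_lo[2] = {w[idx[0]], w[idx[1]]};
    const int16_t w_hi[2] = {w[idx[2]], w[idx[3]]};
    const int16_t i_lo[2] = {in[idx[0]], in[idx[1]]};
    const int16_t i_hi[2] = {in[idx[2]], in[idx[3]]};

    int32_t acc = smlad_model(w_lo, i_lo, 0);
    return smlad_model(w_hi, i_hi, acc);
}
```

Calling dot4 with the index order {0, 1, 2, 3} and with {0, 2, 1, 3} gives the same accumulator, because the sum of products does not depend on the order of its terms. Whether the input side can be read in the interleaved order without extra cost in mat_mul_kernel_s16 is a separate question that this sketch does not answer.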

Regards,
Crist

mansnils (Contributor) commented Mar 8, 2023

Hi @CristXu,
Thanks for your comments!
You are right that we do additional reordering in some places.
We are looking into this now to see where we can get rid of PKHTB/PKHBT.
Thanks,
Måns

@felix-johnny felix-johnny added the improvement Performance or general improvement label Feb 27, 2024