ACL operators need to be made stateless to avoid runtime initialization overhead #1085

Open
snadampal opened this issue Dec 19, 2023 · 2 comments

Comments

@snadampal

Output of 'strings libarm_compute.so | grep arm_compute_version':
arm_compute_version=v23.11 Build options: {'Werror': '0', 'debug': '0', 'neon': '1', 'opencl': '0', 'embed_kernels': '0', 'os': 'linux', 'arch': 'armv8a', 'build': 'native', 'multi_isa': '1', 'fixed_format_kernels': '1', 'openmp': '1', 'cppthreads': '0'} Git hash=b'add70ace1e57f65d1ae4d0cedaec6e4578cf87ff'

Platform:
AWS c7g.16xl

Operating System:
Ubuntu 22.04

Problem description:
One important optimization for inference performance is cutting down kernel initialization overhead. This can be achieved by caching an operator after its first initialization and reusing it for tensors of the same shape. Today it is not possible to cache ACL operators because they hold workspace state alongside their initialization state, and that workspace is specific to a single gemm operation.
The requirement is to make the operators stateless so that they are initialized once and reused across multiple gemm operations of the same shape.
More details are in this oneDNN discussion: oneapi-src/oneDNN#1455 (comment)
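
To illustrate the request, here is a minimal sketch of what "stateless" could look like from the caller's side. It does not use the ACL API: `GemmShape`, `StatelessGemm` and `GemmCache` are hypothetical names made up for this example, and the naive loop only stands in for a real kernel. The point is that the operator keeps only shape-derived configuration, the workspace is passed in on every `run()` call, and so one initialized operator can be cached and reused for every gemm of that shape.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <tuple>
#include <vector>

// Hypothetical types for illustration only; none of this is the ACL API.
struct GemmShape {
    std::size_t m, n, k;
    bool operator<(const GemmShape &o) const {
        return std::tie(m, n, k) < std::tie(o.m, o.n, o.k);
    }
};

// Stateless operator: holds only immutable, shape-derived configuration.
class StatelessGemm {
public:
    explicit StatelessGemm(const GemmShape &s) : shape_(s) {}

    // Number of scratch floats the caller must supply to each run() call.
    std::size_t workspace_size() const { return shape_.m * shape_.n; }

    // run() is const: it never mutates the operator, so a single instance
    // can serve many gemm calls of the same shape.
    void run(const float *a, const float *b, float *c,
             std::vector<float> &workspace) const {
        assert(workspace.size() >= workspace_size());
        // Naive row-major gemm standing in for a real kernel; the result is
        // accumulated in the caller-owned workspace, then copied out.
        for (std::size_t i = 0; i < shape_.m; ++i) {
            for (std::size_t j = 0; j < shape_.n; ++j) {
                float acc = 0.0f;
                for (std::size_t p = 0; p < shape_.k; ++p) {
                    acc += a[i * shape_.k + p] * b[p * shape_.n + j];
                }
                workspace[i * shape_.n + j] = acc;
            }
        }
        for (std::size_t i = 0; i < shape_.m * shape_.n; ++i) {
            c[i] = workspace[i];
        }
    }

private:
    const GemmShape shape_;
};

// Shape-keyed cache: initialization happens once per distinct shape.
class GemmCache {
public:
    std::shared_ptr<const StatelessGemm> get(const GemmShape &s) {
        auto it = cache_.find(s);
        if (it == cache_.end()) {
            ++inits_;  // counts how often initialization is actually paid
            it = cache_.emplace(s, std::make_shared<const StatelessGemm>(s)).first;
        }
        return it->second;
    }
    std::size_t init_count() const { return inits_; }

private:
    std::map<GemmShape, std::shared_ptr<const StatelessGemm>> cache_;
    std::size_t inits_ = 0;
};
```

A caller would ask the cache for an operator with `cache.get(shape)`, allocate a workspace of `workspace_size()` floats per call (or per thread), and pass it to `run()`. The cache itself is not thread-safe in this sketch; a real integration would need a lock or a per-thread cache around `get()`.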

@morgolock

Hi @snadampal

Thanks for raising this. We will discuss the feature request with the team.

@milpuz01

milpuz01 commented Feb 5, 2024

We are about to start exploratory work for 24.05 to integrate two oneDNN primitives, convolution and matrix multiplication, with the existing non-public ACL API for stateless objects, in order to better understand what requirements (if any) this places on the ACL side. Once that work is done, we plan to address any required changes in ACL for the 24.08 release and port the rest of the oneDNN primitives. We expect the work to be done by 24.11.

As most of the initial work will be oneDNN specific, we will link the oneDNN PRs to this issue to track progress.


3 participants