How to minimize heap/stack memory usage of multiple contexts on mobile platforms? #461

NoSW opened this issue Apr 11, 2024 · 3 comments

NoSW commented Apr 11, 2024

I'm using astc-encoder to compress volumetric lightmap data on mobile platforms. I have selected two block sizes, 5x5 and 5x5x5, based on certain criteria.

Decompression requires two different contexts to exist simultaneously, or even four if both HDR and LDR are considered. That adds up to 30 MB to 40 MB of memory, which is hard to accept on mobile devices, especially on iOS.

I have noticed the ASTCENC_BLOCK_MAX_TEXELS build option, but even capped at 5x5x5 = 125 texels the footprint is still too large.

Q1: Is it possible to merge contexts with different block sizes and HDR/LDR settings into one big context?

Q2: Starting from #246, are there any opportunities to further reduce memory overhead?

(I have noticed that WEIGHTS_MAX_DECIMATION_MODES, WEIGHTS_MAX_BLOCK_MODES, and BLOCK_MAX_WEIGHTS are still compile-time constants in v4.7.0.)

Q3: Can ASTCENC_BLOCK_MAX_TEXELS be changed to a (C++) template parameter?

solidpixel commented Apr 11, 2024

Why do you need to decompress? Surely the point is to select formats the GPU can access natively in hardware.

Q1: Is it possible to merge contexts with different block sizes and HDR/LDR settings into one big context?

HDR is a superset of LDR, so you can use an HDR context to decompress LDR images already.

Merging block sizes won't help - you just make the context linearly bigger.
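
A rough sketch of the single-HDR-context approach, assuming the astcenc 4.x C API (the make_decoder helper is just for illustration; error handling omitted):

```cpp
#include "astcenc.h"

// Sketch: one decompress-only HDR context per block size. An HDR
// profile context can decode LDR blocks too, so separate LDR
// contexts are unnecessary.
static astcenc_context* make_decoder(unsigned int bx, unsigned int by, unsigned int bz)
{
	astcenc_config config {};
	// The quality preset only affects compression; this context is
	// decompress-only, so any valid preset is fine here.
	astcenc_config_init(ASTCENC_PRF_HDR, bx, by, bz, ASTCENC_PRE_MEDIUM,
	                    ASTCENC_FLG_DECOMPRESS_ONLY, &config);

	astcenc_context* ctx = nullptr;
	astcenc_context_alloc(&config, 1, &ctx);  // single decode thread
	return ctx;
}

// Two contexts instead of four: 5x5 (2D) and 5x5x5 (3D).
astcenc_context* ctx_5x5   = make_decoder(5, 5, 1);
astcenc_context* ctx_5x5x5 = make_decoder(5, 5, 5);
```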

Q2: Starting from #246, are there any opportunities to further reduce memory overhead?

Probably. PRs welcome ...

Q3: Can ASTCENC_BLOCK_MAX_TEXELS be changed to a (C++) template parameter?

No - it gets used by the preprocessor.

NoSW commented Apr 11, 2024

Why do you need to decompress?

Not all GPUs implementing ASTC support the HDR profile.

Do you mean I should choose a format the GPU supports natively rather than a specific ASTC block size?

Since the data is a Texture3DArray whose slices are 5x5x5 volumes, there is no spatial continuity between slices, so some formats are not a good fit, such as the fixed 4x4 block size of BCn/ETC2. Therefore I have chosen ASTC 5x5 and 5x5x5 with a CPU decompressor.
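
For concreteness, a minimal sketch of that CPU decode path for a single 5x5x5 block, assuming a decompress-only context as above and FP16 output (the decode_block_5x5x5 helper is illustrative; the API calls are astcenc 4.x):

```cpp
#include "astcenc.h"

#include <cstdint>

// Sketch: decode one 16-byte ASTC block covering a 5x5x5 volume.
// Output is FP16 RGBA; the image is described as five 5x5 z slices.
static void decode_block_5x5x5(astcenc_context* ctx, const uint8_t block[16],
                               uint16_t out_rgba_f16[5 * 5 * 5 * 4])
{
	// One data pointer per z slice, as astcenc_image expects.
	void* slices[5];
	for (int z = 0; z < 5; z++)
	{
		slices[z] = out_rgba_f16 + z * 5 * 5 * 4;
	}

	astcenc_image image {};
	image.dim_x = 5;
	image.dim_y = 5;
	image.dim_z = 5;
	image.data_type = ASTCENC_TYPE_F16;
	image.data = slices;

	const astcenc_swizzle swizzle { ASTCENC_SWZ_R, ASTCENC_SWZ_G,
	                                ASTCENC_SWZ_B, ASTCENC_SWZ_A };

	astcenc_decompress_image(ctx, block, 16, &image, &swizzle, 0);
	// Call astcenc_decompress_reset(ctx) before reusing the context
	// for the next image.
}
```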

HDR is a superset of LDR, so you can use an HDR context to decompress LDR images already.

👍

No - it gets used by the preprocessor.

Templates could help create a 5x5 context with a smaller memory footprint even in a build where ASTCENC_BLOCK_MAX_TEXELS is set to 5x5x5 = 125. However, that does not fit the API style of this library :(

solidpixel commented

Templates could help create a 5x5 context with a smaller memory footprint

Using templated structures would change the size of the structs used in the context, so you'd need to build N templated versions of the codec, one per block size, and the code size would jump for the N combinations.
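
A hypothetical illustration of why (none of the names below are astcenc's real internals): once the texel count is a template parameter, every table sized by it changes size per instantiation, so everything touching those tables must be instantiated once per block size.

```cpp
#include <cstdint>

// Hypothetical sketch only - not astcenc's actual data structures.
template <unsigned int MaxTexels>
struct decimation_table_t
{
	// Arrays sized by the template parameter: sizeof() differs
	// between the <25> and <125> instantiations.
	uint8_t texel_weight_count[MaxTexels];
	uint8_t texel_weights[MaxTexels][4];
};

// Every function touching the tables must be templated too ...
template <unsigned int MaxTexels>
void decode_block(const decimation_table_t<MaxTexels>& table, const uint8_t* data)
{
	(void)table;
	(void)data;  // decode body elided in this sketch
}

// ... so the binary ends up with one full copy of the codec per block size.
template void decode_block<25>(const decimation_table_t<25>&, const uint8_t*);
template void decode_block<125>(const decimation_table_t<125>&, const uint8_t*);
```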

The bulk of the memory comes from the decimation tables, followed by the partition tables. If you know which decimation modes and partitionings your textures actually use, the fastest way to reduce the memory footprint is to skip creating the entries you don't need at context creation time.
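
There is no public hook for this today, so the sketch below is purely illustrative - all names are invented, and the filtering would have to be patched into the internal table setup that runs during astcenc_context_alloc():

```cpp
// Hypothetical sketch only - every name here is invented; astcenc does
// not expose a filtering hook like this, so it would have to be patched
// into the internal table setup run during astcenc_context_alloc().

// What the caller knows about its own textures, e.g. from inspecting
// the block modes the compressor actually emitted.
struct table_filter
{
	unsigned int max_weight_x;    // largest weight-grid width used
	unsigned int max_weight_y;    // largest weight-grid height used
	unsigned int max_weight_z;    // largest weight-grid depth used
	unsigned int max_partitions;  // e.g. 2 if 3/4-partition blocks never occur
};

// Applied while the decimation tables are being built: every rejected
// mode skips allocating its per-texel weight tables.
inline bool keep_decimation_mode(const table_filter& f,
                                 unsigned int wx, unsigned int wy, unsigned int wz)
{
	return wx <= f.max_weight_x && wy <= f.max_weight_y && wz <= f.max_weight_z;
}

// Applied while the partition tables are being built: skipping the
// 3- and 4-partition tables removes the largest remaining allocations.
inline bool keep_partitioning(const table_filter& f, unsigned int partition_count)
{
	return partition_count <= f.max_partitions;
}
```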
