Too big for AWS Lambda layer #9

Open
rmontroy opened this issue Aug 19, 2021 · 6 comments
Labels: help wanted, workaround-exists

Comments

@rmontroy

Is there any way to trim down the dependencies? The total uncompressed size of a deployment package in AWS Lambda can't be more than 250MB. The total size of tiffslide plus dependencies is 195MB, which means I'm over the limit when I add other things (e.g. s3fs).

@ap--
Collaborator

ap-- commented Aug 19, 2021

hmmm, good question...

...
888K	./urllib3
1,1M	./pkg_resources
1,2M	./tifffile
1,4M	./yarl
1,4M	./zarr
1,6M	./chardet
2,8M	./PIL
3,8M	./setuptools
7,5M	./aiohttp
7,7M	./Pillow.libs
11M	./pip
26M	./numcodecs
30M	./numpy
33M	./numpy.libs
38M	./imagecodecs.libs
60M	./botocore
66M	./imagecodecs
295M	.

It seems like imagecodecs is the biggest offender at ~104MB in total (66MB for imagecodecs plus 38MB for imagecodecs.libs). And when you install s3fs, botocore adds another 60MB.
I'll have to think about it a bit. Basically all these dependencies are pulled in via tifffile.
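
For anyone who wants to reproduce a listing like the one above from Python rather than with `du`, here is a minimal sketch. The helper names (`dir_size_bytes`, `package_sizes`, `headroom_mb`) are made up for illustration, and the 250MB default is simply the limit quoted at the top of this issue. It sums file sizes rather than disk blocks, so the numbers will differ slightly from `du`.

```python
import os
from pathlib import Path
from typing import List, Tuple


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files at or below `path`."""
    if path.is_file():
        return path.stat().st_size
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            # Skip symlinks so linked files are not counted twice.
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def package_sizes(site_packages: Path) -> List[Tuple[str, int]]:
    """Return (entry name, size in bytes) per top-level entry, smallest first."""
    sizes = [(entry.name, dir_size_bytes(entry)) for entry in site_packages.iterdir()]
    return sorted(sizes, key=lambda item: item[1])


def headroom_mb(total_bytes: int, limit_mb: int = 250) -> float:
    """Space left under the limit in MB; a negative value means over the limit."""
    return limit_mb - total_bytes / (1024 * 1024)
```

Calling `package_sizes(Path(sysconfig.get_paths()["purelib"]))` inside the target virtual environment should give a smallest-to-largest list comparable to the one above, and `headroom_mb` shows how much room is left for extras such as s3fs.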

@ap--
Collaborator

ap-- commented Aug 22, 2021

Hi @rmontroy

So the way to go seems to be to install only what you require to decode the svs images you want to process.
imagecodecs can be built with a subset of supported formats by skipping everything else:

For example on Ubuntu 20.04:

### Disabled by default, so no skip option is needed for these
# --global-option="--skip-avif" \
# --global-option="--skip-brunsli" \
# --global-option="--skip-jpegls" \
# --global-option="--skip-jpegxl" \
# --global-option="--skip-lerc" \
# --global-option="--skip-lz4f" \
# --global-option="--skip-zfp" \
# --global-option="--skip-zlibng" \

### Do not skip these: they are required for the tiffslide tests to pass
# --global-option="--skip-shared" \
# --global-option="--skip-imcd" \
# --global-option="--skip-jpeg8" \

python -m pip install imagecodecs \
  --global-option="build_ext" \
  --global-option="--skip-aec" \
  --global-option="--skip-bitshuffle" \
  --global-option="--skip-blosc" \
  --global-option="--skip-brotli" \
  --global-option="--skip-bz2" \
  --global-option="--skip-deflate" \
  --global-option="--skip-gif" \
  --global-option="--skip-jpeg2k" \
  --global-option="--skip-jpegxr" \
  --global-option="--skip-lz4" \
  --global-option="--skip-lzf" \
  --global-option="--skip-lzma" \
  --global-option="--skip-pglz" \
  --global-option="--skip-png" \
  --global-option="--skip-rcomp" \
  --global-option="--skip-snappy" \
  --global-option="--skip-tiff" \
  --global-option="--skip-webp" \
  --global-option="--skip-zlib" \
  --global-option="--skip-zopfli" \
  --global-option="--skip-zstd"

The above command installs a version of imagecodecs that only has the formats required to make the tiffslide tests pass. When you test with one of your own svs files, an error may be raised. In that case, rebuild imagecodecs and leave out the corresponding skip option to prevent that error for your file (e.g. maybe jpeg2k or so...).
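
If you end up rebuilding a few times with different codecs left in, generating the command may be less error-prone than editing it by hand. Here is a minimal sketch: the codec names are copied from the command above, while `skip_list` and `pip_install_args` are hypothetical helper names, not part of imagecodecs or pip.

```python
import shlex
from typing import Iterable, List

# Codecs skipped by the command in this thread.
SKIPPED_IN_THREAD = [
    "aec", "bitshuffle", "blosc", "brotli", "bz2", "deflate", "gif",
    "jpeg2k", "jpegxr", "lz4", "lzf", "lzma", "pglz", "png", "rcomp",
    "snappy", "tiff", "webp", "zlib", "zopfli", "zstd",
]


def skip_list(keep: Iterable[str] = ()) -> List[str]:
    """Return the codecs to skip, leaving out the ones you want to keep."""
    keep_set = set(keep)
    unknown = keep_set - set(SKIPPED_IN_THREAD)
    if unknown:
        raise ValueError(f"not in the skip list: {sorted(unknown)}")
    return [name for name in SKIPPED_IN_THREAD if name not in keep_set]


def pip_install_args(skip: Iterable[str]) -> List[str]:
    """Build the argument list for the pip command shown in this thread."""
    args = ["python", "-m", "pip", "install", "imagecodecs",
            "--global-option=build_ext"]
    args += [f"--global-option=--skip-{name}" for name in skip]
    return args


if __name__ == "__main__":
    # Example: keep jpeg2k because a slide failed to decode without it.
    print(shlex.join(pip_install_args(skip_list(keep=["jpeg2k"]))))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run` directly.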

Here is a link to the relevant instructions in imagecodecs: https://github.com/cgohlke/imagecodecs/blob/e92cef6c1878f0b69ebbfb33f9bb809eccbdc31e/imagecodecs/imagecodecs.py#L160-L181 You might have to install the system build dependencies if the build fails for you.

I hope that helps. Let me know how things go.

Cheers,
Andreas 😃

PS: It could be that we would need to build a manylinux wheel to make it work, since we're linking against OS libraries.
PPS: Another option could be to manually remove the unneeded *.so files in the site-packages/imagecodecs folder before you create your Lambda layer.

@cgohlke

cgohlke commented Aug 22, 2021

Another option: unpack the imagecodecs wheel, remove unneeded *.so files, repack the wheel, and install it. See https://wheel.readthedocs.io/en/stable/reference/wheel_pack.html#examples
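
A minimal sketch of the "remove unneeded *.so files" step, for either an unpacked wheel or an installed site-packages/imagecodecs folder. It assumes the extension modules are named like `_<codec>.<platform tag>.so` (for example `_jpeg8.cpython-39-x86_64-linux-gnu.so`), so check the actual file names in your build first. `prune_extensions` is a hypothetical helper, and the default keep set is just the three options listed above as required for the tiffslide tests.

```python
from pathlib import Path
from typing import Iterable, List


def prune_extensions(
    package_dir: Path,
    keep: Iterable[str] = ("shared", "imcd", "jpeg8"),
    dry_run: bool = True,
) -> List[Path]:
    """Delete `_<codec>.*.so` files whose codec name is not in `keep`.

    Returns the files that were (or, with dry_run=True, would be) removed.
    """
    keep_set = set(keep)
    removed = []
    for so_file in sorted(package_dir.glob("_*.so")):
        # "_jpeg8.cpython-39-x86_64-linux-gnu.so" -> "jpeg8"
        codec = so_file.name.split(".", 1)[0][1:]
        if codec not in keep_set:
            removed.append(so_file)
            if not dry_run:
                so_file.unlink()
    return removed
```

Run it with `dry_run=True` first and inspect the returned list before deleting anything. The sketch does not touch the bundled libraries in imagecodecs.libs, so it only recovers part of the ~104MB.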

@rmontroy
Author

rmontroy commented Sep 2, 2021

@ap-- Any interest in creating a public AWS Lambda layer that's as small as possible? I don't have the time to look into it now, but I'd use it if it already existed, provided it performed well enough.

@ap--
Collaborator

ap-- commented Sep 15, 2021

I'll have a look. I might have some time next week to make the layer or at least an easy way to create a minimal venv.

ap-- self-assigned this Feb 9, 2022
ap-- added the help wanted and workaround-exists labels Jul 7, 2022

@swamidass

FYI, there is a straightforward solution, using the serverless framework and this plugin:

https://www.serverless.com/plugins/serverless-python-requirements

Key configuration is to add:

custom:
  pythonRequirements:
    zip: true

And to include this in your handler:

try:
  import unzip_requirements
except ImportError:
  pass

This zips all the requirements (including tiffslide) so that the deployment package falls below the cutoff by a wide margin. Unzipping adds a second or two to cold starts, but there is no cost for warm starts.
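
To make the cold-start versus warm-start behaviour concrete, here is a rough sketch of what an unzip-on-import shim of this kind does. It is not the plugin's actual code: `unpack_requirements` and the paths in the comments are assumptions for illustration. The zipped dependencies are extracted once into a writable scratch directory and that directory is put on `sys.path`.

```python
import shutil
import sys
import zipfile
from pathlib import Path


def unpack_requirements(zip_path: Path, target_dir: Path) -> Path:
    """Extract zipped dependencies once, then make them importable."""
    if not target_dir.exists():
        # Cold start: extract into a staging folder and rename it afterwards,
        # so a half-finished extraction is never mistaken for a complete one.
        staging = target_dir.with_name(target_dir.name + ".partial")
        shutil.rmtree(staging, ignore_errors=True)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(staging)
        staging.rename(target_dir)
    # Warm start: the folder already exists, so only sys.path is updated.
    if str(target_dir) not in sys.path:
        sys.path.insert(0, str(target_dir))
    return target_dir


# Hypothetical usage at the top of a handler module:
# unpack_requirements(Path("requirements.zip"), Path("/tmp/requirements"))
```

The extraction is the extra second or two on a cold start. On warm starts the directory is already there, so only the `sys.path` check runs.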

4 participants