Compressing the discovery cache #2321
PRs are welcome! We should ensure that the discovery artifact can still be retrieved from the compressed file via `googleapiclient/discovery_cache/__init__.py` (lines 55 to 78 in c965b05).
This also significantly impacts container startup time (the download/extract/run cycle). A better option could be to allow bundling only the required JSON schemas, instead of shipping every JSON spec in the container. For example, in our case we only use the Pub/Sub publisher client and the GCS bucket APIs.
How about putting compressed JSON schemas in the distributed package and decompressing and storing them only after first use? That would save space, and would still be efficient when a specific API is needed, since the document would already be decompressed (and thus faster to read) on subsequent uses.
#2315 shaved off a bit more than 20MB of the discovery cache, by removing the json indentation.
Currently there's still 74.6MB remaining, and it appears to be growing steadily over time (from 81MB on 2023-03-31 to 94MB on 2024-01-15).
So while #2315 definitely helped, I believe it's a good idea to consider reducing the size even further. This could significantly improve build/deploy times for the many Docker images that use this library. For reference, the current latest `python-slim` Docker image is less than 50MB.

An easy win would be to use compression.
To illustrate: if I manually zip the entire `documents` directory of 74.6MB in v2.114.0 using the pop-os default gnome archive manager, I end up with an archive of 11.9MB. Creating a `documents.tar.xz` in the same way makes this 4.6MB.

It should be possible to achieve similar levels of compression by using Python's standard compression libraries, e.g. `zlib`, `zipfile`, or `lzma`.
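To make the comparison concrete, here is a small self-contained sketch using the stdlib `zlib` and `lzma` modules on a toy discovery-style document. The document itself is fabricated for illustration; real discovery files are far larger and, being highly repetitive JSON, tend to compress at ratios similar to the archive-manager results above.

```python
import json
import lzma
import zlib

# Toy stand-in for a discovery document: repetitive JSON structure.
doc = json.dumps({
    "kind": "discovery#restDescription",
    "methods": {
        f"method{i}": {"httpMethod": "GET", "path": f"v1/things/{i}"}
        for i in range(500)
    },
}).encode()

deflated = zlib.compress(doc, 9)       # DEFLATE, max compression level
xz = lzma.compress(doc, preset=9)      # XZ, max preset

# Both shrink the payload; both round-trip losslessly.
print(f"raw={len(doc)} zlib={len(deflated)} lzma={len(xz)}")
```

Either library would work; `lzma` usually compresses this kind of data tighter at the cost of slower decompression, which matters if the document is decompressed on every client construction rather than cached after first use.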