Feature vm compression #2977

Merged: 66 commits merged into master on May 14, 2024
Conversation

nomennescio
Contributor

Ready to merge

Typical MS braindead tool
For uncompressed images, the compressed data and code size fields are equal to the data and code size fields.
Still needs to copy compressed data into a temporary buffer.
No decompression error handling.
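
For illustration, a minimal C++ sketch of the header convention described above; the struct and field names here are hypothetical, not the actual layout used by the Factor VM:

  #include <cstdint>

  // Hypothetical sketch -- field names are illustrative, not the actual
  // image header layout in the Factor VM.
  struct image_header_sketch {
    uint64_t data_size;            // uncompressed data heap size
    uint64_t code_size;            // uncompressed code heap size
    uint64_t compressed_data_size; // equals data_size when the data heap is stored uncompressed
    uint64_t compressed_code_size; // equals code_size when the code heap is stored uncompressed
  };

  // The loader can detect compression per section without looking at the filename:
  static bool data_is_compressed(const image_header_sketch& h) {
    return h.compressed_data_size != h.data_size;
  }

With such a convention an uncompressed image is detected simply because the stored and uncompressed sizes match.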
@mrjbq7
Member

mrjbq7 commented May 10, 2024 via email

@nomennescio
Contributor Author

Those times don't look right. I think your Factor is loading stuff on startup, maybe your -rc file. Run factor, then 'save', then try the comparison again.

Yes, it does a USE: on an editor, to set that variable. I think that's quite typical use, but nothing else.

But I don't understand why you are focusing so much on 50 ms of decompression time. I don't think it's relevant at all: it's a totally reasonable amount of time to decompress into 130 MB, you cannot avoid it if you want decompression, and it's purely optional to begin with; you still have 0 ms of added load time if you just use uncompressed images. What's your concern here?

But I'll add the timings anyway: freshly built, started, created a boot image, booted, ran save, then USE: binary.image.factor.compressor followed by compress-current-image:

$ time ./factor.com -i=factor.image -e=""

real    0m0.281s
user    0m0.015s
sys     0m0.015s
$ time ./factor.com -i=factor.image.compressed -e=""

real    0m0.330s
user    0m0.000s
sys     0m0.031s

@mrjbq7
Member

mrjbq7 commented May 11, 2024 via email

@nomennescio
Contributor Author

nomennescio commented May 11, 2024

Well, for me, if it was 50 millis, it's a 50% performance reduction. But in your example it's more like 15%, which, if it held true on my computer, is only 20 millis. It matters for some use cases like CGI or short-lived command-line programs, and I'm already frustrated that we take 100 millis to mmap the image file; I want to make it faster, not slower.

I understand your concern, but it's not related to this feature; this is about adding optional compression support, which only has a very limited startup cost, and only if you use a compressed image.

As for using mmap, I doubt that will work for typical code, or that it will work across platforms; once code starts executing, pages will get read from disk, so the additional time is merely deferred. For me, however, that is a different discussion from what this patch tries to achieve. If you want to work on reducing load time, wouldn't it be better to open a dedicated issue for that? We can exchange ideas and work on it there.
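
(For illustration only, a minimal POSIX sketch of why mmap mostly defers the I/O cost rather than removing it; this is not part of the patch:)

  #include <cstddef>
  #include <fcntl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  // Mapping the image makes the initial call cheap, but the first touch of
  // each page later triggers a page fault and a disk read, so the I/O cost
  // is deferred to execution time rather than eliminated.
  void* map_image(const char* path, size_t size) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;
    void* base = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); // the mapping remains valid after the descriptor is closed
    return base == MAP_FAILED ? nullptr : base;
  }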

@nomennescio nomennescio self-assigned this May 11, 2024
@nomennescio
Contributor Author

Some more statistics in #2577

@nomennescio
Contributor Author

I'll also be doing a compression experiment with a load-all image; if we can practically start using that, then run-time loading of vocabs can be avoided entirely.

Got some issues with load-all on Windows, but nevertheless:

-rwxrwxr-x+ 1  129890744 May  8 23:57 factor.image
-rwxrwx---+ 1  571727240 May 10 15:18 factor-all.image
-rwxrwx---+ 1   78380047 May 10 15:28 factor-all.image.compressed

i.e. the compressed image for a load-all is smaller than a normal uncompressed image. Its compression factor is 571727240 / 78380047 ≈ 7.3.

With maximum compression that goes further down to:

-rwxrwx---+ 1   71276368 May 12 08:15 factor-all.image.compressed.22

a compression factor of 571727240 / 71276368 ≈ 8.0.

@nomennescio
Contributor Author

@mrjbq7 do you have any objections if I already put this into the master branch? We can always adapt things later.

@mrjbq7
Member

mrjbq7 commented May 13, 2024

Well, a few things need to be addressed, I guess, if you want to eventually merge it:

  1. Where does zstd.c come from? I don't see it in Facebook's release or source files.
  2. I don't generally like merge commits, and especially not ones with conflicts, because then we have to figure out if the code was appropriately de-conflicted...
  3. I don't really like the binary.image.factor.compressor vocab name, maybe tools.image-compressor or something.
  4. I'm not sure I like this approach, and it adds a lot of overhead -- why aren't we just compressing the whole file instead of sections of it? What benefit does that give us?
  5. For what use cases do we really care about file size anyway, and have you looked to see what the bulk of the image file's size is taken up by? Maybe there are other approaches to reducing file size.
  6. Does the VM look for the compressed file, or does this only work with -i=factor.image.compressed?
  7. Does this work in a deployed image?
  8. It's a lot slower than uncompressed; we need to speed up Factor startup, not slow it down:
$ time ./factor -i=factor.image -e=""  
./factor -i=factor.image -e=""  0.10s user 0.02s system 71% cpu 0.170 total

$ time ./factor -i=factor.image.z -e=""
./factor -i=factor.image.z -e=""  0.18s user 0.04s system 90% cpu 0.235 total
  9. Are there compression formats that would support random access and allow us to uncompress pages as needed, instead of the whole image file? Perhaps using an index generated at compression time.

@nomennescio
Contributor Author

nomennescio commented May 13, 2024

Well, a few things need to be addressed, I guess, if you want to eventually merge it:

  1. Where does zstd.c come from? I don't see it in Facebook's release or source files.

It's in the header of zstd.c: generated with

 *	python combine.py -r ../../lib -x legacy/zstd_legacy.h -o zstd.c zstd-in.c

and 71671c mentions this is zstd release v1.5.6

  2. I don't generally like merge commits, and especially not ones with conflicts, because then we have to figure out if the code was appropriately de-conflicted...

Alright, we can cherry-pick

  3. I don't really like the binary.image.factor.compressor vocab name, maybe tools.image-compressor or something.

Not a problem; my thinking was to have all kinds of binary subvocabs dealing with binary formats, such as binary.elf, binary.coff, binary.pe etc., and add support packages for dealing with such blobs. A future idea might be intelligent parsing of all kinds of binary formats.

  4. I'm not sure I like this approach, and it adds a lot of overhead -- why aren't we just compressing the whole file instead of sections of it? What benefit does that give us?

What overhead are you referring to?

As for sections: the benefits are that there is no preloading stage, that you can selectively compress only code or only data (in case compression doesn't yield a benefit for one of them), and that you could use ZSTD dictionaries tuned separately for code and for data.
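
As an illustration of the dictionary idea (not something this PR implements), a minimal sketch using zstd's dictionary API; the wrapper name and parameters are hypothetical:

  #include <zstd.h>

  // Each section (data heap, code heap) could be compressed with a dictionary
  // trained on that kind of content; error checking omitted for brevity.
  size_t compress_section_with_dict(void* dst, size_t dst_capacity,
                                    const void* src, size_t src_size,
                                    const void* dict, size_t dict_size,
                                    int level) {
    ZSTD_CCtx* cctx = ZSTD_createCCtx();
    size_t n = ZSTD_compress_usingDict(cctx, dst, dst_capacity,
                                       src, src_size, dict, dict_size, level);
    ZSTD_freeCCtx(cctx);
    return n; // check with ZSTD_isError(n) in real code
  }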

  5. For what use cases do we really care about file size anyway, and have you looked to see what the bulk of the image file's size is taken up by? Maybe there are other approaches to reducing file size.

I care. It gives a 7-8x reduction in image size at negligible cost. Other approaches do not conflict with this approach at all.

  6. Does the VM look for the compressed file, or does this only work with -i=factor.image.compressed?

No, the VM transparently handles this, independent of filename.

  7. Does this work in a deployed image?

If it uses the same loader, it should work. If not, it's the same as for a regular image. Such a feature can be added later.

  8. It's a lot slower than uncompressed; we need to speed up Factor startup, not slow it down:
$ time ./factor -i=factor.image -e=""  
./factor -i=factor.image -e=""  0.10s user 0.02s system 71% cpu 0.170 total

$ time ./factor -i=factor.image.z -e=""
./factor -i=factor.image.z -e=""  0.18s user 0.04s system 90% cpu 0.235 total

I disagree. If you care that much about startup speed, you can just keep using an uncompressed image; both are transparently supported at the same time. I'm willing to pay a very small cost of 50 ms of startup time to get a 7-8x size reduction.

  9. Are there compression formats that would support random access and allow us to uncompress pages as needed, instead of the whole image file? Perhaps using an index generated at compression time.

ZSTD is the most efficient and fastest binary compression scheme, and it is therefore used by many bootloaders. I don't think we can improve upon that.

Theoretically the approach you mention is possible, but that can be supported separately. I don't think it will bring much.
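
For reference, a hypothetical sketch of the index-based idea from point 9, compressing independent fixed-size blocks with zstd's one-shot API so that single blocks could later be decompressed on demand; error handling is omitted, and this is not what the PR implements:

  #include <zstd.h>
  #include <algorithm>
  #include <cstdint>
  #include <vector>

  // Compress the image in fixed-size blocks and record where each compressed
  // block starts, so a single block can later be decompressed on demand
  // without touching the rest of the file.
  struct block_index {
    size_t block_size;              // uncompressed bytes per block
    std::vector<size_t> offsets;    // start of each compressed block in the output
    std::vector<size_t> comp_sizes; // compressed size of each block
  };

  std::vector<char> compress_blocks(const char* data, size_t size,
                                    size_t block_size, block_index& idx) {
    idx.block_size = block_size;
    std::vector<char> out;
    for (size_t pos = 0; pos < size; pos += block_size) {
      size_t n = std::min(block_size, size - pos);
      size_t bound = ZSTD_compressBound(n);
      size_t start = out.size();
      out.resize(start + bound);
      // error checks (ZSTD_isError) omitted for brevity
      size_t written = ZSTD_compress(out.data() + start, bound, data + pos, n, 3);
      out.resize(start + written);
      idx.offsets.push_back(start);
      idx.comp_sizes.push_back(written);
    }
    return out;
  }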

@nomennescio
Contributor Author

I think it's important that we're on the same page here. Do you understand that compression/decompression is done transparently in the VM loader, that it is determined by inspecting the header of an image file (not its name), and that you don't get a performance penalty if you keep using uncompressed images? And that by default Factor keeps using uncompressed images, which are not affected at all by this feature?
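
To make that concrete, a minimal sketch of the transparent-loading idea (names are illustrative, not the literal VM code): a section whose stored size equals its uncompressed size takes the old copy path unchanged, otherwise it is decompressed.

  #include <zstd.h>
  #include <cstring>

  // If a section is stored uncompressed, its stored size equals its
  // uncompressed size and it is copied as before; otherwise it is
  // decompressed into the destination.
  bool load_section(char* dst, size_t uncompressed_size,
                    const char* src, size_t stored_size) {
    if (stored_size == uncompressed_size) {
      memcpy(dst, src, stored_size); // old path, no extra cost
      return true;
    }
    size_t n = ZSTD_decompress(dst, uncompressed_size, src, stored_size);
    return !ZSTD_isError(n) && n == uncompressed_size;
  }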

@mrjbq7
Member

mrjbq7 commented May 13, 2024

Yes, I also understand it's 1) a lot slower than uncompressed, 2) not used by anyone by default, 3) not tested with deploy, 4) not necessarily the best approach to solving startup speed, and 5) the solution to #4 might change this a lot.

@nomennescio
Contributor Author

Yes, I also understand it's 1) a lot slower than uncompressed, 2) not used by anyone by default, 3) not tested with deploy, 4) not necessarily the best approach to solving startup speed, and 5) the solution to #4 might change this a lot.

Compression and startup speed are totally unrelated issues. I'm not working on startup speed with this issue; this is about compression. Please make a separate issue out of startup speed. Compression is never going to help you with that, and improving startup speed is never going to help with compression.

I don't get why you keep lumping these together.

As for 2: of course it's not used by anyone until we have such a feature...
As for 3: it should not impact deployment, as normal images are still 100% supported.

@mrjbq7
Member

mrjbq7 commented May 13, 2024

What is the point of implementing this if nothing ever uses it by default, if save doesn't save a compressed image, and if deploy doesn't deploy a compressed image? If this is just a side feature with no tests and only complexity, it will never be used.

The reason I lump this with startup performance is that 1) it makes startup slower, and 2) it's entirely possible that either whole-file compression or block-based compression is faster, and the latter would allow us to lazily uncompress segments, which we don't currently do and which would completely change what you have done.

As an aside, I don't like the malloc in load_data_heap; if the tenured space is not initialized large enough, we should error out. The old code path made sure it was initialized large enough, so why would that not be correct for the compressed path?

If you have RAM to hold a zlib uncompressed file, you certainly have hard drive space to hold an uncompressed image.

@nomennescio
Contributor Author

What is the point of implementing this if nothing ever uses it by default, if save doesn't save a compressed image, and if deploy doesn't deploy a compressed image? If this is just a side feature with no tests and only complexity, it will never be used.

I don't follow you. What do you mean by "if nothing ever uses it by default"? If you want to compress an image, you compress it, and presto, it just works. The reason I introduced the current defaults is to not disturb existing usage. I prefer such an approach: only people who explicitly want it use it, and the others don't notice anything different. Especially since people like you don't want to add 50 ms of startup time to Factor, which is fine. From there we can work on the other scenarios you describe. Consider it an alpha test with features to be added.

However, this feature has been 2 years in the making, and I don't want to postpone it another 2 years to do all kinds of intermediate steps while constantly merging in from master. It's a relatively small change which doesn't harm any future changes.

As for tests, it's not a problem to introduce tests.

The reason I lump this with startup performance is that 1) it makes startup slower, and 2) it's entirely possible that either whole-file compression or block-based compression is faster, and the latter would allow us to lazily uncompress segments, which we don't currently do and which would completely change what you have done.

It wouldn't change a thing about making things slower, as it's optional. If we come up with a better feature in the future, that's fine by me. But currently we have no compression at all, and this feature is ready to be integrated to provide it. If you think you can make compression and decompression faster and better, go ahead, but I won't hold my breath for that to happen. This feature won't block any such improvements at all. Not introducing a feature because we might find a better solution in the future goes against the usual philosophy of first getting things working, then improving upon them.

As an aside, I don't like the malloc in load_data_heap; if the tenured space is not initialized large enough, we should error out. The old code path made sure it was initialized large enough, so why would that not be correct for the compressed path?

That's fine; if you want to error out, I can implement it like that. The extra space is needed to load the compressed image and still have room for the uncompressed image; that's how decompression works. The malloc is only a fallback mechanism in case the image doesn't fit in the preallocated space, but typically it isn't needed.
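
A minimal sketch of the fallback behaviour described above (illustrative names, not the literal load_data_heap code):

  #include <cstddef>
  #include <cstdlib>

  // Prefer decompressing directly into the preallocated tenured space; only
  // fall back to a temporary malloc'd buffer when the uncompressed image
  // would not fit, which typically does not happen.
  char* pick_decompress_target(char* tenured_start, size_t tenured_size,
                               size_t uncompressed_size, bool& used_malloc) {
    if (uncompressed_size <= tenured_size) {
      used_malloc = false;
      return tenured_start; // common case: no temporary buffer, no extra copy
    }
    used_malloc = true;
    return static_cast<char*>(malloc(uncompressed_size)); // freed after copying into the heap
  }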

And b.t.w, the old code path doesn't make sure it's initialized large enough, it just picks fixed sizes in which Factor images currently just happen to fit. And the exact same old code path is still in place for uncompressed images, it's 100% backwards compatible.

If you have RAM to hold a zlib uncompressed file, you certainly have hard drive space to hold an uncompressed image.

That's certainly untrue, because it depends on personal use cases.

I have limited time to work on Factor, so I don't want this to drag on for another 2 years; that's a waste of my time, and I find it very frustrating. It also doesn't exactly encourage me to keep contributing, and I'm afraid that has happened to other contributors in the past, who stopped contributing and left the community. I'm open to accommodating improvements or changes you would like to see for this feature, but I don't think you've given arguments against introducing it other than, basically, that you wouldn't use it. I don't think this feature is going to hurt Factor in any way imaginable, nor does it impact any future work on better compression or better startup times, but it gives a nice feature to people who want to save disk space, of which I'm one.

I'm happy to also discuss things one-on-one to see if we maybe misunderstand each other.

@mrjbq7
Member

mrjbq7 commented May 13, 2024

Because if this is only so that you can load compressed images, it's not going to get merged.

It would get merged if:

  1. compressed images are things people want to use
  2. compressed images are supported by the tools
  3. compressed images are saved/loaded appropriately by the tools
  4. compressed images are natively supported on launch (e.g., looking for factor.image.z and if not found then factor.image), or perhaps that doesn't matter since the file embeds compression and a flag, but then save needs to change to save it compressed optionally
  5. not just tests, but literally if this is a one-user feature then it's not worth it, but if you paint a path of usefulness, sure.

I don't want to slow startup, so I think I'm unlikely to make compressed images a default, and I also don't want to miss an xz moment someday because we now have some complicated compression thing that isn't being used/tested enough.

I suppose you're saying: why not merge it, then we can delete it later if we dislike it. Well, in that context I might be interested, but you've previously wanted to be very careful about what public guarantees we make about things like the VM header flags and backwards compatibility, and that makes me a lot less interested in trying things and then getting stuck supporting them.

@mrjbq7
Member

mrjbq7 commented May 13, 2024

And b.t.w, the old code path doesn't make sure it's initialized large enough, it just picks fixed sizes in which Factor images currently just happen to fit. And the exact same old code path is still in place for uncompressed images, it's 100% backwards compatible.

Isn't h->data_size correct for the uncompressed size? Why wouldn't that buffer be big enough to uncompress into?

@nomennescio
Contributor Author

nomennescio commented May 13, 2024

Because if this is only so that you can load compressed images, it's not going to get merged.

It would get merged if:

  1. compressed images are things people want to use

Do you want to vote on that, or did you have something else in mind? I can't remember us ever doing that for another feature.

  2. compressed images are supported by the tools

Which tools are you missing?

  3. compressed images are saved/loaded appropriately by the tools

Which tools?

  4. compressed images are natively supported on launch (e.g., looking for factor.image.z and if not found then factor.image), or perhaps that doesn't matter since the file embeds compression and a flag, but then save needs to change to save it compressed optionally

This is already implemented. The only thing is that save currently does not call compress-image, but that's trivial to add. Just tell me how you'd like it.

  5. not just tests, but literally if this is a one-user feature then it's not worth it, but if you paint a path of usefulness, sure.

Isn't this the same as 1.?

I don't want to slow startup, so I think I'm unlikely to make compressed images a default, and I also don't want to miss an xz moment someday because we now have some complicated compression thing that isn't being used/tested enough.

I'm happy to thoroughly comment and review the source code and explain every nook and cranny.

I suppose you're saying: why not merge it, then we can delete it later if we dislike it. Well, in that context I might be interested, but you've previously wanted to be very careful about what public guarantees we make about things like the VM header flags and backwards compatibility, and that makes me a lot less interested in trying things and then getting stuck supporting them.

I don't see a risk here; if at some point you want to drop VM support, you just remove it. All that is needed is a utility that can read, uncompress, and re-save existing compressed images as uncompressed images, very similar to binary.image.factor.compressor.

And as we already discussed previously, at some point we're going to introduce the next version of the image header.
Again, I'm happy to thoroughly document and review the current image header choices and their implications. I have been very careful about backward and forward compatibility.

@nomennescio
Contributor Author

nomennescio commented May 13, 2024

And b.t.w, the old code path doesn't make sure it's initialized large enough, it just picks fixed sizes in which Factor images currently just happen to fit. And the exact same old code path is still in place for uncompressed images, it's 100% backwards compatible.

Isn't h->data_size correct for the uncompressed size? Why wouldn't that buffer be big enough to uncompress into?

  p->tenured_size = std::max((h->data_size * 3) / 2, p->tenured_size);

  data_heap *d = new data_heap(&nursery,
                               p->young_size, p->aging_size, p->tenured_size);

This indeed overallocates the data heap.
However, not so for the code heap:

  if (h->code_size > p->code_size)
    fatal_error("Code heap too small to fit image", h->code_size);

@mrjbq7 mrjbq7 merged commit 4b4b0e4 into master May 14, 2024
3 checks passed
@mrjbq7
Member

mrjbq7 commented May 14, 2024

I merged it so you don't have to worry about tracking old branches.

I reserve the right to change it later, or move things around.

@mrjbq7
Member

mrjbq7 commented May 14, 2024

This indeed overallocates the data heap. However, not so for the code heap

I don't understand why the inflated size would ever be larger than h->code_size and h->data_size.

@nomennescio
Contributor Author

I merged it so you don't have to worry about tracking old branches.

I reserve the right to change it later, or move things around.

Thank you. I'm happy to support or discuss any changes you'd like to see.

Successfully merging this pull request may close these issues: Optimizing size of factor.image?