Feature vm compression (#2977)
* zstd v1.5.2 single file

* Put zstd lib in zstd namespace. Factor compiles and runs.

* Better dependencies

* Put lib zstd into namespace lib::zstd

* zstd v1.5.5 single file

* Add retry with downloading boot image from master when not found on branch

Similar to what build.sh does.

* Set previously failed

* CMD misery

* Annoying workaround

* Triggering actor already shown by Github

* Fix CMD issues

* Add zstd to Nmakefile

* Compile as .cpp before trying to compile as .c

* Make dependencies explicit

* Fix

* Fix

* Fix

Typical MS braindead tool

* Image format version 4 header support for compressed image

For uncompressed images, compressed data- and code size are equal to data- and code size fields.

* Basic support for loading compressed images

Still needs to copy compressed data into temporary buffer.
No decompression error handling.

* Order of rules didn't solve selection of zstd.c over zstd.cpp

* Compress Factor images suitable for the compressed image loader.

* Stack comments

* Supporting documentation

* Reformat

* Ignore variants of image files

* Reformat

* Documentation for binary.image.factor.compressor

* Feature vm compression support (#2589)

* zstd v1.5.2 single file

* Put zstd lib in zstd namespace. Factor compiles and runs.

* Better dependencies

* Put lib zstd into namespace lib::zstd

* Header v4 with compression support

* Load compressed data and code images. No uncompression yet.

* Uncompress here

* Refactor

* As we're statically linking, enable statically linked API

* Decompress into separate buffer

* ui.gadgets.gadgets: prevent busy loop with ``f focusable-child``

* Revert "ui.gadgets.gadgets: prevent busy loop with ``f focusable-child``"

This reverts commit c81cf49.

* ui.gadgets: change focusable-child* to not return f

* Remove executable bit

* reservoir-sampling: cleanup example in docs

* Config.macosx: leopard is 10.5, xcode 5 was released in 2013...remove

* GNUmakefile: prefer clang if it exists

* words: add unintern-word definition

* python.syntax: Link help article

* ui.gadgets: simplify focusable-child

* zeromq: better zeromq-error

* gpu.shaders: better errors

* words: better error class for undefined-word

* mason.docs: build without xattrs

* mason.release.archive: build tar without xattrs

* webapps.mason.docs-update: use simpler words

* codebase-analyzer: handle owners and version files

* deques: allow the generics to inline in the simple words

* Align with main repo

* Align with main repo

* zstd v1.5.5 single file

* Resolve conflict

* Resolve conflict

#define ZSTD_STATIC_LINKING_ONLY

---------

Co-authored-by: John Benediktsson <mrjbq7@gmail.com>
Co-authored-by: Doug Coleman <doug.coleman@gmail.com>
Co-authored-by: Capital <CapitalEx@protonmail.com>
Co-authored-by: Giftpflanze <gifti@tools.wmflabs.org>

* Add documentation on classes

* Sync and refactor

* Enable zstd advanced experimental functions

* Add error handling

* zstd v1.5.6

generated with
cd zstd/build/single_file_libs/
python combine.py -r ../../lib -o zstd.c zstd-in.c

* C linkage reduces executable size

* Link Time Optimization and stripping in non-debug code leads to a 66% reduction in executable size

* Fixes unused variable warning

* Improve LTO compilation speed on multicore machines

* Uncompress in existing heaps if possible

* Put it back in

* Fix dependency resolution

* Updated copyrights

* Supported option by both gcc and clang

* Fix paths

* Fix for clang++ needing flag to disable warning

* Better alternative exists until newer clang++ version is used

This reverts commit 7a97772.

* Ignore clang++ specific warning

* Guard compiler specific pragmas against MSVC

* compressor: make ZSTD compression level variable

* compressor: refactor with compressed filename argument

* compressor: refactor

* compressor: rename and hide implementation detail

* compressor: updated help documentation

* compressor: fix help typo

---------

Co-authored-by: John Benediktsson <mrjbq7@gmail.com>
Co-authored-by: Doug Coleman <doug.coleman@gmail.com>
Co-authored-by: Capital <CapitalEx@protonmail.com>
Co-authored-by: Giftpflanze <gifti@tools.wmflabs.org>
5 people committed May 14, 2024
1 parent 4b1204b commit 4b4b0e4
Showing 16 changed files with 55,744 additions and 17 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -9,6 +9,8 @@
*.exp
*.gch*
*.image
+factor.image.*
+boot.*.image.*
*.lib
*.o
*.obj
9 changes: 7 additions & 2 deletions GNUmakefile
@@ -100,7 +100,8 @@ ifdef CONFIG
$(BUILD_DIR)/tuples.o \
$(BUILD_DIR)/utilities.o \
$(BUILD_DIR)/vm.o \
-$(BUILD_DIR)/words.o
+$(BUILD_DIR)/words.o \
+$(BUILD_DIR)/zstd.o

MASTER_HEADERS = $(PLAF_MASTER_HEADERS) \
vm/assert.hpp \
@@ -154,7 +155,8 @@ ifdef CONFIG
vm/inline_cache.hpp \
vm/mvm.hpp \
vm/factor.hpp \
-vm/utilities.hpp
+vm/utilities.hpp \
+vm/zstd.hpp vm/zstd.h

EXE_OBJS = $(PLAF_EXE_OBJS)

@@ -280,6 +282,9 @@ $(BUILD_DIR)/ffi_test.o: vm/ffi_test.c | $(BUILD_DIR)
$(BUILD_DIR)/master.hpp.gch: vm/master.hpp $(MASTER_HEADERS) | $(BUILD_DIR)
$(TOOLCHAIN_PREFIX)$(CXX) -c -x c++-header $(CFLAGS) $(CXXFLAGS) -o $@ $<

+$(BUILD_DIR)/zstd.o: vm/zstd.cpp vm/zstd.c $(BUILD_DIR)/master.hpp.gch | $(BUILD_DIR)
+$(TOOLCHAIN_PREFIX)$(CXX) -c $(CFLAGS) $(CXXFLAGS) -o $@ $<

$(BUILD_DIR)/%.o: vm/%.cpp $(BUILD_DIR)/master.hpp.gch | $(BUILD_DIR)
$(TOOLCHAIN_PREFIX)$(CXX) -c $(CFLAGS) $(CXXFLAGS) -o $@ $<

6 changes: 5 additions & 1 deletion Nmakefile
@@ -123,7 +123,11 @@ DLL_OBJS = $(PLAF_DLL_OBJS) \
vm\tuples.obj \
vm\utilities.obj \
vm\vm.obj \
-vm\words.obj
+vm\words.obj \
+vm\zstd.obj

+vm\zstd.obj: vm\zstd.cpp vm\zstd.hpp vm\zstd.c vm\zstd.h vm\master.hpp
+cl /EHsc $(CL_FLAGS) /MP /Fovm/ /c vm\zstd.cpp

# batch mode has ::
.cpp.obj::
1 change: 1 addition & 0 deletions basis/binary/image/factor/compressor/authors.txt
@@ -0,0 +1 @@
nomennescio
96 changes: 96 additions & 0 deletions basis/binary/image/factor/compressor/compressor-docs.factor
@@ -0,0 +1,96 @@
! Copyright (C) 2022-2024 nomennescio
! See https://factorcode.org/license.txt for BSD license.
USING: byte-arrays help.markup help.syntax strings ;
FROM: tools.image-analyzer.vm => image-header ;
IN: binary.image.factor.compressor

ARTICLE: "binary.image.factor.compressor" "Compress Factor image file for loading by the VM"
"The " { $vocab-link "binary.image.factor.compressor" } " vocabulary compresses Factor images such that the VM can load it and decompress it on the fly. Compressed and uncompressed Factor images are both supported by the VM and are only determined by their image headers." $nl
"You can also run the compressor on the current Factor image directly from the commandline:" { $code "factor -run=binary.image.factor.compressor" } ;

HELP: image
{ $class-description "In-memory Factor image" } ;

HELP: image-header
{ $class-description "Factor image header structure" } ;

HELP: >compression-header
{ $values
{ "headerv4" image-header }
{ "headerv4+" image-header }
}
{ $description "Converts any header into a compression supporting header" }
;

HELP: compression-level
{ $var-description "Compression parameter : 1 (least) .. 22 (most). Default value 12." } ;

HELP: compress
{ $values
{ "byte-array" byte-array }
{ "compressed" byte-array }
}
{ $description "Compresses bytes" }
;

HELP: compress-code
{ $values
{ "image" image }
{ "image'" image }
}
{ $description "Compresses code heap" }
;

HELP: compress-data
{ $values
{ "image" image }
{ "image'" image }
}
{ $description "Compresses data heap" }
;

HELP: compress-image
{ $values
{ "image" image }
{ "image'" image }
}
{ $description "Compresses data- and code heaps and syncs header" }
;

HELP: load-factor-image
{ $values
{ "filename" string }
{ "image" image }
}
{ $description "Load Factor image into memory" }
;

HELP: save-factor-image
{ $values
{ "image" image }
{ "filename" string }
}
{ $description "Save Factor image from memory" }
;

HELP: compress-factor-image
{ $values
{ "image-file" string }
{ "compressed-file" string }
}
{ $description "Load, compresses and saves a Factor image" }
;

HELP: sync-header
{ $values
{ "image" image }
{ "image'" image }
}
{ $description "Sync header from actual data and code sizes" }
;

HELP: compress-current-image
{ $description "Load, compresses and saves current Factor image with \".compressed\" appended to its filename" }
;

ABOUT: "binary.image.factor.compressor"
59 changes: 59 additions & 0 deletions basis/binary/image/factor/compressor/compressor.factor
@@ -0,0 +1,59 @@
! Copyright (C) 2022-2024 nomennescio
! See https://factorcode.org/license.txt for BSD license.
! can be run as : factor -run=binary.image.factor.compressor

USING: accessors byte-arrays classes.struct compression.zstd io
io.encodings.binary io.files kernel kernel.private locals math
namespaces sequences system tools.image-analyzer
tools.image-analyzer.vm vm ;
IN: binary.image.factor.compressor

TUPLE: image
{ header image-header }
{ data byte-array }
{ code byte-array } ;

! converts to compression compatible header if needed
: >compression-header ( headerv4 -- headerv4+ )
dup data-size>> zero?
[ dup data-size>> [ >>escaped-data-size ] [ >>compressed-data-size ] 2bi
code-size>> >>compressed-code-size 0 >>data-size
] unless
;

: sync-header ( image -- image' )
dup data>> length over header>> compressed-data-size<<
dup code>> length over header>> compressed-code-size<<
;

! load factor image
: load-factor-image ( filename -- image )
binary [
image-header read-struct >compression-header dup
[ compressed-data-size>> read ]
[ compressed-code-size>> read ] bi
] with-file-reader image boa
;

! save factor image
: save-factor-image ( image filename -- )
binary [
[ header>> ] [ data>> ] [ code>> ] tri [ write ] tri@
] with-file-writer
;

SYMBOL: compression-level
12 compression-level set-global ! level 12 seems the right balance between compression factor and compression speed

: compress ( byte-array -- compressed ) compression-level get zstd-compress-level ;
: compress-data ( image -- image' ) dup header>> [ escaped-data-size>> ] [ compressed-data-size>> ] bi = [ dup data>> compress >>data ] when ; ! only compress uncompressed data
: compress-code ( image -- image' ) dup header>> [ code-size>> ] [ compressed-code-size>> ] bi = [ dup code>> compress >>code ] when ; ! only compress uncompressed code
: compress-image ( image -- image' ) compress-data compress-code sync-header ;

! compress factor image
: compress-factor-image ( image-file compressed-file -- )
[ load-factor-image compress-image ] dip save-factor-image
;

: compress-current-image ( -- ) image-path dup ".compressed" append compress-factor-image ;
MAIN: compress-current-image
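
The header handling in `compressor.factor` can be hard to follow in stack notation. As a rough C++ sketch (not code from this commit; the `header` and `image` structs are hypothetical, trimmed-down stand-ins for `image-header` and the `image` tuple), `>compression-header` appears to move a nonzero `data-size` into the escaped field and seed both compressed sizes with the uncompressed sizes, while `sync-header` later overwrites them with the actual buffer lengths:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using cell = std::uintptr_t;

// Hypothetical, reduced header: only the fields the compressor touches.
struct header {
  cell data_size = 0;             // nonzero marks a legacy (non-escaped) header
  cell code_size = 0;
  cell escaped_data_size = 0;     // real data heap size when data_size == 0
  cell compressed_data_size = 0;
  cell compressed_code_size = 0;
};

// Hypothetical in-memory image, mirroring the Factor `image` tuple.
struct image {
  header h;
  std::vector<std::uint8_t> data;
  std::vector<std::uint8_t> code;
};

// Mirrors `>compression-header`: a header with data_size == 0 is left alone.
inline header to_compression_header(header h) {
  if (h.data_size != 0) {
    h.escaped_data_size = h.data_size;
    h.compressed_data_size = h.data_size;
    h.compressed_code_size = h.code_size;
    h.data_size = 0;
  }
  return h;
}

// Mirrors `sync-header`: record the sizes of the buffers as they are now.
inline image sync_header(image img) {
  img.h.compressed_data_size = img.data.size();
  img.h.compressed_code_size = img.code.size();
  return img;
}

// Mirrors the guards in `compress-data` and `compress-code`: a heap counts as
// uncompressed while its stored size equals its uncompressed size.
inline bool data_is_uncompressed(const header& h) {
  return h.escaped_data_size == h.compressed_data_size;
}

inline bool code_is_uncompressed(const header& h) {
  return h.code_size == h.compressed_code_size;
}
```

Under these assumptions, running the compressor twice on the same image is harmless: after the first pass the stored and uncompressed sizes differ, so both guards are false and the heaps are not compressed again (unless a heap happened to compress to exactly its original size).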
1 change: 1 addition & 0 deletions basis/binary/image/factor/compressor/summary.txt
@@ -0,0 +1 @@
compress Factor image file for loading by the VM
4 changes: 4 additions & 0 deletions basis/binary/image/factor/compressor/tags.txt
@@ -0,0 +1,4 @@
tools
compression
image
vm
4 changes: 2 additions & 2 deletions extra/tools/image-analyzer/vm/vm.factor
@@ -10,8 +10,8 @@ STRUCT: image-header
{ code-relocation-base cell_t }
{ code-size cell_t }
{ escaped-data-size cell_t }
-{ reserved-2 cell_t }
-{ reserved-3 cell_t }
+{ compressed-data-size cell_t initial: 0 }
+{ compressed-code-size cell_t initial: 0 }
{ reserved-4 cell_t }
{ special-objects cell_t[special-object-count] } ;

54 changes: 46 additions & 8 deletions vm/image.cpp
@@ -103,15 +103,32 @@ void factor_vm::load_data_heap(FILE* file, image_header* h, vm_parameters* p) {
data_heap *d = new data_heap(&nursery,
p->young_size, p->aging_size, p->tenured_size);
set_data_heap(d);

+auto uncompress = h->data_size != h->compressed_data_size;
+auto uncompressed_data_size = uncompress ? align_page (h->data_size) : 0;
+auto temp = uncompress && uncompressed_data_size+h->compressed_data_size > data->tenured->size;
+auto buf = temp ? malloc (h->compressed_data_size) : (char*)data->tenured->start+uncompressed_data_size;
+if (!buf) fatal_error ("Out of memory in load_data_heap", 0);

fixnum bytes_read =
-raw_fread((void*)data->tenured->start, 1, h->data_size, file);
+raw_fread(buf, 1, h->compressed_data_size, file);

-if ((cell)bytes_read != h->data_size) {
+if ((cell)bytes_read != h->compressed_data_size) {
std::cout << "truncated image: " << bytes_read << " bytes read, ";
-std::cout << h->data_size << " bytes expected\n";
-fatal_error("load_data_heap failed", 0);
+std::cout << h->compressed_data_size << " bytes expected\n";
+fatal_error ("load_data_heap failed", 0);
}

+if (uncompress) {
+size_t result = lib::zstd::ZSTD_decompress ((void*)data->tenured->start, h->data_size, buf, h->compressed_data_size);
+if (lib::zstd::ZSTD_isError (result)) {
+std::cout << "data heap decompression: " << lib::zstd::ZSTD_getErrorName (result) << '\n';
+fatal_error ("load_data_heap failed", 0);
+}
+}

+if (temp) free (buf);

data->tenured->initial_free_list(h->data_size);
}

@@ -122,13 +139,29 @@ void factor_vm::load_code_heap(FILE* file, image_header* h, vm_parameters* p) {
code = new code_heap(p->code_size);

if (h->code_size != 0) {
+auto uncompress = h->code_size != h->compressed_code_size;
+auto uncompressed_code_size = uncompress ? align_page (h->code_size) : 0;
+auto temp = uncompress && uncompressed_code_size+h->compressed_code_size > code->allocator->size;
+auto buf = temp ? malloc (h->compressed_code_size) : (char*)code->allocator->start+uncompressed_code_size;
+if (!buf) fatal_error ("Out of memory in load_code_heap", 0);

size_t bytes_read =
-raw_fread((void*)code->allocator->start, 1, h->code_size, file);
-if (bytes_read != h->code_size) {
+raw_fread(buf, 1, h->compressed_code_size, file);
+if (bytes_read != h->compressed_code_size) {
std::cout << "truncated image: " << bytes_read << " bytes read, ";
-std::cout << h->code_size << " bytes expected\n";
+std::cout << h->compressed_code_size << " bytes expected\n";
fatal_error("load_code_heap failed", 0);
}

+if (uncompress) {
+size_t result = lib::zstd::ZSTD_decompress ((void*)code->allocator->start, h->code_size, buf, h->compressed_code_size);
+if (lib::zstd::ZSTD_isError (result)) {
+std::cout << "code heap decompression: " << lib::zstd::ZSTD_getErrorName (result) << '\n';
+fatal_error ("load_code_heap failed", 0);
+}
+}

+if (temp) free (buf);
}

code->allocator->initial_free_list(h->code_size);
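
Both loaders pick the read buffer the same way: uncompressed data is read straight to the start of the heap; compressed data is staged inside the heap, past the page-aligned end of the region it will decompress into, when there is room; otherwise a temporary `malloc` buffer is used. The following is a minimal sketch of that decision only, not the VM code. `plan_read`, `read_plan`, `align_up`, and the fixed 4096-byte page size are hypothetical names and values chosen for illustration, and `align_page` is assumed to round up to a page boundary.

```cpp
#include <cassert>
#include <cstddef>

// Assumed page size for illustration; the VM's align_page uses the real one.
constexpr std::size_t kPageSize = 4096;

// Round n up to the next multiple of alignment.
inline std::size_t align_up(std::size_t n, std::size_t alignment) {
  assert(alignment != 0);
  return (n + alignment - 1) / alignment * alignment;
}

struct read_plan {
  bool decompress;       // stored size differs from the uncompressed size
  bool use_temp_buffer;  // heap cannot hold both copies, stage via malloc
  std::size_t offset;    // heap offset to read into when no temp buffer is used
};

// size: uncompressed heap contents; compressed_size: bytes stored in the file;
// heap_size: capacity of the destination heap.
inline read_plan plan_read(std::size_t size, std::size_t compressed_size,
                           std::size_t heap_size) {
  read_plan plan;
  plan.decompress = size != compressed_size;
  const std::size_t aligned = plan.decompress ? align_up(size, kPageSize) : 0;
  plan.use_temp_buffer =
      plan.decompress && aligned + compressed_size > heap_size;
  plan.offset = plan.use_temp_buffer ? 0 : aligned;
  return plan;
}
```

Staging inside the heap avoids an extra allocation in the common case where the heap is much larger than the image, which seems to be the point of the "Uncompress in existing heaps if possible" step in the commit message.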
@@ -255,7 +288,12 @@ void factor_vm::load_image(vm_parameters* p) {
if (h.version != image_version)
fatal_error("Bad image: version number check failed", h.version);

-if (!h.version4_escape) h.data_size=h.escaped_data_size, h.escaped_data_size=0;
+if (!h.version4_escape) {
+h.data_size = h.escaped_data_size;
+} else {
+h.compressed_data_size = h.data_size;
+h.compressed_code_size = h.code_size;
+}

load_data_heap(file, &h, p);
load_code_heap(file, &h, p);
6 changes: 2 additions & 4 deletions vm/image.hpp
@@ -21,12 +21,10 @@ struct image_header {
cell code_relocation_base;
// size of code heap
cell code_size;

union { cell reserved_1; cell escaped_data_size; }; // undefined if data_size <>0, stores size of data heap otherwise
-cell reserved_2; // undefined if data_size <>0, 0 otherwise
-cell reserved_3; // undefined if data_size <>0, 0 otherwise
+union { cell reserved_2; cell compressed_data_size; }; // undefined if data_size <>0, compressed data heap size if smaller than data heap size
+union { cell reserved_3; cell compressed_code_size; }; // undefined if data_size <>0, compressed code heap size if smaller than code heap size
cell reserved_4; // undefined if data_size <>0, 0 otherwise

// Initial user environment
cell special_objects[special_object_count];
};
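
Wrapping the reserved cells in anonymous unions gives them new names without moving anything, so the on-disk header layout should stay the same. This can be checked at compile time with a small sketch; `tail_before` and `tail_after` are hypothetical excerpts covering only the fields shown in this hunk, not the full `image_header`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using cell = std::uintptr_t;

// Hypothetical excerpt of the header before this commit.
struct tail_before {
  cell code_size;
  union { cell reserved_1; cell escaped_data_size; };
  cell reserved_2;
  cell reserved_3;
  cell reserved_4;
};

// The same excerpt after this commit.
struct tail_after {
  cell code_size;
  union { cell reserved_1; cell escaped_data_size; };
  union { cell reserved_2; cell compressed_data_size; };
  union { cell reserved_3; cell compressed_code_size; };
  cell reserved_4;
};

static_assert(sizeof(tail_before) == sizeof(tail_after),
              "renaming reserved cells must not change the header size");
static_assert(offsetof(tail_after, compressed_data_size) ==
                  offsetof(tail_before, reserved_2),
              "compressed_data_size must occupy the old reserved_2 slot");
static_assert(offsetof(tail_after, compressed_code_size) ==
                  offsetof(tail_before, reserved_3),
              "compressed_code_size must occupy the old reserved_3 slot");
```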
1 change: 1 addition & 0 deletions vm/master.hpp
@@ -95,6 +95,7 @@
namespace factor { struct factor_vm; }

// Factor headers
+#include "zstd.hpp"
#include "assert.hpp"
#include "debug.hpp"
#include "layouts.hpp"