Add element to ssshist #267

esoha-nvidia · 2023-01-17T21:21:30Z

sssshist stores a histogram of how often each ssss value is used. There are 16 of them, from 1-16. Though 0 is unused, it's more convenient to index the elements by ssss so, rather than subtract one before indexing, we will make the array larger by one element.

This fixes #266

sssshist stores a histogram of how often each ssss value is used. There are 16 of them, from 1-16. Though 0 is unused, it's more convenient to index the elements by ssss so, rather than subtract one before indexing, we will make the array larger by one element. This fixes ilia3101#266

cgohlke · 2023-01-18T05:09:19Z

I think all references to sssshist (only three lines) can be safely be removed from the source code.

esoha-nvidia · 2023-01-18T22:39:09Z

It was to make a histogram, presumably for the purpose of analyzing statistics afterwards. Those statistics could help build a new Huffman tree for re-encoding data. But it's unused, yeah. As an alternative to deleting, it would also be possible to hide it behind a compile-time macro.

This would potentially make the code run faster. On the other hand, this code already looks quite inefficient so it probably doesn't matter too much! 😆

esoha-nvidia · 2023-01-18T22:41:42Z

MLV-App/src/mlv/liblj92/lj92.c

Line 227 in ef8a955

u16* hufflut = malloc((1<<maxbits) * sizeof(u16));

The lookup table is size 2¹⁸ bytes! This is ridiculous! The cache misses are probably going to wreck the performance!

masc4ii · 2023-01-20T18:29:42Z

Do you see any chance to optimized that code? I also identified this lj92 compression to be slow, while using just one CPU core... For full video processing from RAW to viewer, we loose many frames per second when using lossless decoding compared to uncompressed footage ... and the image processing (same for both) is very complex.

BTW: great, you guys work with this code! Thank you!

esoha-nvidia · 2023-01-20T20:59:55Z

@masc4ii It can be optimized but the code is written quite poorly so I'd be wary of even trying it. The variable names are obtuse and everything is in c and not c++. The code is just archaic so I'd be too afraid to fix it. It might be better to somewhat start over.

There is also the problem that my ideas for performance improvement are belonging to my employer so I cannot share them.

Are you able to run a performance test and tell me how many bytes-per-second output you have for lj92 as compared to, say, gunzip? I would expect that lj92 files should decompress about 4 times faster than gzip files. Certainly not slower because both are Huffman entropy encoding but gzip is much more complex than lj92.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add element to ssshist #267

Add element to ssshist #267

esoha-nvidia commented Jan 17, 2023

cgohlke commented Jan 18, 2023

esoha-nvidia commented Jan 18, 2023

esoha-nvidia commented Jan 18, 2023

masc4ii commented Jan 20, 2023 •

edited

esoha-nvidia commented Jan 20, 2023

Add element to ssshist #267

Are you sure you want to change the base?

Add element to ssshist #267

Conversation

esoha-nvidia commented Jan 17, 2023

cgohlke commented Jan 18, 2023

esoha-nvidia commented Jan 18, 2023

esoha-nvidia commented Jan 18, 2023

masc4ii commented Jan 20, 2023 • edited

esoha-nvidia commented Jan 20, 2023

masc4ii commented Jan 20, 2023 •

edited