Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Context 0.1: Huffman tables #454

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

Conversation

Yoric
Copy link
Collaborator

@Yoric Yoric commented Sep 17, 2019

This is a first step towards decompressing Huffman content.

This table provides best performance but may only be used reasonably for small bit
lengths due to its memory cost. We'll implement a more space-efficient (but not
as fast) MultiLookupHuffmanTable in a followup patch.
@Yoric Yoric requested a review from xtuc September 17, 2019 08:40
@Yoric Yoric added this to In Progress in Port Context 0.1 to Rust Sep 17, 2019
@xtuc
Copy link
Member

xtuc commented Sep 17, 2019

I think it LGTM. I'm wondering if you couldn't use an existing implementation on crates.io? Looking at the results https://crates.io/search?q=huffman, they doesn't seem commonly used tho.

@Yoric
Copy link
Collaborator Author

Yoric commented Sep 17, 2019

I've looked at a few crates and they all seemed to use actual trees to represent Huffman codes, which makes them very slow for our use case.

@dominiccooney
Copy link
Member

Happy to review... could you look at the failing test first?

@Yoric
Copy link
Collaborator Author

Yoric commented Sep 25, 2019

I've reworked the MultiLookupHuffmanTable to remove some assumptions on the prefix bit length.

Copy link
Member

@dominiccooney dominiccooney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some feedback.

crates/binjs_io/src/context/huffman/mod.rs Show resolved Hide resolved
///
/// If `bit_len` is larger than the number of bits, the prefix is padded with
/// lower-weight bits into `bit_len` bits.
pub fn split_bits(&self, bit_len: BitLen) -> (u32, u32) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure of the naming of split_bits versus split... Does this need both these APIs? Can the names be made more distinct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed it into split_raw_bits, is that better?

crates/binjs_io/src/context/huffman/mod.rs Show resolved Hide resolved
crates/binjs_io/src/context/huffman/mod.rs Outdated Show resolved Hide resolved
crates/binjs_io/src/context/huffman/mod.rs Outdated Show resolved Hide resolved
crates/binjs_io/src/context/huffman/mod.rs Outdated Show resolved Hide resolved
crates/binjs_io/src/context/huffman/mod.rs Outdated Show resolved Hide resolved
crates/binjs_io/src/context/huffman/mod.rs Show resolved Hide resolved
crates/binjs_io/src/context/huffman/mod.rs Outdated Show resolved Hide resolved
crates/binjs_io/src/context/huffman/mod.rs Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants