Assertion failure: assert next_code_length >= code_length when code_length overflows #440

Open
arai-a opened this issue Sep 10, 2019 · 6 comments

Comments

@arai-a
Collaborator

arai-a commented Sep 10, 2019

Steps to reproduce:

  1. Run the following Python script and redirect the output to test.js:

    # Emit the statement "i;" int(1.7**i) times for each i in 1..22.
    for i in range(1, 23):
        for j in range(int(1.7**i)):
            print('{};'.format(i))

     (The result is a 1.1 MB JS file.)

  2. Encode the JS file generated in step 1 with fbssdc.

Actual result:

The encoder crashes at this assertion:

assert next_code_length >= code_length, f'{k}, {next_code_length}, {code_length}'

AssertionError: (2097151, 21), 20, 21
@Yoric
Collaborator

Yoric commented Sep 10, 2019

Cc @dominiccooney

So, we have too many numbers in the table of numbers?

@arai-a
Collaborator Author

arai-a commented Sep 10, 2019

It tries to assign 21-bit codes to the symbols 1 and 2.
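A minimal sketch of why these counts push the code length past 20, assuming the table holds only the 22 numbers from the reproduction script and that lengths come from a plain Huffman construction. `huffman_code_lengths` is a hypothetical helper written for this illustration, not fbssdc's code. Because each count is about 1.7 times the previous one, the tree degenerates into a chain and the two rarest symbols end up 21 levels deep.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length} for a classic Huffman tree over freqs."""
    tiebreak = count()  # unique counter so equal weights never compare the lists
    heap = [(weight, next(tiebreak), [sym]) for sym, weight in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), syms1 + syms2))
    return lengths


# Same counts as the reproduction script: the number i appears int(1.7**i) times.
freqs = {i: int(1.7 ** i) for i in range(1, 23)}
lengths = huffman_code_lengths(freqs)

print(max(lengths.values()))       # 21, one more than the 20-bit limit
print(lengths[1], lengths[2])      # 21 21 (the two rarest symbols)
print(lengths[22])                 # 1 (the most frequent symbol)
```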

We'll need to define a fallback algorithm for this kind of case.
(IMO this doesn't happen with typical files in the wild, so simply falling back to assigning the same code length to every item should be fine.)
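The fallback suggested here could look roughly like the sketch below. It assumes code lengths are held in a dict keyed by symbol. `fallback_to_fixed_length` and `MAX_CODE_LENGTH` are hypothetical names for this illustration, and nothing in it is taken from fbssdc or the spec.

```python
MAX_CODE_LENGTH = 20  # the limit discussed in this thread


def fallback_to_fixed_length(lengths, max_len=MAX_CODE_LENGTH):
    """Keep lengths that fit the limit; otherwise give every symbol the same length."""
    if not lengths or max(lengths.values()) <= max_len:
        return dict(lengths)
    # Smallest bit width that can number all symbols: ceil(log2(n)), at least 1.
    fixed = max(1, (len(lengths) - 1).bit_length())
    if fixed > max_len:
        raise ValueError("too many symbols for the code length limit")
    return {sym: fixed for sym in lengths}


# Lengths of the degenerate tree from this issue: symbol 22 gets 1 bit,
# symbol 3 gets 20 bits, and symbols 1 and 2 get 21 bits.
too_long = {i: min(23 - i, 21) for i in range(1, 23)}
fixed = fallback_to_fixed_length(too_long)
print(set(fixed.values()))  # {5}, since 22 symbols fit in 5 bits
```

The result always satisfies the Kraft inequality, so a valid prefix code exists, but frequent symbols lose their short codes.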

@dominiccooney
Member

dominiccooney commented Sep 12, 2019 via email

@Yoric
Collaborator

Yoric commented Sep 13, 2019

> That algorithm is a non-normative illustration of assigning codes, don't take it as spec. We should add a note about the code length issue.

Ah, good to know. We're using it as spec all over SpiderMonkey for the time being :)

@dominiccooney
Member

dominiccooney commented Sep 13, 2019 via email

@Yoric
Collaborator

Yoric commented Sep 13, 2019

Oh, got it, 20 is part of the spec, but the algorithm isn't.

So, given that there are real-world files where 20 bits isn't sufficient for an optimal code, do we wish to keep 20 as a hard limit? Or do we accept suboptimal codes for huge files (and possibly issue a warning telling developers that their file is simply too large)?
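To get a feel for how suboptimal a hard limit would be, here is a rough sketch for the counts from the reproduction script. It assumes the optimal lengths are those of the degenerate Huffman chain (symbol 22 at 1 bit, symbols 1 and 2 at 21 bits). `cap_code_lengths` and `total_bits` are hypothetical helpers, and the capping rule is a simple heuristic rather than an optimal length-limited code or anything the spec prescribes.

```python
def cap_code_lengths(lengths, freqs, max_len=20):
    """Heuristic: clamp lengths to max_len, then restore the Kraft inequality."""
    capped = {sym: min(length, max_len) for sym, length in lengths.items()}
    budget = 1 << max_len  # Kraft sum scaled by 2**max_len must stay <= budget
    if len(capped) > budget:
        raise ValueError("too many symbols for the code length limit")
    kraft = sum(1 << (max_len - length) for length in capped.values())
    while kraft > budget:
        # Lengthen the deepest code still below the cap, preferring rare symbols.
        sym = max(
            (s for s in capped if capped[s] < max_len),
            key=lambda s: (capped[s], -freqs[s]),
        )
        kraft -= 1 << (max_len - capped[sym] - 1)
        capped[sym] += 1
    return capped


def total_bits(lengths, freqs):
    """Size of the encoded symbol stream, in bits."""
    return sum(freqs[sym] * lengths[sym] for sym in freqs)


freqs = {i: int(1.7 ** i) for i in range(1, 23)}
optimal = {i: min(23 - i, 21) for i in range(1, 23)}  # degenerate Huffman chain
capped = cap_code_lengths(optimal, freqs)

print(max(capped.values()))                                    # 20
print(total_bits(capped, freqs) - total_bits(optimal, freqs))  # 5
```

For this input the capped code costs only 5 extra bits over the whole file, far less than switching every symbol to a fixed 5-bit code would. Other inputs may behave differently.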
