Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: StringT: re-support other string encodings #62

Merged
merged 1 commit into from Apr 8, 2024

Conversation

srl295
Copy link
Contributor

@srl295 srl295 commented Apr 6, 2024

History: #59 enabled support for 2-byte utf encodings, which were previously broken. So #59+#62 presents no regression for any other encodings.

@srl295
Copy link
Contributor Author

srl295 commented Apr 6, 2024

I'm still investigating what is happening here.

src/String.js Show resolved Hide resolved
src/String.js Outdated
@@ -95,9 +95,12 @@ function encodingWidth(encoding) {
switch(encoding) {
case 'ascii':
case 'utf8': // utf8 is a byte-based encoding for zero-term string
case 'x-mac-roman':
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fontkit uses a few others as well. Need fallback handling? Or is it an API usage issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this issue can cause an infinite loop, I would expect either:

  • a fallback
  • an explicit exception if the value is not handled and will possibly cause an infinite loop

Copy link
Contributor Author

@srl295 srl295 Apr 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's an infinite loop here. Perhaps the calling code or its callers have a loop?

There's an explicit exception on line 109

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@renchap for now, I have it assume 1-byte, which matches previous behavior

- add a test for 'x-mac-roman'
- add two utf-16 alias names
- if an encoding is otherwise unknown, assume 1-byte length
(this matches prior behavior)

Fixes: foliojs#60
@srl295 srl295 marked this pull request as ready for review April 6, 2024 22:49
@srl295 srl295 changed the title fix: work in progress on encodings fix: re-support other string encodings Apr 6, 2024
@srl295 srl295 changed the title fix: re-support other string encodings fix: StringT: re-support other string encodings Apr 6, 2024
@srl295 srl295 requested a review from renchap April 6, 2024 22:51
@srl295
Copy link
Contributor Author

srl295 commented Apr 6, 2024

FYI @blikblum @devongovett @mcdurdin

@blikblum
Copy link
Member

blikblum commented Apr 7, 2024

This seems to fix the fontkit issue. The ideal would be a reproduction for the react-pdf issue, but i think is good to go

Copy link
Contributor

@mcdurdin mcdurdin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this issue can cause an infinite loop, I would expect either:

  • a fallback
  • an explicit exception if the value is not handled and will possibly cause an infinite loop

I did a careful analysis of the while loop. I could not find any possibility of an infinite loop. It was, in my understanding, the exception in encodingLength() that was causing the hang, my guess because it was being handled in another layer and causing the stream reader to never emit end-of-stream -- which I think could be considered a separate bug in that layer. However I have not yet had time to analyse at that level. I will have a go when I am online again (travelling all this week with limited online availability).

The changes look good to me, thank you @srl295 for jumping on this while I was incommunicado. Apologies to all for the regression.

Comment on lines +108 to +110
//TODO: assume all other encodings are 1-byters
//throw new Error('Unknown encoding ' + encoding);
return 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if the consumer could be warned of the unknown encoding, but not sure that there is facility for that.

Note that the pattern of throwing on an unknown encoding is a pattern copied from existing code in this same module, specifically in StringT.byteLength():

restructure/src/String.js

Lines 150 to 151 in fcf7d64

default:
throw new Error('Unknown encoding ' + encoding);

So is there a latent issue in byteLength() that may bite future users? I would assume that it does not need to be addressed as urgently as this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ the throw may have only been on encode() or bytelength() before, which is a more restricted set of encodings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x-mac-roman is actually a known encoding, as well, it's handled by the underlying encoder. So one thing that could be done is to expand the set of encodings here to be exhaustive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK this PR brings back the previous behavior which is good enough. Lets do the minimal changes, unless there's a reproduction with an actual bug

src/String.js Show resolved Hide resolved
@blikblum blikblum merged commit 17e062c into foliojs:master Apr 8, 2024
3 checks passed
@srl295 srl295 deleted the encfix branch April 8, 2024 13:01
@srl295
Copy link
Contributor Author

srl295 commented Apr 8, 2024

@blikblum thanks. and I assume this will get a patch release also? By the way 3.0.1 is absent from https://github.com/foliojs/restructure/releases

@blikblum
Copy link
Member

blikblum commented Apr 8, 2024

I have write access to github but don't have npm publish permission. @devongovett can publish a new version

@IvanUkhov
Copy link

Thank you for fixing! Looking forward to a release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants