Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix FrozenErrors #168

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

bforma
Copy link

@bforma bforma commented Nov 26, 2019

.dup before .force_encoding

Calling .force_encoding on a frozen String raises a 'FrozenError:
can't modify frozen String'.

This can be mitigated by making a copy before calling
.force_encoding on that String.

Calling `.force_encoding` on a frozen String raises a 'FrozenError:
can't modify frozen String'.

This can be mitigated by calling making a copy before calling
`.force_encoding` on that String.
@jethrodaniel
Copy link

👍 We might as well add frozen_string_literal: false to all the *.rb files now, unless there are objections

@bforma
Copy link
Author

bforma commented Nov 26, 2019

If you mean # frozen_string_literal: true, then yes.

@boazsegev
Copy link
Owner

Hi @bforma and @jethrodaniel ,

Thank you for your input and for the PR.

I love the discussion and agree that using frozen_string_literal: false could work as an interim solution while code is being sanitized against frozen literals.

However, I'm super busy until December 25th and I'm not sure when I could test this approach and merge the changes.

I'll keep observing for now and hope for beautiful things :-)

Kindly,
Bo.

@@ -8,7 +8,7 @@ module CombinePDF
def load(file_name = '', options = {})
raise TypeError, "couldn't parse data, expecting type String" unless file_name.is_a?(String) || file_name.is_a?(Pathname)
return PDF.new if file_name == ''
PDF.new(PDFParser.new(IO.read(file_name, mode: 'rb').force_encoding(Encoding::ASCII_8BIT), options))
PDF.new(PDFParser.new(IO.read(file_name, mode: 'rb').dup.force_encoding(Encoding::ASCII_8BIT), options))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dup is redundant and will use up a bunch of memory. The dynamic String returned fromIO.read (should probably be IO.binread) is mutable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@@ -76,7 +76,7 @@ def decrypt

def set_general_key(password = '')
# 1) make sure the initial key is 32 byte long (if no password, uses padding).
key = (password.bytes[0..32].to_a + @padding_key)[0..31].to_a.pack('C*').force_encoding(Encoding::ASCII_8BIT)
key = (password.bytes[0..32].to_a + @padding_key)[0..31].to_a.pack('C*').dup.force_encoding(Encoding::ASCII_8BIT)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dup is redundant and will use up a bunch of memory. The dynamic String returned from Array#pack is mutable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@@ -41,7 +41,7 @@ class PDFParser
# string:: the data to be parsed, as a String object.
def initialize(string, options = {})
raise TypeError, "couldn't parse data, expecting type String" unless string.is_a? String
@string_to_parse = string.force_encoding(Encoding::ASCII_8BIT)
@string_to_parse = string.dup.force_encoding(Encoding::ASCII_8BIT)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... why? Are we expecting frozen Strings?

In general I understand that this might be a correct approach that will minimize unexpected side-effects. However, on the other hand, this will use a lot of memory when reading larger PDF files.

# str = "0#{str}" if str.length.odd?
out << unify_string([str].pack('H*').force_encoding(Encoding::ASCII_8BIT))
out << unify_string([str].pack('H*').dup.force_encoding(Encoding::ASCII_8BIT))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(again) This dup is redundant and will use up a bunch of memory. The dynamic String returned from Array#pack is mutable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@@ -243,22 +243,22 @@ def _parse_
##########################################
elsif str = @scanner.scan(/\<[0-9a-fA-F]*\>/)
# warn "Found a hex string"
str = str.slice(1..-2).force_encoding(Encoding::ASCII_8BIT)
str = str.slice(1..-2).dup.force_encoding(Encoding::ASCII_8BIT)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dup is redundant and will use up a bunch of memory. The dynamic String returned from String#slice should be mutable.

out << "\nendobj\n" if object[:indirect_reference_id]
out.join.force_encoding(Encoding::ASCII_8BIT)
out.join.dup.force_encoding(Encoding::ASCII_8BIT)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why dup the result?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I searched for .force_encoding and replaced it with .dup.force_encoding. Perhaps a too aggressive/naive approach.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps so... but perhaps not a bad idea.

I would consider automatically replacing all 'literal'.force_encoding(Encoding::ASCII_8BIT) with 'literal'.b ... an operation that could be automated by searching for '.force_encoding(Encoding::ASCII_8BIT) and replacing it with '.b

end
object.delete :Length
out << '>>'.force_encoding(Encoding::ASCII_8BIT)
out << "\nstream\n#{object[:raw_stream_content]}\nendstream".force_encoding(Encoding::ASCII_8BIT) if object[:raw_stream_content]
out << '>>'.dup.force_encoding(Encoding::ASCII_8BIT)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, what about a binary literal? SO: https://stackoverflow.com/a/15843685/4025095

@@ -123,15 +123,15 @@ def format_hash_to_pdf(object)
# if the object is not a simple object, it is a dictionary
# A dictionary shall be written as a sequence of key-value pairs enclosed in double angle brackets (<<...>>)
# (using LESS-THAN SIGNs (3Ch) and GREATER-THAN SIGNs (3Eh)).
out << "<<\n".force_encoding(Encoding::ASCII_8BIT)
out << "<<\n".dup.force_encoding(Encoding::ASCII_8BIT)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, what about a binary literal? SO: https://stackoverflow.com/a/15843685/4025095

@@ -89,7 +89,7 @@ def set_general_key(password = '')
# # if document metadata is not being encrypted, add 4 bytes with the value 0xFFFFFFFF.
if actual_object(@encryption_dictionary[:R]) >= 4
if actual_object(@encryption_dictionary)[:EncryptMetadata] == false
key << "\xFF\xFF\xFF\xFF".force_encoding(Encoding::ASCII_8BIT)
key << "\xFF\xFF\xFF\xFF".dup.force_encoding(Encoding::ASCII_8BIT)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the interpreter requires dup it's a little odd... Maybe there's a way to indicate that a String literal is ASCII without creating an additional object (see: https://stackoverflow.com/a/15843685/4025095 )?

After all, the object's lifetime os very short (only created to be inserted into the existing String).

@@ -137,7 +137,7 @@ def decrypt_AES(encrypted, encrypted_id, encrypted_generation, _encrypted_filter
object_key = @key.dup
object_key << [encrypted_id].pack('i')[0..2]
object_key << [encrypted_generation].pack('i')[0..1]
object_key << 'sAlT'.force_encoding(Encoding::ASCII_8BIT)
object_key << 'sAlT'.dup.force_encoding(Encoding::ASCII_8BIT)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the interpreter requires dup it's a little odd... Maybe there's a way to indicate that a String literal is ASCII without creating an additional object (see: https://stackoverflow.com/a/15843685/4025095 )?

After all, the object's lifetime os very short (only created to be inserted into the existing String).

@boazsegev
Copy link
Owner

@bforma ,

Could you state which Ruby version you're experiencing this issue with?

I don't get it on my machine.

Maybe you could author a quick Ruby example to reproduce the issue?

Thanks!
Bo.

@bforma
Copy link
Author

bforma commented Nov 28, 2019

We are running with Ruby 2.6.5.

All of our Ruby source files have the # frozen_string_literal: true magic comment.

Some of our tests failed due to a FrozenError: can't modify frozen String.

@bforma
Copy link
Author

bforma commented Nov 28, 2019

I just noticed that our failing tests define a String literal and (indirectly) pass that to CombinePDF.parse. Just doing a .dup on those Strings in the tests already fix the issue.

Perhaps this PR is not needed?

@boazsegev
Copy link
Owner

I just noticed that our failing tests define a String literal and (indirectly) pass that to CombinePDF.parse. Just doing a .dup on those Strings in the tests already fix the issue.

Perhaps this PR is not needed?

Perhaps it isn't needed in the short term, but it does shine a light on an issue with literals in the source code.

The source code could benefit from caching all the literals in binary form and then using them as needed. This could minimize object creation and improve performance for long running applications.

casperisfine pushed a commit to casperisfine/combine_pdf that referenced this pull request Mar 28, 2024
Fix: boazsegev#168
Ref: https://bugs.ruby-lang.org/issues/20205

Ruby 3.4 will emit deprecation warnings when mutating string
literals, and a future version is likely to make frozen string
literals the default.
casperisfine pushed a commit to casperisfine/combine_pdf that referenced this pull request Mar 28, 2024
Fix: boazsegev#168
Ref: https://bugs.ruby-lang.org/issues/20205

Ruby 3.4 will emit deprecation warnings when mutating string
literals, and a future version is likely to make frozen string
literals the default.
casperisfine pushed a commit to casperisfine/combine_pdf that referenced this pull request Mar 28, 2024
Fix: boazsegev#168
Ref: https://bugs.ruby-lang.org/issues/20205

Ruby 3.4 will emit deprecation warnings when mutating string
literals, and a future version is likely to make frozen string
literals the default.
casperisfine pushed a commit to casperisfine/combine_pdf that referenced this pull request Mar 28, 2024
Fix: boazsegev#168
Ref: https://bugs.ruby-lang.org/issues/20205

Ruby 3.4 will emit deprecation warnings when mutating string
literals, and a future version is likely to make frozen string
literals the default.
casperisfine pushed a commit to casperisfine/combine_pdf that referenced this pull request Mar 28, 2024
Fix: boazsegev#168
Ref: https://bugs.ruby-lang.org/issues/20205

Ruby 3.4 will emit deprecation warnings when mutating string
literals, and a future version is likely to make frozen string
literals the default.
casperisfine pushed a commit to casperisfine/combine_pdf that referenced this pull request Mar 28, 2024
Fix: boazsegev#168
Ref: https://bugs.ruby-lang.org/issues/20205

Ruby 3.4 will emit deprecation warnings when mutating string
literals, and a future version is likely to make frozen string
literals the default.
casperisfine pushed a commit to casperisfine/combine_pdf that referenced this pull request Mar 28, 2024
Fix: boazsegev#168
Ref: https://bugs.ruby-lang.org/issues/20205

Ruby 3.4 will emit deprecation warnings when mutating string
literals, and a future version is likely to make frozen string
literals the default.
casperisfine pushed a commit to casperisfine/combine_pdf that referenced this pull request Mar 28, 2024
Fix: boazsegev#168
Ref: https://bugs.ruby-lang.org/issues/20205

Ruby 3.4 will emit deprecation warnings when mutating string
literals, and a future version is likely to make frozen string
literals the default.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants