-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix FrozenError
s
#168
base: master
Are you sure you want to change the base?
Fix FrozenError
s
#168
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -76,7 +76,7 @@ def decrypt | |
|
||
def set_general_key(password = '') | ||
# 1) make sure the initial key is 32 byte long (if no password, uses padding). | ||
key = (password.bytes[0..32].to_a + @padding_key)[0..31].to_a.pack('C*').force_encoding(Encoding::ASCII_8BIT) | ||
key = (password.bytes[0..32].to_a + @padding_key)[0..31].to_a.pack('C*').dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This dup is redundant and will use up a bunch of memory. The dynamic String returned from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed. |
||
# 2) add the value of the encryption dictionary’s O entry | ||
key << actual_object(@encryption_dictionary[:O]).to_s | ||
# 3) Convert the integer value of the P entry to a 32-bit unsigned binary number | ||
|
@@ -89,7 +89,7 @@ def set_general_key(password = '') | |
# # if document metadata is not being encrypted, add 4 bytes with the value 0xFFFFFFFF. | ||
if actual_object(@encryption_dictionary[:R]) >= 4 | ||
if actual_object(@encryption_dictionary)[:EncryptMetadata] == false | ||
key << "\xFF\xFF\xFF\xFF".force_encoding(Encoding::ASCII_8BIT) | ||
key << "\xFF\xFF\xFF\xFF".dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the interpreter requires After all, the object's lifetime os very short (only created to be inserted into the existing String). |
||
end | ||
end | ||
# 5) pass everything as a MD5 hash | ||
|
@@ -137,7 +137,7 @@ def decrypt_AES(encrypted, encrypted_id, encrypted_generation, _encrypted_filter | |
object_key = @key.dup | ||
object_key << [encrypted_id].pack('i')[0..2] | ||
object_key << [encrypted_generation].pack('i')[0..1] | ||
object_key << 'sAlT'.force_encoding(Encoding::ASCII_8BIT) | ||
object_key << 'sAlT'.dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the interpreter requires After all, the object's lifetime os very short (only created to be inserted into the existing String). |
||
key_length = object_key.length < 16 ? object_key.length : 16 | ||
|
||
begin | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -41,7 +41,7 @@ class PDFParser | |
# string:: the data to be parsed, as a String object. | ||
def initialize(string, options = {}) | ||
raise TypeError, "couldn't parse data, expecting type String" unless string.is_a? String | ||
@string_to_parse = string.force_encoding(Encoding::ASCII_8BIT) | ||
@string_to_parse = string.dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm... why? Are we expecting frozen Strings? In general I understand that this might be a correct approach that will minimize unexpected side-effects. However, on the other hand, this will use a lot of memory when reading larger PDF files. |
||
@literal_strings = [].dup | ||
@hex_strings = [].dup | ||
@streams = [].dup | ||
|
@@ -243,22 +243,22 @@ def _parse_ | |
########################################## | ||
elsif str = @scanner.scan(/\<[0-9a-fA-F]*\>/) | ||
# warn "Found a hex string" | ||
str = str.slice(1..-2).force_encoding(Encoding::ASCII_8BIT) | ||
str = str.slice(1..-2).dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This dup is redundant and will use up a bunch of memory. The dynamic String returned from |
||
# str = "0#{str}" if str.length.odd? | ||
out << unify_string([str].pack('H*').force_encoding(Encoding::ASCII_8BIT)) | ||
out << unify_string([str].pack('H*').dup.force_encoding(Encoding::ASCII_8BIT)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (again) This dup is redundant and will use up a bunch of memory. The dynamic String returned from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed. |
||
########################################## | ||
## parse a space delimited Hex String | ||
########################################## | ||
elsif str = @scanner.scan(/\<[0-9a-fA-F\s]*\>/) | ||
# warn "Found a space seperated hex string" | ||
str = str.force_encoding(Encoding::ASCII_8BIT).split(/\s/).map! {|b| b.length.odd? ? "0#{b}" : b} | ||
out << unify_string(str.pack('H*' * str.length).force_encoding(Encoding::ASCII_8BIT)) | ||
str = str.dup.force_encoding(Encoding::ASCII_8BIT).split(/\s/).map! {|b| b.length.odd? ? "0#{b}" : b} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why |
||
out << unify_string(str.pack('H*' * str.length).dup.force_encoding(Encoding::ASCII_8BIT)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This dup is redundant and will use up a bunch of memory. The dynamic String returned from |
||
########################################## | ||
## parse a Literal String | ||
########################################## | ||
elsif @scanner.scan(/\(/) | ||
# warn "Found a literal string" | ||
str = ''.force_encoding(Encoding::ASCII_8BIT) | ||
str = ''.dup.force_encoding(Encoding::ASCII_8BIT) | ||
count = 1 | ||
while count > 0 && @scanner.rest? | ||
scn = @scanner.scan_until(/[\(\)]/) | ||
|
@@ -285,7 +285,7 @@ def _parse_ | |
end | ||
# The PDF formatted string is: str[0..-2] | ||
# now starting to convert to regular string | ||
str_bytes = str.force_encoding(Encoding::ASCII_8BIT)[0..-2].bytes.to_a | ||
str_bytes = str.dup.force_encoding(Encoding::ASCII_8BIT)[0..-2].bytes.to_a | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why |
||
str = [] | ||
until str_bytes.empty? | ||
case str_bytes[0] | ||
|
@@ -335,7 +335,7 @@ def _parse_ | |
str << str_bytes.shift | ||
end | ||
end | ||
out << unify_string(str.pack('C*').force_encoding(Encoding::ASCII_8BIT)) | ||
out << unify_string(str.pack('C*').dup.force_encoding(Encoding::ASCII_8BIT)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (again) This dup is redundant and will use up a bunch of memory. The dynamic String returned from |
||
########################################## | ||
## parse a Dictionary | ||
########################################## | ||
|
@@ -367,10 +367,10 @@ def _parse_ | |
# need to remove end of stream | ||
if out.last.is_a? Hash | ||
# out.last[:raw_stream_content] = str[0...-10] #cuts only one EON char (\n or \r) | ||
out.last[:raw_stream_content] = unify_string str.sub(/(\r\n|\n|\r)?endstream\z/, '').force_encoding(Encoding::ASCII_8BIT) | ||
out.last[:raw_stream_content] = unify_string str.sub(/(\r\n|\n|\r)?endstream\z/, '').dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (again) This dup is redundant and will use up a bunch of memory. The dynamic String used here should be guaranteed to be mutable. |
||
else | ||
warn 'Stream not attached to dictionary!' | ||
out << str.sub(/(\r\n|\n|\r)?endstream\z/, '').force_encoding(Encoding::ASCII_8BIT) | ||
out << str.sub(/(\r\n|\n|\r)?endstream\z/, '').dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (again) This dup is redundant and will use up a bunch of memory. The dynamic String used here should be guaranteed to be mutable. |
||
end | ||
########################################## | ||
## parse an Object after finished | ||
|
@@ -686,7 +686,7 @@ def serialize_objects_and_references | |
|
||
# All Strings are one String | ||
def unify_string(str) | ||
str.force_encoding(Encoding::ASCII_8BIT) | ||
str.dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why |
||
@strings_dictionary[str] ||= str | ||
end | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -187,7 +187,7 @@ def to_pdf(options = {}) | |
xref = [] | ||
indirect_object_count = 1 # the first object is the null object | ||
# write head (version and binanry-code) | ||
out << "%PDF-#{@version}\n%\xFF\xFF\xFF\xFF\xFF\x00\x00\x00\x00".force_encoding(Encoding::ASCII_8BIT) | ||
out << "%PDF-#{@version}\n%\xFF\xFF\xFF\xFF\xFF\x00\x00\x00\x00".dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the interpreter requires After all, the object's lifetime os very short (only created to be inserted into the existing String). |
||
|
||
# collect objects and set xref table locations | ||
loc = 0 | ||
|
@@ -211,7 +211,7 @@ def to_pdf(options = {}) | |
# when finished, remove the numbering system and keep only pointers | ||
remove_old_ids | ||
# output the pdf stream | ||
out.join("\n".force_encoding(Encoding::ASCII_8BIT)).force_encoding(Encoding::ASCII_8BIT) | ||
out.join("\n".dup.force_encoding(Encoding::ASCII_8BIT)).dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again... maybe see his SO answer: |
||
end | ||
|
||
# this method returns all the pages cataloged in the catalog. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -47,14 +47,14 @@ def object_to_pdf(object) | |
|
||
def format_string_to_pdf(object) | ||
obj_bytes = object.bytes.to_a | ||
# object.force_encoding(Encoding::ASCII_8BIT) | ||
# object.dup.force_encoding(Encoding::ASCII_8BIT) | ||
if object.length == 0 || obj_bytes.min <= 31 || obj_bytes.max >= 127 # || (obj_bytes[0] != 68 object.match(/[^D\:\d\+\-Z\']/)) | ||
# A hexadecimal string shall be written as a sequence of hexadecimal digits (0-9 and either A-F or a-f) | ||
# encoded as ASCII characters and enclosed within angle brackets (using LESS-THAN SIGN (3Ch) and GREATER- THAN SIGN (3Eh)). | ||
"<#{object.unpack('H*')[0]}>".force_encoding(Encoding::ASCII_8BIT) | ||
"<#{object.unpack('H*')[0]}>".dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we need this |
||
else | ||
# a good fit for a Literal String or the string is a date (MUST be literal) | ||
('(' + ([].tap { |out| obj_bytes.each { |byte| out.concat(STRING_REPLACEMENT_ARRAY[byte]) } } ).pack('C*') + ')').force_encoding(Encoding::ASCII_8BIT) | ||
('(' + ([].tap { |out| obj_bytes.each { |byte| out.concat(STRING_REPLACEMENT_ARRAY[byte]) } } ).pack('C*') + ')').dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (again) This dup is redundant and will use up a bunch of memory. The dynamic String used here should be guaranteed to be mutable. |
||
end | ||
end | ||
|
||
|
@@ -84,7 +84,7 @@ def format_name_to_pdf(object) | |
def format_array_to_pdf(object) | ||
# An array shall be written as a sequence of objects enclosed in SQUARE BRACKETS (using LEFT SQUARE BRACKET (5Bh) and RIGHT SQUARE BRACKET (5Dh)). | ||
# EXAMPLE [549 3.14 false (Ralph) /SomeName] | ||
('[' + (object.collect { |item| object_to_pdf(item) }).join(' ') + ']').force_encoding(Encoding::ASCII_8BIT) | ||
('[' + (object.collect { |item| object_to_pdf(item) }).join(' ') + ']').dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (again) This dup is redundant and will use up a bunch of memory. The dynamic String used here should be guaranteed to be mutable. |
||
end | ||
|
||
EMPTY_PAGE_CONTENT_STREAM = {is_reference_only: true, referenced_object: { indirect_reference_id: 0, raw_stream_content: '' }} | ||
|
@@ -100,19 +100,19 @@ def format_hash_to_pdf(object) | |
end | ||
object[:indirect_reference_id] ||= 0 | ||
object[:indirect_generation_number] ||= 0 | ||
return "#{object[:indirect_reference_id]} #{object[:indirect_generation_number]} R".force_encoding(Encoding::ASCII_8BIT) | ||
return "#{object[:indirect_reference_id]} #{object[:indirect_generation_number]} R".dup.force_encoding(Encoding::ASCII_8BIT) | ||
end | ||
|
||
# if the object is indirect... | ||
out = [] | ||
if object[:indirect_reference_id] | ||
object[:indirect_reference_id] ||= 0 | ||
object[:indirect_generation_number] ||= 0 | ||
out << "#{object[:indirect_reference_id]} #{object[:indirect_generation_number]} obj\n".force_encoding(Encoding::ASCII_8BIT) | ||
out << "#{object[:indirect_reference_id]} #{object[:indirect_generation_number]} obj\n".dup.force_encoding(Encoding::ASCII_8BIT) | ||
if object[:indirect_without_dictionary] | ||
out << object_to_pdf(object[:indirect_without_dictionary]) | ||
out << "\nendobj\n" | ||
return out.join.force_encoding(Encoding::ASCII_8BIT) | ||
return out.join.dup.force_encoding(Encoding::ASCII_8BIT) | ||
end | ||
end | ||
# remove extra page references. | ||
|
@@ -123,15 +123,15 @@ def format_hash_to_pdf(object) | |
# if the object is not a simple object, it is a dictionary | ||
# A dictionary shall be written as a sequence of key-value pairs enclosed in double angle brackets (<<...>>) | ||
# (using LESS-THAN SIGNs (3Ch) and GREATER-THAN SIGNs (3Eh)). | ||
out << "<<\n".force_encoding(Encoding::ASCII_8BIT) | ||
out << "<<\n".dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, what about a binary literal? SO: https://stackoverflow.com/a/15843685/4025095 |
||
object.each do |key, value| | ||
out << "#{object_to_pdf key} #{object_to_pdf value}\n".force_encoding(Encoding::ASCII_8BIT) unless PDF::PRIVATE_HASH_KEYS.include? key | ||
out << "#{object_to_pdf key} #{object_to_pdf value}\n".dup.force_encoding(Encoding::ASCII_8BIT) unless PDF::PRIVATE_HASH_KEYS.include? key | ||
end | ||
object.delete :Length | ||
out << '>>'.force_encoding(Encoding::ASCII_8BIT) | ||
out << "\nstream\n#{object[:raw_stream_content]}\nendstream".force_encoding(Encoding::ASCII_8BIT) if object[:raw_stream_content] | ||
out << '>>'.dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, what about a binary literal? SO: https://stackoverflow.com/a/15843685/4025095 |
||
out << "\nstream\n#{object[:raw_stream_content]}\nendstream".dup.force_encoding(Encoding::ASCII_8BIT) if object[:raw_stream_content] | ||
out << "\nendobj\n" if object[:indirect_reference_id] | ||
out.join.force_encoding(Encoding::ASCII_8BIT) | ||
out.join.dup.force_encoding(Encoding::ASCII_8BIT) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I searched for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps so... but perhaps not a bad idea. I would consider automatically replacing all |
||
end | ||
|
||
def actual_object(obj) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This
dup
is redundant and will use up a bunch of memory. The dynamic String returned fromIO.read
(should probably beIO.binread
) is mutable.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.