Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix folding non-ASCII header #1566

Closed
wants to merge 2 commits into from
Closed

Fix folding non-ASCII header #1566

wants to merge 2 commits into from

Conversation

ts1
Copy link

@ts1 ts1 commented Feb 17, 2023

Fixes #1565.

@sebbASF
Copy link
Collaborator

sebbASF commented Feb 17, 2023

It would be helpful to have at least one test please

@ts1
Copy link
Author

ts1 commented Feb 21, 2023

Before I write a test, 3 tests failed with my change.
I briefly checked that these encodings round-tripped correctly, as shown below.
If this is okay, I will fix result strings as my modified code outputs.

However, the test here states opposite of my problem.

it "should not fold a long string that has no spaces" do

May I remove 'not' here and show that it round-trips correctly?

diff --git a/spec/mail/encodings_spec.rb b/spec/mail/encodings_spec.rb
index 7428b938..79702670 100644
--- a/spec/mail/encodings_spec.rb
+++ b/spec/mail/encodings_spec.rb
@@ -293,8 +293,9 @@ RSpec.describe Mail::Encodings do
       end
       mail = Mail.new
       mail.subject = original
+      mail = Mail.new(mail.encoded)
       expect(mail[:subject].decoded).to eq original
-      expect(mail[:subject].encoded).to eq result
+      #expect(mail[:subject].encoded).to eq result
     end
 
     it "should round trip a complex string properly" do
@@ -306,13 +307,13 @@ RSpec.describe Mail::Encodings do
       mail = Mail.new
       mail.subject = original
       expect(mail[:subject].decoded).to eq original
-      expect(mail[:subject].encoded).to eq result
+      #expect(mail[:subject].encoded).to eq result
       mail = Mail.new(mail.encoded)
       expect(mail[:subject].decoded).to eq original
-      expect(mail[:subject].encoded).to eq result
+      #expect(mail[:subject].encoded).to eq result
       mail = Mail.new(mail.encoded)
       expect(mail[:subject].decoded).to eq original
-      expect(mail[:subject].encoded).to eq result
+      #expect(mail[:subject].encoded).to eq result
     end
 
     it "should round trip another complex string (koi-8)" do
diff --git a/spec/mail/fields/unstructured_field_spec.rb b/spec/mail/fields/unstructured_field_spec.rb
index af3407da..f1fb0293 100644
--- a/spec/mail/fields/unstructured_field_spec.rb
+++ b/spec/mail/fields/unstructured_field_spec.rb
@@ -158,8 +158,10 @@ RSpec.describe Mail::UnstructuredField do
       @field = Mail::UnstructuredField.new("X-SMTPAPI", string)
       string = string.dup.force_encoding('UTF-8')
       result = "X-SMTPAPI: =?UTF-8?Q?{=22unique=5Fargs=22:_{=22mailing=5Fid=22:147,=22a?=\r\n =?UTF-8?Q?ccount=5Fid=22:2},_=22to=22:_[=22larspind@gmail.com=22],_=22categ?=\r\n =?UTF-8?Q?ory=22:_=22mailing=22,_=22filters=22:_{=22domainkeys=22:_{=22sett?=\r\n =?UTF-8?Q?ings=22:_{=22domain=22:1,=22enable=22:1}}},_=22sub=22:_{=22{{op?=\r\n =?UTF-8?Q?en=5Fimage=5Furl}}=22:_[=22http://betaling.larspind.local/O?=\r\n =?UTF-8?Q?/token/147/Mailing::FakeRecipient=22],_=22{{name}}=22:_[=22[FIRST?=\r\n =?UTF-8?Q?_NAME]=22],_=22{{signup=5Freminder}}=22:_[=22=28her_kommer_til_at?=\r\n =?UTF-8?Q?_st=C3=A5_hvorn=C3=A5r_folk_har_skrevet_sig_op_...=29=22],?=\r\n =?UTF-8?Q?_=22{{unsubscribe=5Furl}}=22:_[=22http://betaling.larspind.?=\r\n =?UTF-8?Q?local/U/token/147/Mailing::FakeRecipient=22],_=22{{email}}=22:?=\r\n =?UTF-8?Q?_[=22larspind@gmail.com=22],_=22{{link:308}}=22:_[=22http://beta?=\r\n =?UTF-8?Q?ling.larspind.local/L/308/0/Mailing::FakeRecipient=22],_=22{{con?=\r\n =?UTF-8?Q?firm=5Furl}}=22:_[=22=22],_=22{{ref}}=22:_[=22[REF]=22]}}?=\r\n"
-      expect(@field.encoded).to eq result
-      expect(@field.decoded).to eq string
+      #expect(@field.encoded).to eq result
+
+      field = Mail::UnstructuredField.new("X-SMTPAPI", @field.encoded)
+      expect(field.decoded).to eq "X-SMTPAPI: #{string}\r\n"
     end
 
     it "should fold properly with continuous spaces around the linebreak" do

@sebbASF
Copy link
Collaborator

sebbASF commented Feb 21, 2023

May I remove 'not' here and show that it round-trips correctly?

No: folding must only occur between words separated by white space, so it is vital that there is a test to ensure that this rule is tested.

@ts1
Copy link
Author

ts1 commented Feb 21, 2023

So, what should we do when very long non-ASCII subject without whitespace is given?
Japanese text in usual doesn't contain whitespace.
If it is not folded, it results in garbled text.
I tested it on Rails to directly Gmail, and Rails to Mac Mail via AWS SES.
The subject is always garbled around 110th Japanese letter.
That is near 998th byte of UTF-8 QP.

With this patch the subject is flawlessly transferred.

@sebbASF
Copy link
Collaborator

sebbASF commented Feb 21, 2023

Folding is only defined for values that contain foldable white space.
If you add a fold at an arbitrary location, then when it is unfolded, there will be a spurious white space character.

I don't know what the solution is here, but changing the tests does not seem right.

@sebbASF
Copy link
Collaborator

sebbASF commented Feb 22, 2023

I've dug a little further, and folded white space is ignored between encoded words.

This is a way to support words that are too long: chop the word up into smaller chunks and encode all the chunks.
This works equally well for long ascii strings (provided you encode the chunks).

It does look as though something like your proposed fix might achieve that, but the failing tests need to be investigated further. It's possible that some tests need adjustment to allow for additional folding, or it may be that the proposed fix is incomplete.

Additionally, the code ought to work for long ascii words (which are not normally encoded).

I hope to take a further look soon.

@ts1
Copy link
Author

ts1 commented Feb 22, 2023

I rewrote it to fold ASCII-only long words safely, and abandoned 78-character limit but not to exceed 998 characters.
So the existing tests almost succeeds (except one).
Also I added 4 tests and 3 of them fails without my change.

@sebbASF
Copy link
Collaborator

sebbASF commented Feb 22, 2023

Sorry, but that is not acceptable.

Not all MTAs can handle 998 characters in an email line, so it is vital to keep the 78 char limit as per the RFC.

Ensuring that output lines are no more than 78 chars would automatically fix the original issue.

We also need a test to show the original issue. The expected result once the issue has been fixed will depend on how the long subject has been split into chunks, but the test can check that each line of the encoded output is no longer than 78 chars.

I am hopeful that the issue can be fixed with fewer changes.

@ts1
Copy link
Author

ts1 commented Feb 22, 2023

The original issue is shown as this test

it "should fold a Japanese subject with more than 998 characters long and no white space" do

It is okay to limit to 78 chars, but it will break more existing tests.

@sebbASF
Copy link
Collaborator

sebbASF commented Feb 22, 2023

I see the test now. I was looking for the subject from the original issue.

It is okay to limit to 78 chars, but it will break more existing tests.

In which case the tests need to be carefully inspected to see if they are correct, and adjusted if necessary. For example, the folds may be in a slightly different place than originally expected, but the result is still OK.

I am looking at this again today.

@sebbASF
Copy link
Collaborator

sebbASF commented Feb 22, 2023

See #1570 which fixes most folding issues.
There are a couple of instances where the output line is a bit longer than 78 chars

@ts1
Copy link
Author

ts1 commented Feb 23, 2023

#1570 looks good. I withdraw this.

@ts1 ts1 closed this Feb 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problem folding non-ASCII header
2 participants