Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: The HTTP Header of multipart/form-data no longer includes charset. #6251

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dongfangtianyu
Copy link

@dongfangtianyu dongfangtianyu commented Mar 17, 2024

Description

close #6250 .

Motivation and Context

On some web server implementations, including charset in the request header Content-Type of multipart/form-data can result in parsing errors of the boundary, leading to a failure in sending form content.

This is inconsistent with the behavior in JMeter 5.6.2 and also does not comply with RFC.

Perhaps fixing this in the httpclient would be a more elegant choice, but:

How Has This Been Tested?

  1. A unittest case
  2. runGUI then e2e test

Screenshots (if appropriate):

After : remove charset
image

Types of changes

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the code style of this project.
  • I have updated the documentation accordingly.

@dongfangtianyu
Copy link
Author

dongfangtianyu commented Mar 18, 2024

It seems that the fachbook homepage doesn't return 200, but redirects to the login page.
I'm not sure if we need to modify the test case ResponseDecompression.jmx.

Debug info:

curl -v https://www.facebook.com
*   Trying 31.13.70.36:443...
* Connected to www.facebook.com (31.13.70.36) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_CHACHA20_POLY1305_SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=Menlo Park; O=Meta Platforms, Inc.; CN=*.facebook.com
*  start date: Dec 26 00:00:00 2023 GMT
*  expire date: Mar 25 23:59:59 2024 GMT
*  subjectAltName: host "www.facebook.com" matched cert's "*.facebook.com"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
*  SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* Using Stream ID: 1 (easy handle 0x563b8d507b60)
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
> GET / HTTP/2
> Host: www.facebook.com
> user-agent: curl/7.81.0
> accept: */*
>
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
< HTTP/2 302
< set-cookie: fr=0m99cpMhrD8XneYOy..Bl99ys..AAA.0.0.Bl99ys.AWVSv6D5HHg; expires=Sun, 16-Jun-2024 06:18:20 GMT; Max-Age=7776000; path=/; domain=.facebook.com; secure; httponly
< location: https://www.facebook.com/login/?next=https%3A%2F%2Fwww.facebook.com%2F

@vlsi
Copy link
Collaborator

vlsi commented Mar 18, 2024

I believe the core of the issue is that Apache Http Client adds the excessive encoding header: https://github.com/apache/httpcomponents-client/blob/54900db4653d7f207477e6ee40135b88e9bcf832/httpmime/src/main/java/org/apache/http/entity/mime/MultipartEntityBuilder.java#L215-L217

However, it does use the provided in multipartEntityBuilder.setCharset(charset); charset in https://github.com/apache/httpcomponents-client/blob/54900db4653d7f207477e6ee40135b88e9bcf832/httpmime/src/main/java/org/apache/http/entity/mime/MultipartEntityBuilder.java#L228 => https://github.com/apache/httpcomponents-client/blob/54900db4653d7f207477e6ee40135b88e9bcf832/httpmime/src/main/java/org/apache/http/entity/mime/HttpBrowserCompatibleMultipart.java#L69

How about fixing the issue in HTTP client instead?

As a temporary workaround, we could clone MultipartEntityBuilder into JMeter's codebase and remove the offending paramsList.add(new BasicNameValuePair("charset", charsetCopy.name()));, however, the removal of setCharset does not sound right to me as it would cause using ASCII via MIME.DEFAULT_CHARSET which is just wrong.

@dongfangtianyu
Copy link
Author

dongfangtianyu commented Mar 18, 2024

the core of the issue is that Apache Http Client

I completely agree;

How about fixing the issue in HTTP client instead?

Perhaps the issues in the Http Client should be fixed within the Http Client itself, and the changes are substantial.

he removal of setCharset does not sound right to me as it would cause using ASCII via MIME.DEFAULT_CHARSET which is just wrong.

I'm not sure where the mistake lies in doing so, could you provide an example? Thank you.


If I understand correctly, the actual effect of MultipartEntityBuilder.setCharset is only on the HTTP header and does not affect the processing of the HTTP body.

For multipart/form-data, its request content encoding is unrelated to the request headers and instead resides within the various parts of the HTTP Body.

image

These UTF-8 texts work fine.
image

So, according to RFC 7578, perhaps HttpClient shouldn't provide the setCharset method? Or perhaps JMeter shouldn't invoke setCharset?

It turns out, before HttpClient handles the Charset issue properly, JMeter just needs to refrain from calling the setCharset method to avoid adding charset=utf-8 to the request header (although HttpClient remains at the core of the issue). This doesn't affect the correctness of the HTTP body content (there might be other impacts not mentioned, feel free to point them out).

@vlsi
Copy link
Collaborator

vlsi commented Mar 18, 2024

Or perhaps JMeter shouldn't invoke setCharset?

As I said earlier, MultipartEntityBuilder.setCharset was needed there for HttpBrowserCompatibleMultipart.java to work, as there's no other option to configure charset for HttpBrowserCompatibleMultipart mode 🤷

@dongfangtianyu
Copy link
Author

dongfangtianyu commented Mar 18, 2024

Thank you, I understand your concern now.

If the setCharset method is not called, HttpBrowserCompatibleMultipart receives charsetCopy as null.

Furthermore, HttpBrowserCompatibleMultipart.charset == 'US-ASCII'.

Is this ?
image

However, from the results, it still works fine:

There is no charset='US-ASCII' in the HTTP header.
image

The content in the HTTP body is also UTF-8.
image

I don't know the reasons behind it, nor do I know if it's stable.

Thank you again for everything you have done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

According to RFC 2046 and RFC 7578, the request header of multipart/form-data should not include the charset.
2 participants