Problem:
Following the examples in the docs for submitting form data using the multipart method results in encoding that is not compliant with RFC 3986, specifically in how the tilde (~) character is handled. multipart("form", Map("some_key" -> "~")) generates a StringBody(some_key=%7E,utf-8,text/plain) form component in which the tilde has been encoded as %7E. Per RFC 3986, ~ is an unreserved character and should not be encoded.
It looks like this problem was already partially tackled within STTP: the internal UriCompatibility object has an additional encodeDNSHost method, which uses the spec-compliant Rfc3986.encode method. Unfortunately, the multipart method uses the non-compliant encodeQuery method on UriCompatibility.
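For reference, the %7E output can be reproduced without sttp by calling java.net.URLEncoder directly, which is the encoder mentioned below. This is only a minimal sketch: the object and method names (UrlEncoderTildeDemo, formEncode) are made up for illustration and are not part of sttp.

```scala
import java.net.URLEncoder

// Hypothetical demo object, not part of sttp.
object UrlEncoderTildeDemo {
  // java.net.URLEncoder follows application/x-www-form-urlencoded rules,
  // not the RFC 3986 unreserved set.
  def formEncode(s: String): String = URLEncoder.encode(s, "UTF-8")

  def main(args: Array[String]): Unit = {
    println(formEncode("~"))   // %7E (tilde is escaped)
    println(formEncode("*"))   // *   (asterisk is left as-is)
    println(formEncode("a b")) // a+b (space becomes a plus sign)
  }
}
```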
Is it possible to switch to using the Rfc3986 encoder instead of the URLEncoder as part of the multipart method's form handling? (edit: originally mentioned incorrect allowedCharacters set that would work here) Looks like a new character set would need to be defined within Rfc3986 object to get a fully correct allowedCharacters set. Want a PR for this?
Disregard that edit note; the Rfc3986.Unreserved char set would be correct. I got confused because URLEncoder also handles the asterisk (*) incorrectly: it leaves it unencoded even though it is a reserved sub-delimiter character in RFC 3986.
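To make the difference concrete, here is a rough sketch of a percent-encoder that leaves only the RFC 3986 unreserved characters untouched. It is a standalone illustration under my own assumptions, not sttp's actual Rfc3986 implementation; the UnreservedEncoder object and its encode method are hypothetical names.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical standalone encoder, not sttp's Rfc3986 object.
object UnreservedEncoder {
  // RFC 3986, section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
  private val unreserved: Set[Char] =
    (('a' to 'z') ++ ('A' to 'Z') ++ ('0' to '9')).toSet ++ Set('-', '.', '_', '~')

  // Percent-encode every UTF-8 byte that is not an unreserved ASCII character.
  def encode(s: String): String = {
    val sb = new StringBuilder
    s.getBytes(StandardCharsets.UTF_8).foreach { b =>
      val c = (b & 0xff).toChar
      if (unreserved.contains(c)) sb.append(c)
      else sb.append('%').append(f"${b & 0xff}%02X")
    }
    sb.toString
  }
}
```

With this sketch, encode("~") returns "~" and encode("*") returns "%2A", which is the reverse of what URLEncoder does for those two characters.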
Yes, a PR would be great - though I'd like to include this change in sttp4 (so the PR would be to the master branch). If you could also add a test, that would be even better :)
Java's URLEncoder has the following comment regarding encoding ~:
* mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
*
* It appears that both Netscape and Internet Explorer escape
* all special characters from this list with the exception
* of "-", "_", ".", "*". While it is not clear why they are
* escaping the other characters, perhaps it is safest to
* assume that there might be contexts in which the others
* are unsafe if not escaped. Therefore, we will use the same
* list. It is also noteworthy that this is consistent with
* O'Reilly's "HTML: The Definitive Guide" (page 164).
It seems that there may be some corner cases around parsing ~.
They probably do not apply to multipart requests, but I wanted to share this to add some more context.
I think we can safely use Rfc3986.encode for parts.