Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use decodeUtf8' to check whether bytestrings are printable before logging #952

Merged

Conversation

korayal
Copy link
Contributor

@korayal korayal commented Jul 31, 2023

closes #947

on my local tests targeting localstack, below are the results when I hit the PutMetricData endpoint via a gzipped wrapper

[Client Request] {
  host      = localhost:4566
  secure    = False
  method    = POST
  target    = Nothing
  timeout   = ResponseTimeoutMicro 300000000
  redirects = 0
  path      = /
  query     =
  headers   = x-amz-content-sha256: e769f604d04a6c9f5f05218ea59e10ae8e728e64cff3dc4a2ca47b0b41c453e8; x-amz-date: 20230731T124639Z; host: localhost:4566; content-encoding: gzip; content-type: application/x-www-form-urlencoded; charset=utf-8; authorization: AWS4-HMAC-SHA256 Credential=localstack_key_id/20230731/us-east-1/monitoring/aws4_request, SignedHeaders=content-encoding;content-type;host;x-amz-content-sha256;x-amz-date, Signature=22efa655b91759f5deba281860a38ae9560065cfc2f16be198e51a4a95b93098
  body      = non-printable 365 strict bytes
}

or, when the data is printable (no gzipped payload):

[Client Request] {
  host      = localhost:4566
  secure    = False
  method    = POST
  target    = Nothing
  timeout   = ResponseTimeoutMicro 300000000
  redirects = 0
  path      = /
  query     =
  headers   = x-amz-content-sha256: efbf2bd7e9a1e0cdf2ee1b29225678f489d9d1d9baed20dde625d0faf56472fd; x-amz-date: 20230731T125748Z; host: localhost:4566; content-type: application/x-www-form-urlencoded; charset=utf-8; authorization: AWS4-HMAC-SHA256 Credential=localstack_key_id/20230731/us-east-1/monitoring/aws4_request, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=47dfedcd0d705427f27f78b8640026ee254a622bf6e973eb01d973af2b2caf81
  body      = Action=GetMetricStatistics&Dimensions.member.1.Name=test_id&Dimensions.member.1.Value=test10&EndTime=2023-07-31T12%3A57%3A48Z&MetricName=instrument-test&Namespace=test-namespace&Period=60&StartTime=2023-07-31T12%3A57%3A46Z&Statistics.member.1=SampleCount&Version=2010-08-01
}

@korayal korayal changed the title [#947] Use utfEncode' to log bytestrings [#947] Use utfEncode' to check whether bytestrings are printable before logging Jul 31, 2023
@endgame endgame changed the title [#947] Use utfEncode' to check whether bytestrings are printable before logging Use decodeUtf8' to check whether bytestrings are printable before logging Jul 31, 2023
Copy link
Collaborator

@endgame endgame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add an entry to lib/amazonka/CHANGELOG.md, which is the master changelog for the project?

Also, are we sure that this is the correct check to make? What if the user's terminal is set to some other encoding scheme? A TextEncoding (as returned by System.IO.hGetEncoding) isn't even in Eq, so I'm not sure what the best way is to fix this. I suspect what we have here might be a bit too naive; do you have any thoughts about other ways we could check this? It may also be the case that we successfully decode UTF-8 but it's got non-printable characters in it; do we want to do that check here too?

In the medium term, I probably want to extract logging from the main amazonka library so that people can use whatever logging framework suits them. Then handling non-printable stuff becomes the log library's concern.

lib/amazonka-core/src/Amazonka/Data/Log.hs Outdated Show resolved Hide resolved
@korayal
Copy link
Contributor Author

korayal commented Jul 31, 2023

Can you please add an entry to lib/amazonka/CHANGELOG.md, which is the master changelog for the project?

I've updated the changelog and added a section Unreleased at the top. would that work?

@korayal korayal force-pushed the do-not-log-nonprintable-bytestrings branch 5 times, most recently from 6354cd3 to 525f730 Compare July 31, 2023 15:54
@korayal korayal force-pushed the do-not-log-nonprintable-bytestrings branch from 525f730 to e6446f5 Compare July 31, 2023 15:56
@korayal
Copy link
Contributor Author

korayal commented Jul 31, 2023

Also, are we sure that this is the correct check to make? What if the user's terminal is set to some other encoding scheme? A TextEncoding (as returned by System.IO.hGetEncoding) isn't even in Eq, so I'm not sure what the best way is to fix this. I suspect what we have here might be a bit too naive; do you have any thoughts about other ways we could check this? It may also be the case that we successfully decode UTF-8 but it's got non-printable characters in it; do we want to do that check here too?

I've added another check with isPrint, where it should ensure that whatever we're trying to print contains printable characters. At this point I think assuming "the request body (if not compressed) is a valid utf-8 text" should be fine since it's what we're sending to AWS. The part that prints it out is still left to Build.lazyByteString / Build.byteString, so at worst case we're keeping the existing behavior.

In the medium term, I probably want to extract logging from the main amazonka library so that people can use whatever logging framework suits them. Then handling non-printable stuff becomes the log library's concern.

that'd be great.

lib/amazonka-core/src/Amazonka/Data/Log.hs Outdated Show resolved Hide resolved
lib/amazonka/CHANGELOG.md Outdated Show resolved Hide resolved
@korayal
Copy link
Contributor Author

korayal commented Aug 2, 2023

@endgame are we OK with the latest state of this PR?

@endgame endgame added this to the Amazonka 2.1 milestone Aug 7, 2023
Copy link
Collaborator

@endgame endgame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this, and sorry for the delay.

lib/amazonka-core/src/Amazonka/Data/Log.hs Outdated Show resolved Hide resolved
lib/amazonka-core/src/Amazonka/Data/Log.hs Outdated Show resolved Hide resolved
@endgame endgame merged commit 7353924 into brendanhay:main Aug 7, 2023
5 checks passed
@korayal korayal deleted the do-not-log-nonprintable-bytestrings branch August 8, 2023 09:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When the payload is a non-printable bytestring, printing the request terminates the app
2 participants