Skip to content

[PRO] Using S3 for storing and replaying traffic

Leonid Bugaev edited this page Jan 24, 2019 · 11 revisions

By default GoReplay allows you to store recorded requests in files.

On big traffic amounts, file storage becomes a bottleneck, and it make sense to upload this recording to the cloud storage.

While it is quite "easy" to upload data to the cloud, it becomes non-trivial if you want to replay this data using GoReplay.

GoReplay PRO add support for replayed data directly from Amazon S3 storage and uploading recorded data to S3 as well.

For reading from S3 you should use --input-file s3://<bucket>/<path>, example: gor --input-file s3://logs/2016-05 --output-http http://example.com.

For writing to S3 you should use --output-file s3://<bucket>/<path>, example: gor --input-raw :80 --output-file s3://logs/%Y-%m-%d.gz.

Both input and output will behave exactly the same, as it works with ordinary files, like file patterns (with some differences, see below) and automatically creating chunks.

Note: S3 file system by design does not support full file patterns, like supporting asterisk *, instead, it supports prefixes. So if you need to match following file pattern logs/2016-05-*, in S3 format you specify just prefix: s3://logs/2016-05-.

GoReplay takes AWS credentials from standard environment variables:

  • AWS_ACCESS_KEY_ID or AWS_ACCESS_KEY – AWS access key.
  • AWS_SECRET_ACCESS_KEY or AWS_SECRET_KEY – AWS secret key. Access and secret key variables override credentials stored in credential and config files.
  • AWS_REGION or AWS_DEFAULT_REGION – AWS region. This variable overrides the default region of the in-use profile if set.
  • AWS_ENDPOINT_URL- custom AWS S3 endpoint, if you use S3 over the proxy, AWS GovCloud, or for example S3 compatible storages like Minio
  • AWS_SESSION_TOKEN - If you have a temporary session token

As alternative GoReplay can read config from ~/.aws/configuration if you pass AWS_SDK_LOAD_CONFIG env variable.