Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/thumbor bucket in url #521

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

bennet-esyoil
Copy link

@bennet-esyoil bennet-esyoil commented Nov 30, 2023

Hello,

currently when using Thumbor-Style URLs (which is the only way to get readable URLs for SEO) you cannot specify the bucket which will get used. Currently always the first one provided via Env-Settings will be used.

My suggestion is to check if the URL contains any allowed bucket, and if it does, use that one. This would mean one can - but doesn't need to - add the bucket to the URL. If left empty it would not break existing setups.

e.g:
SOURCE_BUCKETS: "test1, test2"
Request: "/fit-in/my-image.jpg"
This would currently load "my-image.jpg" from the bucket "test1".

With my PR you can do "/fit-in/test2/my-image.jpg" which would load from test2.

Checklist

  • 👋 I have added unit tests for all code changes.
  • 👋 I have run the unit tests, and all unit tests have passed.
  • ⚠️ This pull request might incur a breaking change.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dougtoppin
Copy link
Member

@bennet-esyoil thanks for your idea and submission, we will review it and let you know

@bennet-esyoil
Copy link
Author

@dougtoppin Thanks for your comment. Is there any chance to get any ETA since this is quite important for our current goal?

@dougtoppin
Copy link
Member

@bennet-esyoil We discussed your PR and it looks like an excellent addition. We will be it including in an upcoming release.
Thank you for your submission.

@@ -126,4 +126,30 @@ describe("parseImageBucket", () => {
});
}
});

it("should parse bucket-name from first part in thubor request but fail since it's not allowed", () => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thumbor is spelled incorrectly in the above string and in the next test.

Does this test cover when no bucket is provided? For instance, in my use case I do not want to specify a bucket name in the URL.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found some unrelated cases where the tests did not cover, thus my "akward" commits after creating the PR.

the only problem I currently see with my approach is that if you have a path that contains the bucket-name which you do not want to use, this would fail.

e.g. you have a bucket bucket-1 and want to request the file /bucket-1/test.jpg, from that bucket, the logic would remove the bucket-1 since it's a name. I was thinking about prefixing the parameter with something like bucket:bucket-1/bucket-1/test.jpg which would resolve that issue.

Does this test cover when no bucket is provided? For instance, in my use case I do not want to specify a bucket name in the URL.

If you dont, behaviour is the same as before (takes the first bucket provided in the env-variables)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks for the reply.

Yes, I agree with using a keyword like "bucket:"/ "s3:" similar to the "filters:" in the URL. That would work around the issue and hopefully not cause an issue with the path/bucket name

Example

    const event = { path: "/filters:grayscale()/bucket:test-bucket/test-image-001.jpg" };
    const event = { path: "/filters:grayscale()/s3:test-bucket/test-image-001.jpg" };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the s3:name looks pretty sleek, kinda ressembles the normal s3://bucket/object syntax. I'll get this changed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda don't like how it's not part of the ThumborMapper class, but it's kinda not part of thumbor itself and also needs to happen at an earlier stage, so it makes sense to have it where it is now..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @t4colingough any update?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello again @t4colingough & @dougtoppin ,

any chance to get this released any time soon?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bennet-esyoil,

This change has been made internally and is currently set to be included in the next minor/major release. While our internal processes had us merge the code manually, once the release with this change is out, you'll see yourself included in the external contributors section towards the bottom of the readme and this PR will be closed.

Thanks for your contributions,
Simon

expect(bucket).toEqual("allowedBucket001")
})

it("should parse bucket-name from first part in thumbor request and return it", () => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worthwhile adding a test to cover the newly fixed thumbor chaining issue: #343

It should have both the legacy and new chaining method

it("should parse bucket-name from first part in thumbor request and return it when using legacy multiple filters", () => {
    // Arrange
    const event = { path: "/filters:grayscale()/filters:rotate(180)/s3:test-bucket/test-image-001.jpg" };
    process.env.SOURCE_BUCKETS = "allowedBucket001, test-bucket";

    // Act
    const imageRequest = new ImageRequest(s3Client, secretProvider);

    const bucket = imageRequest.parseImageBucket(event, RequestTypes.THUMBOR);
    // Assert
    expect(bucket).toEqual("test-bucket")
  })
it("should parse bucket-name from first part in thumbor request and return it when chaining multiple filters", () => {
    // Arrange
    const event = { path: "/filters:grayscale():rotate(180)/s3:test-bucket/test-image-001.jpg" };
    process.env.SOURCE_BUCKETS = "allowedBucket001, test-bucket";

    // Act
    const imageRequest = new ImageRequest(s3Client, secretProvider);

    const bucket = imageRequest.parseImageBucket(event, RequestTypes.THUMBOR);
    // Assert
    expect(bucket).toEqual("test-bucket")
  })

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this should be covered by a totally different test-suite since it's not really part of that function

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thats fair enough

@t4colingough
Copy link

I'd like to point to this article and the security implications of having the S3 bucket name in the URL for everyone to see: https://medium.com/@maciej.pocwierz/how-an-empty-s3-bucket-can-make-your-aws-bill-explode-934a383cb8b1

@bennetgallein
Copy link

I'd like to point to this article and the security implications of having the S3 bucket name in the URL for everyone to see: https://medium.com/@maciej.pocwierz/how-an-empty-s3-bucket-can-make-your-aws-bill-explode-934a383cb8b1

Yes, this would apply for the base64 encoded version as well. Security through obfuscation 🙃 . It should be best practice to deploy an instance of the serverless-image-handler for each bucket without exception anyways. Having the bucket specified in the URL or in the B64 request data is just a quick way for us to switch between different buckets depending on some configuration.

@bennet-esyoil
Copy link
Author

Update:
https://aws.amazon.com/de/about-aws/whats-new/2024/05/amazon-s3-no-charge-http-error-codes/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants