Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support fetching packages from S3 buckets #215

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

lyoung-confluent
Copy link

@lyoung-confluent lyoung-confluent commented Mar 1, 2024

Adds experimental support for fetching packages directly from S3 bucket via a s3://<bucket>/<optional prefix>/ repository URL. It will load any credentials from the default AWS credentials chain. While this uses the AWS S3 SDK, it can be pointed at any S3 compatible API (ex: GCS, Cloudflare R2, etc) via the endpoint url configuration feature.

@tuananh
Copy link

tuananh commented Mar 19, 2024

This is cool :) can we get this reviewed ?

Copy link
Member

@imjasonh imjasonh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious: is there a specific use case you're interested in supporting with this, that this is blocking? This could be useful, but if we're not blocked on it I might like to consider it for a little longer.

(Sorry for not seeing this PR until now 😭 )

@@ -375,7 +375,7 @@ func (a *APK) InitKeyring(ctx context.Context, keyFiles, extraKeyFiles []string)

var asURL *url.URL
var err error
if strings.HasPrefix(element, "https://") {
if strings.HasPrefix(element, "https://") || strings.HasPrefix(element, "s3://") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about https://gocloud.dev/howto/blob/ instead here? This could give us a standard interface for non-S3 APIs. AFAIK GCS's S3 compatibility is "best effort" which means not great.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be fine to swap to if you'd prefer, though as I understand it pretty all blob storage implement the basic S3 APIs with full compatibility, it's as soon as you start using the more complex features like ACLs, CORS, etc that you start to run into issues. Given this is only ever calling GetObject I wouldn't expect to run into those issues.

@@ -393,6 +393,15 @@ func (a *APK) InitKeyring(ctx context.Context, keyFiles, extraKeyFiles []string)
if err != nil {
return fmt.Errorf("failed to read apk key: %w", err)
}
case "s3": //nolint:goconst
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems interesting, but I'm a bit worried about adding features that might move compatibility away from apk add and other non-go-apk tools.

@lyoung-confluent
Copy link
Author

lyoung-confluent commented Apr 16, 2024

Just curious: is there a specific use case you're interested in supporting with this, that this is blocking? This could be useful, but if we're not blocked on it I might like to consider it for a little longer.

It's not blocking, we're building/using a fork of apko with this dependency replaced via go.mod to get the functionality right now, it would just be convenient to have upstream as well. I understand the apprehension around diverging from apk-tools though.

The use-case is we build APK packages internally (using melange) and publish them to an S3 bucket and then assemble a variety of images using those packages from S3 via apko. These are internal/private source code or binaries that will only be baked into images, not installed on the fly at runtime. As such we don't expect/need to support fetching the packages via apk-tools or HTTPS (mostly because authentication in apk-tools is limited to just basic authentication).

We experimented with using mountpoint-s3 and mounting the bucket as a packages directory then using apko as-is but that didn't work as it needs to seek on the files. Using s3fs-fuse does seem to work but it's not very friendly/easy to use for our developers on macOS so we looked at adding the support natively which is also more performant.

@@ -988,6 +997,8 @@ func packageAsURI(pkg InstallablePackage) (uri.URI, error) {

if strings.HasPrefix(u, "https://") {
return uri.Parse(u)
} else if strings.HasPrefix(u, "s3://") {
return uri.URI(u), nil
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth calling out that this line has a bit of code-smell, uri.Parse will fail for the s3 schema: https://go.dev/play/p/PlTtp5oC7S3

This line currently just directly casts to the underlying uri.URI type to avoid this which isn't great

@lyoung-confluent
Copy link
Author

@imjasonh An alternative design that might be more palatable is to adopt the pattern used by Terraform via https://github.com/hashicorp/go-getter (or maybe even just swap to using this library) where a custom prefix of s3:: or gcs:: can be added before the https:// when providing a URL, ex:

s3::https://s3.amazonaws.com/bucket/foo
gcs::https://www.googleapis.com/storage/v1/bucket

That could allow us to still specify the raw https://... URL in the apk repositories file. It's more of a build-time hint to use authentication when accessing the URL still over HTTP

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants