allow custom configs for s3 #2849

Merged
activesoull merged 4 commits into main from s3_custom_config on May 9, 2024

Conversation

activesoull
Contributor

🚀 🚀 Pull Request

Impact

  • Bug fix (non-breaking change which fixes expected existing functionality)
  • Enhancement/New feature (adds functionality without impacting existing logic)
  • Breaking change (fix or feature that would cause existing functionality to change)

For cases where the user's storage enforces a V4 signing requirement, the user needs to provide a custom config when connecting to a dataset or creating one:

import deeplake
from botocore.config import Config

creds = dict(.....)
cfg = Config(
    signature_version = 's3v4'
)
creds["config"] = cfg
deeplake.empty(path, creds=creds)

Contributor

@hoshimura hoshimura left a comment

Could you live with these changes in the S3Provider class in s3.py?

    def _get_bytes(
        self, path, start_byte: Optional[int] = None, end_byte: Optional[int] = None
    ):
        range_kwarg = {}
        if start_byte is not None and end_byte is not None:
            if start_byte == end_byte:
                return b""
            range_kwarg["Range"] = f"bytes={start_byte}-{end_byte - 1}"
        elif start_byte is not None:
            range_kwarg["Range"] = f"bytes={start_byte}-"
        elif end_byte is not None:
            range_kwarg["Range"] = f"bytes=0-{end_byte - 1}"
        resp = self.client.get_object(Bucket=self.bucket, Key=path, **range_kwarg)
        return resp["Body"].read()

instead of the old

    def _get_bytes(
        self, path, start_byte: Optional[int] = None, end_byte: Optional[int] = None
    ):
        if start_byte is not None and end_byte is not None:
            if start_byte == end_byte:
                return b""
            range = f"bytes={start_byte}-{end_byte - 1}"
        elif start_byte is not None:
            range = f"bytes={start_byte}-"
        elif end_byte is not None:
            range = f"bytes=0-{end_byte - 1}"
        else:
            range = ""
        resp = self.client.get_object(Bucket=self.bucket, Key=path, Range=range)
        return resp["Body"].read()
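
To make the difference between the two versions easier to see, here is a minimal sketch that pulls the range logic out into a standalone helper. The name build_range_kwarg and its None-for-empty convention are hypothetical and not part of s3.py. The helper only mirrors the proposed branching, so that when neither bound is given no Range key is sent at all, which appears to be the point of the suggestion (the old version passes Range="" in that case).

```python
from typing import Dict, Optional


def build_range_kwarg(
    start_byte: Optional[int] = None, end_byte: Optional[int] = None
) -> Optional[Dict[str, str]]:
    """Hypothetical helper mirroring the proposed _get_bytes branching.

    Returns the extra kwargs for get_object, or None when the requested
    range is empty (start_byte == end_byte) and no request is needed.
    """
    if start_byte is not None and end_byte is not None:
        if start_byte == end_byte:
            return None  # caller should return b"" without calling S3
        # HTTP byte ranges are inclusive, hence end_byte - 1
        return {"Range": f"bytes={start_byte}-{end_byte - 1}"}
    if start_byte is not None:
        return {"Range": f"bytes={start_byte}-"}
    if end_byte is not None:
        return {"Range": f"bytes=0-{end_byte - 1}"}
    # Neither bound given: omit the Range key entirely
    return {}
```

With such a helper, the call site would expand the result with ** into get_object, exactly as range_kwarg is expanded in the proposed version.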


sonarcloud bot commented May 9, 2024

@activesoull activesoull merged commit 2f22e5f into main May 9, 2024
7 of 10 checks passed
@activesoull activesoull deleted the s3_custom_config branch May 9, 2024 18:01

4 participants