bug: meltano state list fails for the Google Cloud Storage backend if there are files at the root of the bucket #8425

Open
edgarrmondragon opened this issue Feb 26, 2024 · 0 comments


edgarrmondragon commented Feb 26, 2024

Meltano Version

3.3.1

Python Version

NA

Bug scope

CLI (options, error messages, logging, etc.)

Operating System

NA

Description

For example, if the contents of a bucket look like this

my-bucket
├── my_file.txt
└── state
    ├── state-id-1
    │   └── state.json
    └── state-id-2
        └── state.json

then meltano state list with state_backend.uri set to gs://my-bucket/state will fail with a ValueError: not enough values to unpack in

(state_id, filename) = filepath.split("/")[-2:]

because blobs are listed from the root of the bucket, without considering the configured prefix, so a root-level blob such as my_file.txt splits into a single element:

for blob in self.client.list_blobs(bucket_or_name=self.bucket): # noqa: WPS526
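The failure can be reproduced without GCS at all. The following is a minimal sketch that only mirrors the unpacking line quoted above; the blob names come from the example bucket, and split_state_path is a hypothetical helper name, not a Meltano function.

```python
# Blob names as they would be listed from the root of the example bucket.
blob_names = [
    "my_file.txt",
    "state/state-id-1/state.json",
    "state/state-id-2/state.json",
]


def split_state_path(filepath):
    """Hypothetical helper that mirrors the unpacking quoted in the issue."""
    (state_id, filename) = filepath.split("/")[-2:]
    return state_id, filename


results = {}
for name in blob_names:
    try:
        results[name] = split_state_path(name)
    except ValueError as exc:
        # "my_file.txt" has no "/", so the slice holds a single element.
        results[name] = exc
```

Only the root-level file raises; the two blobs under state/ unpack cleanly.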


The fix probably involves two things:

  1. Skipping the blob path if it doesn't contain a /
  2. Passing prefix=... to list_blobs

We might also want to ensure the prefix doesn't have a leading slash.
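A rough sketch of how those pieces might fit together, not the actual Meltano implementation. FakeBlob, FakeClient and list_state_ids are hypothetical names introduced for illustration; the fake client only imitates the bucket_or_name and prefix arguments of list_blobs, and the state.json filename is assumed from the bucket layout above.

```python
from dataclasses import dataclass


@dataclass
class FakeBlob:
    name: str


class FakeClient:
    """Hypothetical stand-in for the GCS client, for illustration only."""

    def __init__(self, names):
        self._names = names

    def list_blobs(self, bucket_or_name, prefix=None):
        # Imitates server-side prefix filtering.
        return [FakeBlob(n) for n in self._names if n.startswith(prefix or "")]


def list_state_ids(client, bucket, prefix):
    """Return the state IDs found under the prefix (hypothetical helper)."""
    # Normalize the prefix: no leading slash, exactly one trailing slash.
    prefix = prefix.strip("/")
    list_prefix = f"{prefix}/" if prefix else ""

    state_ids = set()
    for blob in client.list_blobs(bucket_or_name=bucket, prefix=list_prefix):
        relative = blob.name[len(list_prefix):]
        parts = relative.split("/")
        if len(parts) < 2:
            # Not nested under a state ID directory, so skip it.
            continue
        state_id, filename = parts[-2:]
        if filename == "state.json":  # assumed filename, per the example tree
            state_ids.add(state_id)
    return sorted(state_ids)


client = FakeClient([
    "my_file.txt",
    "state/state-id-1/state.json",
    "state/state-id-2/state.json",
])
ids_with_prefix = list_state_ids(client, "my-bucket", "/state")
ids_at_root = list_state_ids(client, "my-bucket", "")
```

Appending the trailing slash to the listing prefix keeps a sibling path such as state-other/ from matching, and the length check covers the case where no prefix is configured and root-level files are still listed.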

PS: https://github.com/dagster-io/dagster/blob/0759886a3a66ad9e5898f7da270056a80602e66c/python_modules/libraries/dagster-gcp/dagster_gcp/gcs/gcs_fake_resource.py#L56 looks like a good example of a dummy client implementation to use for testing.

Code

No response

@edgarrmondragon edgarrmondragon changed the title bug: meltano state list fails for the Google Cloud Storage backend if there's files at the root of the bucket bug: meltano state list fails for the Google Cloud Storage backend if there are files at the root of the bucket May 21, 2024