Running simple SELECT over S3 attached DuckDB db segfaults #11919

Open
2 tasks done
dforsber opened this issue May 3, 2024 · 3 comments

Comments

@dforsber

dforsber commented May 3, 2024

What happens?

v0.10.2 1601d94f94
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D ATTACH 's3://boilingdata-demo/demo.duckdb' AS demos3 (READ_ONLY);
HTTP Error: HTTP GET error on 'https://boilingdata-demo.s3.amazonaws.com/demo.duckdb' (HTTP 403)
D CREATE SECRET (
      TYPE S3,
      PROVIDER CREDENTIAL_CHAIN
  );
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ true    │
└─────────┘
D ATTACH 's3://boilingdata-demo/demo.duckdb' AS demos3 (READ_ONLY);
D use demos3;
D show tables;
┌─────────┐
│  name   │
│ varchar │
├─────────┤
│ demo    │
└─────────┘
D select * from demo LIMIT 10;
[1]    39738 segmentation fault  duckdb

To Reproduce

See above. The demo.duckdb file was created with the same DuckDB version and then uploaded to S3.

duckdb demo.duckdb -s "CREATE TABLE demo AS SELECT * FROM parquet_scan('s3://boilingdata-demo/demo.parquet');"
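The setup can be sketched as a small script. The report only says the file was uploaded to S3, so the `aws s3 cp` step is an assumption about how that was done, and the helper names `run` and `setup_demo` are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of the setup: build demo.duckdb locally, then upload it to the bucket.
# ASSUMPTION: the upload uses the AWS CLI; the report does not say which tool was used.

# Print commands by default; run them only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: the two setup steps in order.
setup_demo() {
  # Step 1 (from the report): create the database file from the Parquet file on S3.
  run duckdb demo.duckdb -s "CREATE TABLE demo AS SELECT * FROM parquet_scan('s3://boilingdata-demo/demo.parquet');"
  # Step 2 (assumed): copy the database file to the same bucket.
  run aws s3 cp demo.duckdb s3://boilingdata-demo/demo.duckdb
}
```

Calling `setup_demo` in dry-run mode lists both commands without touching S3.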

OS:

OSX

DuckDB Version:

0.10.2

DuckDB Client:

command line

Full Name:

Dan Forsberg

Affiliation:

BoilingData

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have not tested with any build

Did you include all relevant data sets for reproducing the issue?

No - Other reason (please specify in the issue body)

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
@dforsber dforsber changed the title Running simple SELECT over S3 attached DuckDB db segfaults Running simple SELECT over S3 attached DuckDB db segfaults May 3, 2024
@MPizzotti

This is probably related to the issue reported here (issue 11845).

@szarnyasg
Collaborator

Hi @dforsber, thanks for raising this issue. The S3 bucket is set to private, so I get a 403 error - can you please share the demo.duckdb file with us?

@dforsber
Author

dforsber commented May 4, 2024

You can download the file from here (zstd compressed):
https://isecure.fi/demo.duckdb.zst

I found more information about this issue. It seems to happen only when I first rely on the S3 credentials already set in ~/.duckdbrc (see below): that ATTACH fails with HTTP 403. If I then run CREATE SECRET, the second ATTACH succeeds, but reading from the table segfaults.

If I run CREATE SECRET first, before any ATTACH attempt, it works.

-- Loading resources from /Users/dforsber/.duckdbrc
v0.10.2 1601d94f94
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D ATTACH 's3://boilingdata-demo/demo.duckdb' AS demos3 (READ_ONLY);
HTTP Error: HTTP GET error on 'https://boilingdata-demo.s3.amazonaws.com/demo.duckdb' (HTTP 403)
D CREATE SECRET (
      TYPE S3,
      PROVIDER CREDENTIAL_CHAIN
  );
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ true    │
└─────────┘
D ATTACH 's3://boilingdata-demo/demo.duckdb' AS demos3 (READ_ONLY);
D use demos3;
D SELECT * FROM demo LIMIT 10;
[1]    55288 segmentation fault  duckdb
dforsber@MacBook-Pro ➜  duckdb
-- Loading resources from /Users/dforsber/.duckdbrc
v0.10.2 1601d94f94
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D CREATE SECRET (
      TYPE S3,
      PROVIDER CREDENTIAL_CHAIN
  );
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ true    │
└─────────┘
D ATTACH 's3://boilingdata-demo/demo.duckdb' AS demos3 (READ_ONLY);
D use demos3;
D SELECT * FROM demo LIMIT 10;
100% ▕████████████████████████████████████████████████████████████▏ 
┌──────────┬──────────────────────────┬──────────────────────────┬─────────────────┬───────────────┬────────────┬────────────────────┬──────────────┬──────────────┬──────────────┬─────────────┬────────┬─────────┬────────────┬──────────────┬───────────────────────┬──────────────┬──────────────────────┐
│ VendorID │   tpep_pickup_datetime   │  tpep_dropoff_datetime   │ passenger_count │ trip_distance │ RatecodeID │ store_and_fwd_flag │ PULocationID │ DOLocationID │ payment_type │ fare_amount │ extra  │ mta_tax │ tip_amount │ tolls_amount │ improvement_surcharge │ total_amount │ congestion_surcharge │
│  int32   │ timestamp with time zone │ timestamp with time zone │      int32      │    double     │   int32    │      varchar       │    int32     │    int32     │    int32     │   double    │ double │ double  │   double   │    double    │        double         │    double    │        double        │
├──────────┼──────────────────────────┼──────────────────────────┼─────────────────┼───────────────┼────────────┼────────────────────┼──────────────┼──────────────┼──────────────┼─────────────┼────────┼─────────┼────────────┼──────────────┼───────────────────────┼──────────────┼──────────────────────┤
│        1 │ 2019-03-01 02:00:00+02   │ 2019-03-01 02:07:54+02   │               1 │           1.6 │          1 │ N                  │          113 │          137 │            1 │         7.5 │    3.0 │     0.5 │       2.25 │          0.0 │                   0.3 │        13.55 │                  2.5 │
│        1 │ 2019-03-01 02:00:00+02   │ 2019-03-01 02:14:44+02   │               2 │           3.0 │          1 │ N                  │          186 │          140 │            1 │        12.5 │    3.0 │     0.5 │       3.25 │          0.0 │                   0.3 │        19.55 │                  2.5 │
│        1 │ 2019-03-01 02:00:01+02   │ 2019-03-01 02:05:49+02   │               1 │           2.2 │          1 │ N                  │           87 │           33 │            1 │         8.0 │    3.0 │     0.5 │       2.35 │          0.0 │                   0.3 │        14.15 │                  2.5 │
│        1 │ 2019-03-01 02:00:02+02   │ 2019-03-01 02:03:19+02   │               2 │           0.7 │          1 │ N                  │          249 │           79 │            1 │         4.5 │    3.0 │     0.5 │       1.65 │          0.0 │                   0.3 │         9.95 │                  2.5 │
│        1 │ 2019-03-01 02:00:03+02   │ 2019-03-01 02:03:09+02   │               1 │           0.6 │          1 │ N                  │          166 │          151 │            2 │         4.5 │    0.5 │     0.5 │        0.0 │          0.0 │                   0.3 │          5.8 │                  0.0 │
│        1 │ 2019-03-01 02:00:03+02   │ 2019-03-01 02:15:39+02   │               1 │           4.2 │          1 │ N                  │           13 │          100 │            1 │        15.5 │    3.0 │     0.5 │       3.85 │          0.0 │                   0.3 │        23.15 │                  2.5 │
│        1 │ 2019-03-01 02:00:03+02   │ 2019-03-01 02:43:11+02   │               1 │           9.9 │          1 │ N                  │           90 │          226 │            1 │        36.5 │    3.0 │     0.5 │        1.0 │          0.0 │                   0.3 │         41.3 │                  2.5 │
│        1 │ 2019-03-01 02:00:05+02   │ 2019-03-01 02:03:15+02   │               1 │           0.6 │          1 │ N                  │          186 │           90 │            1 │         4.5 │    3.0 │     0.5 │        1.0 │          0.0 │                   0.3 │          9.3 │                  2.5 │
│        1 │ 2019-03-01 02:00:05+02   │ 2019-03-01 02:07:20+02   │               1 │           1.7 │          1 │ N                  │           48 │          239 │            1 │         7.5 │    3.0 │     0.5 │        1.0 │          0.0 │                   0.3 │         12.3 │                  2.5 │
│        1 │ 2019-03-01 02:00:05+02   │ 2019-03-01 02:24:01+02   │               1 │           3.3 │          1 │ N                  │          113 │          256 │            1 │        17.5 │    3.0 │     0.5 │       4.25 │          0.0 │                   0.3 │        25.55 │                  2.5 │
├──────────┴──────────────────────────┴──────────────────────────┴─────────────────┴───────────────┴────────────┴────────────────────┴──────────────┴──────────────┴──────────────┴─────────────┴────────┴─────────┴────────────┴──────────────┴───────────────────────┴──────────────┴──────────────────────┤
│ 10 rows                                                                                                                                                                                                                                                                                         18 columns │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
D 

I also have some settings already in place in ~/.duckdbrc:

% cat ~/.duckdbrc
SET enable_http_metadata_cache = true;
SET enable_object_cache = true;
SET s3_access_key_id='...';
SET s3_secret_access_key='...';
SET s3_region='eu-west-1';
...
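
The two orderings from the sessions above can be captured in a short script for anyone trying to reproduce this. It is a minimal sketch: the function name `repro_sql` and its mode arguments are hypothetical, and it assumes the same bucket, an AWS credential chain that can read it, and a ~/.duckdbrc like the one shown. Whether a piped, non-interactive run crashes the same way as the interactive sessions has not been verified.

```shell
#!/bin/sh
# Emit the SQL statements for one of the two sessions shown above.
#   attach-first : ATTACH before CREATE SECRET (the order reported to segfault)
#   secret-first : CREATE SECRET before any ATTACH (the order reported to work)
repro_sql() {
  if [ "$1" = "attach-first" ]; then
    # In the report this first ATTACH fails with HTTP 403.
    echo "ATTACH 's3://boilingdata-demo/demo.duckdb' AS demos3 (READ_ONLY);"
  fi
  echo "CREATE SECRET (TYPE S3, PROVIDER CREDENTIAL_CHAIN);"
  echo "ATTACH 's3://boilingdata-demo/demo.duckdb' AS demos3 (READ_ONLY);"
  echo "USE demos3;"
  echo "SELECT * FROM demo LIMIT 10;"
}

# Example (not run here): repro_sql attach-first | duckdb
```

Comparing `repro_sql attach-first | duckdb` with `repro_sql secret-first | duckdb` keeps the only difference between the runs to the initial failed ATTACH.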
