duckdb sql read_...( ) functions not working #2241

Open
seandavi opened this issue Mar 9, 2024 · 1 comment

seandavi commented Mar 9, 2024

Version: fresh checkout of #2240 (0.76.0)

Config:

gateways:
    local:
        connection:
            type: duckdb
            database: db.db
            extensions:
                - httpfs
                - parquet
            connector_config:
                s3_endpoint: "https://storage.googleapis.com"
                s3_access_key_id: "***"
                s3_secret_access_key: "***"
default_gateway: local

model_defaults:
    dialect: duckdb
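
For reference, the connector_config above is presumably applied as DuckDB session settings roughly like the following (a minimal sketch; the exact mapping SQLMesh performs is an assumption on my part). Note that DuckDB's s3_endpoint setting normally expects a bare host without the https:// scheme, which may be related to the malformed URL in the error below:

INSTALL httpfs;
LOAD httpfs;
-- endpoint as a bare host (no scheme) is what httpfs usually expects
SET s3_endpoint = 'storage.googleapis.com';
SET s3_access_key_id = '***';
SET s3_secret_access_key = '***';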

Model:

MODEL (
  name sqlmesh_example.full_model,
  kind FULL,
);

SELECT *
FROM read_ndjson_auto('s3://omicidx-json/prefect-testing/geo/gpl-2007*')
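
To check whether this is specific to SQLMesh, the same query can be run directly in the DuckDB CLI with equivalent settings (a sketch using the same bucket path and placeholder credentials as above):

LOAD httpfs;
SET s3_endpoint = 'storage.googleapis.com';
SET s3_access_key_id = '***';
SET s3_secret_access_key = '***';
-- pull a few rows to confirm listing and reading from the bucket works
SELECT *
FROM read_ndjson_auto('s3://omicidx-json/prefect-testing/geo/gpl-2007*')
LIMIT 5;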

sqlmesh plan result:

New environment `prod` will be created from `prod`
Summary of differences against `prod`:
Models:
└── Added:
    ├── sqlmesh_example.incremental_model
    ├── sqlmesh_example.full_model
    └── sqlmesh_example.seed_model
Models needing backfill (missing dates):
├── sqlmesh_example.full_model: 2024-03-08 - 2024-03-08
├── sqlmesh_example.incremental_model: 2020-01-01 - 2024-03-08
└── sqlmesh_example.seed_model: 2024-03-08 - 2024-03-08
Apply - Backfill Tables [y/n]: y
Creating physical tables ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% • pending • 0:00:00
2024-03-08 20:51:08,087 - MainThread - sqlmesh.core.context - ERROR - Apply Failure: Traceback (most recent call last):
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/utils/concurrency.py", line 225, in sequential_apply_to_dag
    fn(node)
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/utils/concurrency.py", line 163, in <lambda>
    lambda s_id: fn(snapshots_by_id[s_id]),
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/snapshot/evaluator.py", line 274, in <lambda>
    lambda s: self._create_snapshot(s, snapshots, deployability_index, on_complete),
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/snapshot/evaluator.py", line 630, in _create_snapshot
    evaluation_strategy.create(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/snapshot/evaluator.py", line 1088, in create
    self.adapter.ctas(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/engine_adapter/shared.py", line 265, in internal_wrapper
    return func(*list_args, **kwargs)
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/engine_adapter/base.py", line 442, in ctas
    return self._create_table_from_source_queries(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/engine_adapter/base.py", line 629, in _create_table_from_source_queries
    self._create_table(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/engine_adapter/base.py", line 665, in _create_table
    self.execute(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/engine_adapter/base.py", line 1811, in execute
    self._execute(sql, **kwargs)
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/engine_adapter/base.py", line 1817, in _execute
    self.cursor.execute(sql, **kwargs)
duckdb.duckdb.IOException: IO Error: Connection error for HTTP GET to '//storage.googleapis.com/?encoding-type=url&list-type=2&prefix=prefect-testing%2Fgeo%2Fgpl-2007'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/context.py", line 1084, in apply
    self._scheduler.create_plan_evaluator(self).evaluate(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/plan/evaluator.py", line 104, in evaluate
    self._push(plan, deployability_index_for_creation)
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/plan/evaluator.py", line 188, in _push
    self.snapshot_evaluator.create(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/core/snapshot/evaluator.py", line 272, in create
    concurrent_apply_to_snapshots(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/utils/concurrency.py", line 161, in concurrent_apply_to_snapshots
    return concurrent_apply_to_dag(
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/utils/concurrency.py", line 196, in concurrent_apply_to_dag
    return sequential_apply_to_dag(dag, fn, raise_on_error)
  File "/Users/seandavis/Library/Caches/pypoetry/virtualenvs/sqlm-Glh99G37-py3.9/lib/python3.9/site-packages/sqlmesh/utils/concurrency.py", line 228, in sequential_apply_to_dag
    raise NodeExecutionFailedError(node) from ex
sqlmesh.utils.concurrency.NodeExecutionFailedError: Execution failed for node SnapshotId<"db"."sqlmesh_example"."full_model": 1768159110>
 (context.py:1091)
@izeigerman (Member)

Hey @seandavi! QQ: how do you authenticate with S3 in this case?
