
Feature request: Create datasets from dbt models if not present in Superset #42

Open
rohitsanj opened this issue Feb 20, 2024 · 0 comments


Hi! I'd like to start a conversation about having this library create a dataset in Superset when one does not already exist, either as part of the push_descriptions command or as an entirely new command, say, create_datasets.

Automatically creating datasets would address two use cases:

  1. Ensuring any new dbt models are synced into Superset without having to explicitly create them in Superset itself.
  2. Syncing existing dbt models into a freshly provisioned Superset instance, which again reduces the effort of creating the corresponding datasets in Superset (via a separate script or manual actions in the Superset UI).

This PR (rohitsanj#1) against my own fork of this repo introduces a new flag, create_dataset_if_not_exists, to the push_descriptions command. I've also added a new folder called dbt_schemas containing the dbt manifest JSON schema and the pydantic models generated from that schema. These are used to parse the dbt manifest.json file, which gives helpful type hints during development and automatic data validation at runtime.
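
To make the idea concrete, here is a rough sketch of the "create if not exists" step. It is not the code from the PR. It works on a plain dict loaded from manifest.json rather than the pydantic models, and the function names (models_from_manifest, missing_dataset_payloads) are made up for illustration. It assumes the set of existing datasets has already been fetched from Superset as (schema, table name) pairs, and that the payload keys (database, schema, table_name) match what Superset's dataset creation endpoint expects, which should be checked against the Superset version in use.

```python
from typing import Any


def models_from_manifest(manifest: dict[str, Any]) -> set[tuple[str, str]]:
    """Collect (schema, table name) pairs for every dbt model in a parsed manifest.json."""
    pairs: set[tuple[str, str]] = set()
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots and other node types
        # dbt builds the relation under the alias when one is set, else the model name.
        table = node.get("alias") or node["name"]
        pairs.add((node["schema"], table))
    return pairs


def missing_dataset_payloads(
    manifest: dict[str, Any],
    existing: set[tuple[str, str]],
    database_id: int,
) -> list[dict[str, Any]]:
    """Build one creation payload per dbt model that has no dataset yet.

    `existing` holds the (schema, table name) pairs already present in Superset.
    The payload keys are an assumption about the dataset creation endpoint.
    """
    missing = sorted(models_from_manifest(manifest) - existing)
    return [
        {"database": database_id, "schema": schema, "table_name": table}
        for schema, table in missing
    ]
```

Each returned payload would then be posted to Superset by whatever client the library already uses. Keeping the diffing separate from the HTTP calls means the same logic could back either a flag on push_descriptions or a standalone create_datasets command.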

Would love to know the community's thoughts on this and if others have come across the requirement for such a feature. Thanks!
