Feature request: Support Python UDF From DBT SQL Model #603

Open
case-k-git opened this issue Mar 2, 2024 · 2 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

case-k-git commented Mar 2, 2024

Describe the feature

A clear and concise description of what you want to happen.

Being able to use a Spark UDF from dbt would be helpful.

As discussed in dbt-spark, something like a pre_hook that registers the UDF would be helpful.
dbt-labs/dbt-spark#135 (comment)

{{ config(
pre_hook=["
def custom_df(input):
    output = input  # do some logic here
    return output

spark.udf.register('custom_df', custom_df)
"]
) }}

select custom_df( x ) from {{ ref('my_table') }}

or

{{ config( 
pre_hook=['dbfs:/scripts/init_functions.py']
) }}

select custom_df( x ) from {{ ref('my_table') }}
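
For illustration, here is a minimal sketch of what a script such as `init_functions.py` could contain. The function body, the `register_udfs` helper name, and the string return type are assumptions made for this example; dbt does not currently run such a script from a `pre_hook`, which is what this request is about.

```python
def custom_df(value):
    """Hypothetical logic: trim and upper-case a string value."""
    if value is None:
        return None
    return value.strip().upper()


def register_udfs(spark):
    """Register the function on an existing SparkSession so SQL can call it."""
    # Imported lazily so the pure-Python logic above can be tested without Spark.
    from pyspark.sql.types import StringType

    spark.udf.register("custom_df", custom_df, StringType())
```

Keeping the logic in a plain function and the registration in a separate helper means the logic can be unit tested without a cluster.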

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Using dbt Python models. Ideally, though, we want to call the UDF from a dbt SQL model.
https://docs.getdbt.com/docs/build/python-models

(Advanced) Use dbt Python models in a workflow

https://docs.databricks.com/en/workflows/jobs/how-to/use-dbt-in-workflows.html#advanced-use-dbt-python-models-in-a-workflow

Additional context

Please include any other relevant context here.

Running dbt in production with Python UDFs
https://www.explorium.ai/blog/news-and-updates/running-dbt-production-python-udfs/

Who will this benefit?

What kind of use case will this feature be useful for? Please be specific and provide examples, this will help us prioritize properly.

Anyone who wants to use UDFs. Our company is migrating from Oracle PL/SQL to Databricks, and some functions would be easier to migrate if we could use UDFs.
https://www.databricks.com/blog/how-migrate-your-oracle-plsql-code-databricks-lakehouse-platform

Are you interested in contributing this feature?

Let us know if you want to write some code, and how we can help.

Yes

case-k-git added the enhancement (New feature or request) label on Mar 2, 2024
case-k-git changed the title from "Feature request: Register And Use Python UDF" to "Feature request: Support Python UDF From DBT SQL Model" on Mar 2, 2024

case-k-git commented Mar 4, 2024

Ah, maybe we can use a Databricks UDF from a macro?

Create UDF
Note: the UDF can also be registered outside of dbt.
https://docs.databricks.com/en/udf/unity-catalog.html#register-a-python-udf-to-uc

{% macro greet() %}
  {# Creates the Python UDF in Unity Catalog; OR REPLACE keeps reruns from failing. #}
  CREATE OR REPLACE FUNCTION target_catalog.target_schema.greet(s STRING)
  RETURNS STRING
  LANGUAGE PYTHON
  AS $$
    return f"Hello, {s}"
  $$
{% endmacro %}
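
Since the text between the `$$` markers is an ordinary Python function body, the logic can be checked locally before the function is created in Unity Catalog. A minimal sketch, assuming the same body wrapped in a local function (the local wrapper is only for illustration and is not part of the macro):

```python
def greet(s: str) -> str:
    # Same body as the UDF defined between the $$ markers.
    return f"Hello, {s}"


print(greet("dbt"))
```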

Read UDF
Add a macro that calls the UDF.

{% macro get_greet(a) %}
  target_catalog.target_schema.greet({{ a }})
{% endmacro %}

Use UDF
Use the UDF from a dbt SQL model.

-- The argument is inserted verbatim, so the string literal carries its own SQL quotes.
SELECT
  {{
    get_greet(
      a="'Jone'"
    )
  }} AS amount
FROM {{ ref('table') }}
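
One detail worth keeping in mind with this approach: a macro argument is pasted into the compiled SQL as-is, so a string literal needs its own SQL quotes. The sketch below mimics that substitution in plain Python. It is not dbt's renderer, and the `render_get_greet` helper and the `target_catalog.target_schema` prefix are assumptions made for the example.

```python
def render_get_greet(a: str) -> str:
    # Mimics the macro body: the argument is substituted without any quoting.
    return f"target_catalog.target_schema.greet({a})"


# Without SQL quotes the argument reads as a column reference named Jone.
print(render_get_greet("Jone"))
# With SQL quotes it reads as the string literal 'Jone'.
print(render_get_greet("'Jone'"))
```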

benc-db added the help wanted (Extra attention is needed) label on Mar 4, 2024
@Nintorac

This might already be possible, though I'm not sure.

Here is how I've done it with DuckDB:

I define a plugin here https://github.com/Nintorac/s4_dx7/blob/main/s4_dx7/udf/__init__.py
and then load it in the profile here https://github.com/Nintorac/s4_dx7/blob/main/s4_dx7_dbt/profiles.yml#L13
And finally can use it eg here https://github.com/Nintorac/s4_dx7/blob/main/s4_dx7_dbt/models/dx7_voices.sql#L3C25-L3C39

What I'm not clear on is:

  • how to ship the environment to the Databricks cluster
  • whether the plugins are a dbt-duckdb feature or a dbt-core feature
