[BUG] elementary does not appear to work if Unity Catalog not configured #676

Closed
hazvk opened this issue Mar 8, 2024 · 2 comments

hazvk commented Mar 8, 2024

Context

Recently we were given the option to enable Unity Catalog (UC) in our Databricks workspaces. I haven't enabled it in our workspace yet, so I am technically still using the hive_metastore. I started seeing this error; I'm not sure if it's correlated.

Note: as you have probably figured out by now, this is running on Databricks.

Steps to reproduce

  1. Set up Databricks SQL warehouse
  2. Follow steps here to: set up App Registration; set up profiles.yml in your dbt project
  3. Set up elementary as per normal
    1. packages.yml
    packages:
      - package: dbt-labs/dbt_utils
        version: 1.1.1
      - package: dbt-labs/dbt_external_tables
        version: 0.8.5
      - package: elementary-data/elementary
        version: 0.9.3
        ## Docs: https://docs.elementary-data.com
      - package: dbt-labs/audit_helper
        version: 0.9.0
    
    2. dbt_project.yml
    name: 'dbt_demo'
    version: '1.0.0'
    config-version: 2
    
    #Global Variables for the Project
    vars:
      unknown_key: -1
    
    # This setting configures which "profile" dbt uses for this project.
    profile: 'dbt_demo'
    
    # These configurations specify where dbt should look for different types of files.
    # The `model-paths` config, for example, states that models in this project can be
    # found in the "models/" directory. You probably won't need to change these!
    model-paths: ["dbt_demo/models"]
    analysis-paths: ["dbt_demo/analyses"]
    test-paths: ["dbt_demo/tests"]
    seed-paths: ["dbt_demo/seeds"]
    macro-paths: ["dbt_demo/macros"]
    snapshot-paths: ["dbt_demo/snapshots"]
    
    clean-targets:         # directories to be removed by `dbt clean`
      - "dbt_demo/target"
      - "dbt_demo/dbt_packages"
    
    
    # Configuring models
    # Full documentation: https://docs.getdbt.com/docs/configuring-models
    
    # In this example config, we tell dbt to build all models in the example/
    # directory as views. These settings can be overridden in the individual model
    # files using the `{{ config(...) }}` macro.
    models:
      dbt_demo:
        # Config indicated by + and applies to all files under models/example/
        some_project:
          +materialized: view
          +file_format: delta
          
          intermediate:
            some_model_type:
              schema: "some_schema"
          
          datamarts:
            +materialized: table
            some_datamart_model:
              schema: "some_mart_schema"
              location_root: "abfss://loc@loc.dfs.core.windows.net/loc/some_mart_schema"
      elementary:
        +schema: "test_stats"
    
    snapshots:
      dbt_demo:
        +target_schema: 'some_snapshot_schema'
    
    
  4. Set up the dbt project, install deps, etc.
  5. Run dbt build -f -s elementary

Expected outcome: runs successfully

Actual outcome

Most models run successfully, but one fails. Snippet of the output after the run:

01:34:56  Finished running 15 incremental models, 1 table model, 14 view models, 2 hooks in 0 hours 1 minutes and 33.93 seconds (93.93s).
01:34:56  
01:34:56  Completed with 1 error and 0 warnings:
01:34:56  
01:34:56    Runtime Error in model information_schema_columns (models/edr/dbt_artifacts/information_schema_columns.sql)
  [UC_NOT_ENABLED] Unity Catalog is not enabled on this cluster.
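
As a triage aid, the failing model and the Databricks error class can be pulled out of log output like the snippet above. This is a minimal Python sketch, not part of dbt or elementary; the function name `failed_models` and the regular expressions are assumptions based only on the log lines shown in this issue.

```python
import re

# Matches dbt's "Runtime Error in model <name> (<path>)" summary line.
MODEL_ERROR = re.compile(r"Runtime Error in model (?P<model>\S+) \((?P<path>[^)]+)\)")
# Matches a bracketed error class such as [UC_NOT_ENABLED] at the start of a line.
ERROR_CLASS = re.compile(r"^\s*\[(?P<error_class>[A-Z_]+)\]")


def failed_models(log_text):
    """Return (model, path, error_class) tuples found in dbt log output."""
    failures = []
    current = None  # most recent failing model still waiting for its error class
    for line in log_text.splitlines():
        model_match = MODEL_ERROR.search(line)
        if model_match:
            current = (model_match["model"], model_match["path"])
            continue
        class_match = ERROR_CLASS.match(line)
        if class_match and current is not None:
            failures.append((*current, class_match["error_class"]))
            current = None
    return failures


sample = """\
01:34:56  Completed with 1 error and 0 warnings:
01:34:56
01:34:56    Runtime Error in model information_schema_columns (models/edr/dbt_artifacts/information_schema_columns.sql)
  [UC_NOT_ENABLED] Unity Catalog is not enabled on this cluster.
"""

for model, path, error_class in failed_models(sample):
    print(f"{model}: {error_class} ({path})")
```

Grouping failures by error class makes it easier to compare runs across package versions, since different versions may fail in different models for the same underlying reason.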

System

  • OS: MacOS 14.3 (23D56) Sonoma
  • Python: 3.10.2
  • dbt: 1.7.9
  • dbt-databricks: 1.7.8
  • elementary: 0.14.1

Workaround

For now I'm just not using these models, but it would be nice if we could. I disabled the information_schema_columns model and its dependent model enriched_columns by adding this to dbt_project.yml:

  elementary:
    +schema: "elementary"
    # NOTE: disabling the below because they are empty tables but are erroring for projects that don't have unity catalog enabled
    edr:
        dbt_artifacts:
            information_schema_columns:
                enabled: false
            enriched_columns:
                enabled: false
Author

hazvk commented Mar 8, 2024

Additional context

  • This is easily solved by moving to UC. However, if hive_metastore is not supported, I believe this should be stated explicitly.
  • I also tested this with elementary 0.9.3, following the exact repro steps. That version used to work, but now it errors out too.
    It fails with a different error, but after some digging I am sure the two are related. See the log snippet below (different from the one reported above):
01:51:30  Finished running 14 incremental models, 14 view models, 1 table model, 2 hooks in 0 hours 1 minutes and 25.68 seconds (85.68s).
01:51:30  
01:51:30  Completed with 2 errors and 0 warnings:
01:51:30  
01:51:30    Runtime Error in model filtered_information_schema_columns (models/edr/metadata_store/filtered_information_schema_columns.sql)
  [TABLE_OR_VIEW_NOT_FOUND] The table or view `hive_metastore`.`INFORMATION_SCHEMA`.`COLUMNS` cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS.; line 25 pos 13
01:51:30  
01:51:30    Runtime Error in model dbt_columns (models/edr/dbt_artifacts/dbt_columns.sql)
  [TABLE_OR_VIEW_NOT_FOUND] The table or view `hive_metastore`.`INFORMATION_SCHEMA`.`COLUMNS` cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS.; line 72 pos 13

Note: no object named hive_metastore.INFORMATION_SCHEMA.COLUMNS exists. It looks like the query assumes that hive_metastore is a UC catalog.
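
To illustrate the kind of guard that would avoid this, here is a minimal Python sketch. It is not elementary's actual logic; the helper name `columns_metadata_query`, the `LEGACY_CATALOGS` set, and the catalog name `main` are hypothetical. It assumes, as the errors above suggest, that the hive_metastore catalog has no INFORMATION_SCHEMA while UC catalogs do.

```python
# Assumption: "hive_metastore" is the name of the legacy (non-UC) catalog,
# which is what the error messages in this issue show.
LEGACY_CATALOGS = {"hive_metastore"}


def columns_metadata_query(catalog, schema):
    """Build a column-listing query, or return None for a legacy catalog.

    Returning None signals that the caller should skip the model (or use
    another metadata source) instead of querying a relation that does not exist.
    """
    if catalog.lower() in LEGACY_CATALOGS:
        return None
    # Plain string formatting is used for illustration only; real code
    # should quote or validate identifiers and literals properly.
    return (
        "SELECT table_name, column_name, data_type "
        f"FROM `{catalog}`.information_schema.columns "
        f"WHERE table_schema = '{schema}'"
    )


print(columns_metadata_query("hive_metastore", "some_schema"))
print(columns_metadata_query("main", "some_schema"))
```

With a check like this, a project on hive_metastore would skip the column metadata step rather than fail with TABLE_OR_VIEW_NOT_FOUND.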

@haritamar
Collaborator

Hi @hazvk!
Thanks for opening this issue, and sorry that we are only replying now.
This is strange, since we have routine tests running on both modes of Databricks.

In any case, in the most recent version of the Elementary dbt package we completely removed the information_schema_columns and enriched_columns models, so I believe this issue is no longer relevant.

If you still encounter it, please feel free to open a new issue in the main Elementary repo. We are trying to concentrate issues there so that they are easier to manage and prioritize.

Thank you,
Itamar
