Lineage is not published to Purview - START events missing environment-properties #236

Open
gerson23 opened this issue Apr 4, 2024 · 0 comments
Labels
bug Something isn't working

gerson23 commented Apr 4, 2024

Describe the bug
START events are skipped in the OpenLineageIn function because they are missing the environment-properties field. As a result, the corresponding COMPLETE events are not processed by the PurviewOut function, so no lineage is published to Purview.

However, as discussed in OpenLineage/OpenLineage#2203, this can be considered a valid scenario: the OL model is cumulative, so a subsequent RUNNING event is expected to carry the environment-properties information that the START event lacks.

Expected behavior
START events should be accepted even when they are missing the environment-properties field. RUNNING events should be accepted as well and, when they carry environment-properties, used to fill in that information. A COMPLETE event can then be processed properly and the lineage published to Purview.
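
To illustrate the expected behavior, here is a minimal Python sketch of cumulative event handling. It is not the code of the OpenLineageIn or PurviewOut functions. The class name `RunEventAccumulator`, the in-memory dictionary, and the assumption that environment-properties arrives as a run facet under `run.facets` are all hypothetical and only meant to show the idea: keep every event for a run, merge its facets, and only decide at COMPLETE time whether environment-properties is available.

```python
from typing import Any, Dict, Optional

# Assumed facet key; the issue only names the field "environment-properties".
ENV_FACET = "environment-properties"


class RunEventAccumulator:
    """Hypothetical helper: merges run facets per runId across events."""

    def __init__(self) -> None:
        # runId -> facets collected so far (in-memory only, for illustration)
        self._runs: Dict[str, Dict[str, Any]] = {}

    def accept(self, event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        """Store the event's facets; return a merged event on COMPLETE.

        Returns None for START/RUNNING events, for events without a runId,
        and for COMPLETE events whose run never provided ENV_FACET.
        """
        run = event.get("run") or {}
        run_id = run.get("runId")
        if run_id is None:
            return None

        # Accept every event type; later facets overwrite earlier ones.
        facets = self._runs.setdefault(run_id, {})
        facets.update(run.get("facets") or {})

        if str(event.get("eventType", "")).upper() != "COMPLETE":
            return None

        # The run is finished, so its accumulated state can be released.
        merged = self._runs.pop(run_id)
        if ENV_FACET not in merged:
            return None
        return {**event, "run": {**run, "facets": merged}}
```

With this approach a START event without environment-properties is kept rather than skipped, and a later RUNNING event can supply the missing facet before COMPLETE is handled. A real deployment would need durable storage instead of a dictionary, since the events may be handled by different function invocations.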

Environment

  • OpenLineage Version: OL 1.5.0+ (I changed the parameter for the newer version)
  • Databricks Runtime Version: 14.3
  • Cluster Type: Job

gerson23 added the bug label Apr 4, 2024