Question: What is the current state of data lineage? #1099

Closed
jsnowacki opened this issue May 15, 2021 · 7 comments

Comments

@jsnowacki
Contributor

I went through the current information about data lineage, and it is hard to grasp the current state of the feature. The roadmap vaguely indicates that the feature is ongoing. Some issues, like #163, are closed; others, like #69, are still open. The documentation mentions lineage in several places, mostly in connection with Apache Atlas, but gives no clear information.

My current understanding is that development, especially of the approach based on a native graph database, is still ongoing.

My main questions are:

  1. When will lineage be usable/ready?
  2. Can we set it up and test it out now?
  3. Which metadata backend/proxy do we need to use to get lineage working? (It is not clear whether the current Gremlin-based backends are general or whether some features are technology-specific, e.g. Neo4j only.)
  4. Is Atlas lineage usable/visible in the frontend or not?
@verdan
Member

verdan commented May 15, 2021

Hi @jsnowacki, sorry for all the confusion. We usually update the community via Slack and through the community meetings. The quickest way to get a response is indeed Slack 🙂
To answer your questions:

When will lineage be usable/ready?

The list-based view of lineage is already available in Amundsen and is being used by a few companies.
This is where you enable it in the frontend: frontend config
On the backend, you can plug in your own third-party lineage tool via Metadata, or you can use databuilder to ingest your own lineage into Neo4j. More details about the implementation are in this commit

Can we set it up and test it out now?

Again, you can already use the list-based lineage in production systems. The graph-based view is currently in progress, but please feel free to check out this branch: verdan/lineage-graph-v1.

Which metadata backend/proxy do we need to use to get lineage working? (It is not clear whether the current Gremlin-based backends are general or whether some features are technology-specific, e.g. Neo4j only.)

For now, Amundsen supports native lineage for Neo4j; Atlas support is a WIP (this branch: feat/atlas_lineage).

Is Atlas lineage usable/visible in the frontend or not?

This is a work in progress (branch: feat/atlas_lineage). @mgorsk1 should be able to respond more accurately about the current state of this implementation.
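
To illustrate the difference between the list-based and graph-based lineage views discussed above, here is a minimal sketch in plain Python. It is not Amundsen code and does not use any Amundsen, databuilder, Neo4j, or Atlas API; the table names, the `DOWNSTREAM` mapping, and all function names are hypothetical and chosen only for illustration. The assumption is that a list-based view shows the direct upstream/downstream neighbours of a table, while a graph-based view walks the whole dependency chain.

```python
from collections import deque

# Hypothetical lineage edges: each key is a table, each value lists the
# tables that are built directly from it (its direct downstream tables).
DOWNSTREAM = {
    "raw.events": ["staging.events"],
    "staging.events": ["mart.daily_events", "mart.user_sessions"],
    "mart.daily_events": [],
    "mart.user_sessions": [],
}


def direct_downstream(table):
    """List-style view: only the immediate downstream tables."""
    return list(DOWNSTREAM.get(table, []))


def direct_upstream(table):
    """List-style view: only the immediate upstream tables."""
    return [src for src, targets in DOWNSTREAM.items() if table in targets]


def all_downstream(table):
    """Graph-style view: breadth-first walk over every reachable
    downstream table, returned as (table, depth) pairs."""
    seen = {table}
    queue = deque([(table, 0)])
    result = []
    while queue:
        node, depth = queue.popleft()
        for child in DOWNSTREAM.get(node, []):
            if child not in seen:
                seen.add(child)
                result.append((child, depth + 1))
                queue.append((child, depth + 1))
    return result
```

In this sketch, `direct_downstream("raw.events")` returns only `staging.events`, whereas `all_downstream("raw.events")` also reaches the two `mart` tables one level further down.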

@mgorsk1
Contributor

mgorsk1 commented May 16, 2021

@jsnowacki as for Atlas lineage, the implementation is already finished and available on the feat/atlas_lineage branch if you want to take a look. I have one outstanding PR (Atlas dashboard support) to merge, and this will be next in line.

@jsnowacki
Contributor Author

@verdan and @mgorsk1 thanks a lot for the answers! I'll look into it and come back to you if there are any more questions.

@verdan verdan pinned this issue May 18, 2021
@stale

stale bot commented Jun 2, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Jun 2, 2021
@verdan
Member

verdan commented Jun 16, 2021

@jsnowacki please feel free to reopen if you have any further questions here.

@stale stale bot removed the stale label Jun 16, 2021
@verdan verdan closed this as completed Jun 16, 2021
@tooptoop4

@verdan where is the lineage-graph-v1 branch?

@cheji02

cheji02 commented Mar 22, 2022

@mgorsk1, has the feat/atlas_lineage branch been merged into the main branch? Also, are there good instructions for a custom build/deploy (with configuration changes)? Thanks!

5 participants