New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement a Reusable E2E Kubeflow ML Lifecycle #3728
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a typo, should be Data Producers
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! @franciscojavierarceo I am wondering, should we add the Data Producers to the Offline Feature store as well?
E.g. Spark ingest data from Data Producers and extract features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also may be useful to add a Feature Extraction
to the Offline Store to make it concrete how the offline store is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! @franciscojavierarceo I am wondering, should we add the Data Producers to the Offline Feature store as well?
Yeah, I think that's a great idea! That can get complicated if we're to get specific but I think if we're just generic and create a box like we do for the online store that works fine.
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
@andreyvelich What exactly are you trying to accomplish? I didn't fully get this part
Is this diagram supposed to be re-used by each component, and if so, how do you envision that? |
That's right. Please check these examples: We can do the same for Model Registry, Spark Operator, Notebooks if other WGs agree with that. What do you think about it @StefanoFioravanzo ? |
Oh Ok now I understand your approach, I like this. You are proposing we build a canonical Kubeflow ML lifecycle diagram and then highlight what parts of the diagram each component covers. So, based on this, I propose two things:
If you want to keep the focus smaller and have a quicker iteration on the existing diagram, I am fine with it and you can ignore the two points above. |
cc @chasecadet can probably provide some good insight on this |
@andreyvelich a very good open source diagram that we can reuse is this one by the AI Infrastructure Alliance. See here https://github.com/ai-infrastructure-alliance/blueprints There is no explicit license, by the do write in the README:
I think this would be a pretty good starting point for a reusable diagram. They have an editable figma file, and even an interactive version. Take a look at all the folders, there's various versions. We could fork the repository under the Kubeflow org and adapt it to the various component. If we want we could embed the interactive diagram in our website. If we are unsure about licensing and reusability of that content, I can reach out to a couple of folks at AIIA. |
I can see us doing something similar to this interactive version https://ai-infrastructure-alliance.github.io/blueprints/interactive-stack-diagram/stack.html where each option is one of the Kubeflow components. So you can see how the entire Kubeflow platform (we can have a "all" picker) covers the E2E ML lifecycle or based on |
That makes sense, renamed it. |
To be honest, I have concerns with existing diagram, since it was implemented ~ 5 years ago which is very out-of-date. E.g. it doesn't include model fine-tuning which is the modern approach for model development, and it doesn't have online feature store. WDYT @StefanoFioravanzo @franciscojavierarceo ? |
I like there diagrams, but it looks similar to what we have in this PR, isn't ? E.g. the differences:
Maybe we can improve our diagram with additional stages ? |
I agree the old diagram is outdated. I am much more preferential to a diagram that reflects the view of a Data Scientist and the needs in their workflow, which the diagram you proposed does. The AI Infrastructure Aliiance I think highlights things in a way that highlights the needs for different companies with different structure and, while that's helpful, I don't think that elicits clarity on the value of Kubeflow. |
@StefanoFioravanzo finally getting to this! Before I say too much I'd like to take a step back because as we allll know "tactics without vision is just noise before defeat". I like the idea of an ML diagram. I would love to know what our vision for these documents is and how we are approaching this. Someone reads the diagram they learn X and then start building using Y and deliver Z value to their project/org. Allow me to free associate here a bit on what I think would be interesting. I like the idea of talking about use cases for specific components, but I struggle with the idea of telling users what to do. I want to help them envision using these tools and enable them to creatively solve solutions. Another way to say this is I would love if the users told us what they use these components for in collaboration with our vision for these components. We as a community can provide guidance. If we act as a ground truth authority on use cases we might lose out on the value of new community members using the tools in powerful but unexpected ways we can later integrate into more robust use cases. Questions I'd love to have answers to are:
We can touch on trying to say use KFP without a training operator to attempt to run an XGBOOST job vs using and integrating the training operator to show that you "can" do things in MANY ways but may lose out on overall value trying to redo our engineering efforts through your own means.. That being said, stands on soap box Maybe I missed the point of the CC. I also have a chapter in that class I built on the model dev lifecycle. I officially own the content and we can use it how we see fit to create some MLOPs like documents. |
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@StefanoFioravanzo @franciscojavierarceo I've made a few updates to the lifecycle diagram based on the feedback. |
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Looks great! |
Based on our recent discussion with @franciscojavierarceo I updated the ML lifecycle diagram in the architecture guides: #3719 (comment)
We can re-use this ML lifecycle diagram in each Kubeflow Component and explain the user value of that component.
I like the existing diagrams, but they little bit out of date.
I am happy to improve my diagrams based on your feedback.
Also, I removed unused images.
/assign @franciscojavierarceo @kubeflow/kubeflow-steering-committee @thesuperzapper @StefanoFioravanzo @hbelmiro
/hold for review