Adding v2 draft for kubeflow release 1.8 #138

Merged
merged 6 commits into kubeflow:master on Nov 1, 2023

Conversation

Davidnet
Member

[WIP] Draft of the kubeflow blog release

@thesuperzapper
Member

thesuperzapper commented Oct 24, 2023

I think we need to re-add the section about the Kubeflow Notebook images updates from kubeflow/kubeflow#7357, with the key points being:

  1. Support for ARM64 in addition to AMD64:
    • NOTE: the CUDA images are not currently built for ARM64, as I have no way to test them
      • PyTorch: I don't think that pre-compiled versions of PyTorch with CUDA on ARM are available.
      • TensorFlow: The official NVIDIA Ubuntu repos for CUDA are a bit sparse on ARM for CUDA 11.8 (which is the latest that TF supports)
  2. Much cleaner build system and Makefiles:
    • Build any image locally by going to its folder and running make docker-build-dep to build with all base images that that image depends on.
  3. TensorFlow 2.0:
    • We have updated to TensorFlow 2.13.0 by default.
    • We have updated to CUDA 11.8 in the TensorFlow CUDA images.
  4. PyTorch 2.0:
    • We have updated to PyTorch 2.1.0 by default.
    • We have updated to CUDA 12.1 in the PyTorch CUDA images.
  5. JupyterLab 4.0:
  6. Python 3.11:
    • We have updated to Python 3.11.6 by default.
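
A quick way to confirm the versions listed above from inside a running notebook image is sketched below. It is only an illustration: the distribution names `tensorflow` and `torch` are assumptions about how the packages are installed in the images, and the expected version prefixes are copied from the list above rather than from any image manifest.

```python
import platform
from importlib import metadata

# Version prefixes taken from the comment above; treat them as expectations
# to verify inside an image, not as guarantees about any particular tag.
EXPECTED = {"python": "3.11", "tensorflow": "2.13", "torch": "2.1"}


def installed_versions(distributions):
    """Look up installed versions by distribution name without importing them."""
    found = {}
    for name in distributions:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # not installed in this image
    return found


def matches(prefix, actual):
    """True if `actual` equals `prefix` or extends it at a dot boundary."""
    return actual is not None and (actual == prefix or actual.startswith(prefix + "."))


def report(expected, actual):
    """Map each expected name to whether the installed version matches."""
    return {name: matches(prefix, actual.get(name)) for name, prefix in expected.items()}


if __name__ == "__main__":
    actual = installed_versions(["tensorflow", "torch"])
    actual["python"] = platform.python_version()
    print("architecture:", platform.machine())  # e.g. x86_64 or aarch64
    for name, ok in report(EXPECTED, actual).items():
        print(f"{name}: found {actual.get(name)}, expected {EXPECTED[name]}.x, match={ok}")
```

Matching on a dot boundary keeps `2.1` from accidentally accepting `2.13.0`, while still accepting local build suffixes such as `2.1.0+cu121`.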

@chensun chensun self-assigned this Oct 26, 2023
Member

@chensun chensun left a comment

/lgtm

## **Selected and Highlighted deliveries**


## Kubeflow Pipelines
Member

LGTM on the Kubeflow Pipelines section.

@google-oss-prow google-oss-prow bot added the lgtm label Oct 26, 2023
@google-oss-prow google-oss-prow bot removed the lgtm label Oct 26, 2023
@Davidnet
Member Author

@DnPlas @kubeflow/release-team

Please take a look

@Davidnet Davidnet marked this pull request as ready for review October 30, 2023 12:45

@DnPlas DnPlas left a comment

Thanks @Davidnet ! This is a great piece of documentation. I left some minor notes, but other than that, LGTM!

_posts/2023-10-23-kubeflow-1.8-release.md Outdated Show resolved Hide resolved
Member

@annajung annajung left a comment

Thanks, @Davidnet, for the great work on the blog post. I added a few nits and comments to address.

_posts/2023-10-23-kubeflow-1.8-release.md Outdated Show resolved Hide resolved

## Kubeflow v1.8’s powerful Python SDKs simplify Kubernetes-native MLOps, reducing manual yaml configuration

Kubeflow v1.8 uniquely delivers Kubernetes-native MLOps via simplified pythonic based workflows, which means less manual yaml, docker and Kubernetes CLI operations. The new workflows simplify ML pipelines building, hyper parameter tuning, distributed model training with GPUs as well as model serving. In addition, many underlying dependencies e.g. Kubernetes, Tensorflow, PyTorch, have been updated, which has improved Kubeflow’s security profile and reduces our enterprise users’ integration work. Kubeflow 1.8 has also added ARM processor support, which addresses a significant portion of the Chinese server market, and also helps adoption by Apple MacBook ARM users.
Member

nits

Suggested change
Kubeflow v1.8 uniquely delivers Kubernetes-native MLOps via simplified pythonic based workflows, which means less manual yaml, docker and Kubernetes CLI operations. The new workflows simplify ML pipelines building, hyper parameter tuning, distributed model training with GPUs as well as model serving. In addition, many underlying dependencies e.g. Kubernetes, Tensorflow, PyTorch, have been updated, which has improved Kubeflow’s security profile and reduces our enterprise users’ integration work. Kubeflow 1.8 has also added ARM processor support, which addresses a significant portion of the Chinese server market, and also helps adoption by Apple MacBook ARM users.
Kubeflow v1.8 uniquely delivers Kubernetes-native MLOps via simplified pythonic based workflows, which means less manual yaml, docker and Kubernetes CLI operations. The new workflows simplify ML pipelines building, hyperparameter tuning, distributed model training with GPUs as well as model serving. In addition, many underlying dependencies e.g. Kubernetes, Tensorflow, and PyTorch, have been updated, which has improved Kubeflow’s security profile and reduced our enterprise users’ integration work. Kubeflow 1.8 has also added ARM processor support, which addresses a significant portion of the Chinese server market, and also helps adoption by Apple MacBook ARM users.

_posts/2023-10-23-kubeflow-1.8-release.md Outdated Show resolved Hide resolved
</td>
</tr>
<tr>
<td><a href="https://knative.dev/docs/reference/relnotes/">Knative</a>
Member

nit: instead of linking to the dependency, would it be better to link the versions, as is done for the Kubeflow components?

_posts/2023-10-23-kubeflow-1.8-release.md Outdated Show resolved Hide resolved
</td>
<td>Tekton
</td>
<td>Oidc
Member

author: "Kubeflow 1.7 Release Team, David Cardozo and Josh Bottum"
---

# Kubeflow v1.8 Debuts: Official Support for Pipelines v2, Advanced Security, expanded support in architectures and Storage Exploration
Member

Yep, this should be removed:

Suggested change
# Kubeflow v1.8 Debuts: Official Support for Pipelines v2, Advanced Security, expanded support in architectures and Storage Exploration


## Kubeflow v1.8’s powerful Python SDKs simplify Kubernetes-native MLOps, reducing manual yaml configuration

Kubeflow v1.8 uniquely delivers Kubernetes-native MLOps via simplified pythonic based workflows, which means less manual yaml, docker and Kubernetes CLI operations. The new workflows simplify ML pipelines building, hyper parameter tuning, distributed model training with GPUs as well as model serving. In addition, many underlying dependencies e.g. Kubernetes, Tensorflow, PyTorch, have been updated, which has improved Kubeflow’s security profile and reduces our enterprise users’ integration work. Kubeflow 1.8 has also added ARM processor support, which addresses a significant portion of the Chinese server market, and also helps adoption by Apple MacBook ARM users.
Member

I have rewritten this to be more punchy and descriptive:

Suggested change
Kubeflow v1.8 uniquely delivers Kubernetes-native MLOps via simplified pythonic based workflows, which means less manual yaml, docker and Kubernetes CLI operations. The new workflows simplify ML pipelines building, hyper parameter tuning, distributed model training with GPUs as well as model serving. In addition, many underlying dependencies e.g. Kubernetes, Tensorflow, PyTorch, have been updated, which has improved Kubeflow’s security profile and reduces our enterprise users’ integration work. Kubeflow 1.8 has also added ARM processor support, which addresses a significant portion of the Chinese server market, and also helps adoption by Apple MacBook ARM users.
Kubeflow 1.8 delivers leading Kubernetes-native MLOps capabilities.
Kubeflow Pipelines v2 brings simplified and pythonic workflow definitions, meaning less YAML, Docker, and Kubernetes CLI.
Kubeflow Notebooks v1.8 brings significantly updated base images and support for ARM64 processors.
Katib v0.16 brings improved hyperparameter tuning and distributed GPU training capabilities.
Kubeflow 1.8 brings first-class ARM64 support, driving adoption in this growing segment and making Kubeflow easier to use on Apple Silicon devices.
Also, initial support for PPC64 processors has been added.
Many underlying dependencies have been updated, improving Kubeflow’s security posture and making it easier to operate in enterprise environments.

## **Selected and Highlighted deliveries**


## Kubeflow Pipelines
Member

This should be H3, not H2:

Suggested change
## Kubeflow Pipelines
### Kubeflow Pipelines


Kubeflow v1.8 uniquely delivers Kubernetes-native MLOps via simplified pythonic based workflows, which means less manual yaml, docker and Kubernetes CLI operations. The new workflows simplify ML pipelines building, hyper parameter tuning, distributed model training with GPUs as well as model serving. In addition, many underlying dependencies e.g. Kubernetes, Tensorflow, PyTorch, have been updated, which has improved Kubeflow’s security profile and reduces our enterprise users’ integration work. Kubeflow 1.8 has also added ARM processor support, which addresses a significant portion of the Chinese server market, and also helps adoption by Apple MacBook ARM users.

In Kubeflow v1.8, ML pipelines are now constructed as modular components, enabling easily chainable and reusable ML workflows. v1.8 also delivers model (training) parallelism for large language models and introduces the PVC Viewer for simplified persistent storage management, eliminating the need for Kubernetes CLI storage commands. The new Katib SDK provides a powerful python based solution that reduces manual configuration and simplifies the delivery of your tuned model.
Member

I have improved this paragraph.

Also, as we are already talking about the Kubeflow Pipelines v2 upgrade in the above paragraph, I have removed it from here.

Suggested change
In Kubeflow v1.8, ML pipelines are now constructed as modular components, enabling easily chainable and reusable ML workflows. v1.8 also delivers model (training) parallelism for large language models and introduces the PVC Viewer for simplified persistent storage management, eliminating the need for Kubernetes CLI storage commands. The new Katib SDK provides a powerful python based solution that reduces manual configuration and simplifies the delivery of your tuned model.
Kubeflow 1.8 delivers new capabilities and components.
The new PVC Viewer simplifies interacting with Kubernetes Volumes, allowing you to manage the contents of PVCs without leaving the Kubeflow UI.
The upgraded Katib Python SDK is a powerful solution that reduces manual configuration and simplifies the delivery of your tuned model.

Additionally, the [kfp-tekton](https://github.com/kubeflow/kfp-tekton) project, which allows users to run pipelines with a Tekton backend, has also been updated to version 2.0.3. Because the SDK compiles to the same pipeline spec, SDK users can use the same pipeline definition to run on both Argo and Tekton backends.


## Katib:
Member

This should be H3, not H2:

Suggested change
## Katib:
### Katib

* Remove a katib-webhook-cert Secret from components ([#2214](https://github.com/kubeflow/katib/pull/2214))


## Training Operator:
Member

This should be H3, not H2:

Suggested change
## Training Operator:
### Training Operator

* Fully consolidate tfjob-operator to training-operator ([#1850](https://github.com/kubeflow/training-operator/pull/1850))


## Kserve
Member

This should be H3, not H2 (and capitalization is wrong):

Suggested change
## Kserve
### KServe

Comment on lines 163 to 167
## Updated Images, platform dependencies, breaking changes, and add-ons

Kubeflow 1.8 includes hundreds of commits. The Kubeflow release process includes several rounds of testing by the Kubeflow working groups and Kubeflow distributions. Kubeflow’s configuration options provide a high degree of flexibility. After considering all of the testing options, the 1.8 Release Team narrowed the critical dependencies for consistent testing and documentation to the following.

1.8 includes new notebook images with updates to support multi-architectures including ARM and Power processors, and updates to Tensorflow, PyTorch and other packages.
Member

This section can be removed, as it says nothing that the above sections don't already say.

Also, the first part is actually a direct copy from a few lines up.

Suggested change
## Updated Images, platform dependencies, breaking changes, and add-ons
Kubeflow 1.8 includes hundreds of commits. The Kubeflow release process includes several rounds of testing by the Kubeflow working groups and Kubeflow distributions. Kubeflow’s configuration options provide a high degree of flexibility. After considering all of the testing options, the 1.8 Release Team narrowed the critical dependencies for consistent testing and documentation to the following.
1.8 includes new notebook images with updates to support multi-architectures including ARM and Power processors, and updates to Tensorflow, PyTorch and other packages.

@DnPlas

DnPlas commented Oct 31, 2023

/lgtm

@jbottum
Contributor

jbottum commented Nov 1, 2023

@thesuperzapper Hey Mathew - Thanks for all the suggestions to the blog post. They are very helpful. That said, I must call a point of order. We are 24 hours from the release, and the suggestions above include many major changes to content, flow, and tone that require analysis, review, and approvals. My request is that, in the future, edits at the very end of the release cycle be limited to error corrections or minor incremental modifications. Thanks for understanding.

@google-oss-prow google-oss-prow bot removed the lgtm label Nov 1, 2023
@Davidnet Davidnet requested a review from DnPlas November 1, 2023 14:03
@jbottum
Contributor

jbottum commented Nov 1, 2023

thanks @Davidnet ! /lgtm

@DnPlas

DnPlas commented Nov 1, 2023

/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Nov 1, 2023
@DnPlas

DnPlas commented Nov 1, 2023

/assign @zijianjoy

@annajung
Member

annajung commented Nov 1, 2023

Thanks @Davidnet!
/lgtm

@chensun
Member

chensun commented Nov 1, 2023

/approve

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: chensun, Davidnet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit c0e351b into kubeflow:master Nov 1, 2023
7 checks passed

9 participants