KFP components

All data processing pipelines have the same shape: they compute execution parameters, create a Ray cluster, execute a Ray job, and then delete the cluster. With the exception of computing execution parameters, all of these steps are identical across pipelines, although they receive different parameters.
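
The shared shape can be sketched in plain Python as follows. This is a minimal illustration rather than the actual component code: the function and parameter names (`run_pipeline`, `compute_exec_params`, `create_ray_cluster`, `execute_ray_job`, `delete_ray_cluster`, `cluster_name`) are hypothetical stand-ins, and the `try`/`finally` cleanup is an assumption about one way to make sure the cluster is deleted even when the job fails.

```python
from typing import Any, Callable, Dict, List


def run_pipeline(
    compute_exec_params: Callable[[], Dict[str, Any]],
    create_ray_cluster: Callable[[str], None],
    execute_ray_job: Callable[[str, Dict[str, Any]], None],
    delete_ray_cluster: Callable[[str], None],
    cluster_name: str,
) -> None:
    """Run the four steps shared by every pipeline (hypothetical sketch)."""
    # Step 1: the only step whose logic differs between pipelines.
    params = compute_exec_params()
    # Step 2: create the Ray cluster.
    create_ray_cluster(cluster_name)
    try:
        # Step 3: execute the Ray job on that cluster.
        execute_ray_job(cluster_name, params)
    finally:
        # Step 4: delete the cluster, whether or not the job succeeded.
        delete_ray_cluster(cluster_name)


if __name__ == "__main__":
    steps: List[str] = []
    run_pipeline(
        compute_exec_params=lambda: {"num_workers": 2},  # hypothetical parameter
        create_ray_cluster=lambda name: steps.append("create " + name),
        execute_ray_job=lambda name, params: steps.append("execute on " + name),
        delete_ray_cluster=lambda name: steps.append("delete " + name),
        cluster_name="demo-cluster",
    )
    print(steps)
```

Passing the steps in as callables mirrors the point made above: only the parameter computation changes from one pipeline to the next, while the other three steps can be reused unchanged.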

To simplify implementation of the data processing KFP pipelines, this directory provides several components.

As defined by the KFP documentation:

"A pipeline component is a self-contained set of code that performs one step in a workflow."

The first step in creating components is their implementation. The framework automation includes the following 3 components:

Once the components are implemented, we also define their interfaces in a component specification, which describes:

  • The component’s inputs and outputs.
  • The container image that your component’s code runs in, the command to use to run your component’s code, and the command-line arguments to pass to your component’s code.
  • The component’s metadata, such as the name and description.
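
To make these three parts concrete, here is a minimal sketch of a specification built as a Python dictionary. Component specifications are normally written as YAML files; a dictionary is used here only for illustration. The field layout assumes the KFP v1 component specification format, and the helper name, component name, input name, image, and script name are all hypothetical, not the ones used in this directory.

```python
import json
from typing import Any, Dict


def make_component_spec(name: str, description: str, image: str) -> Dict[str, Any]:
    """Build an illustrative component specification (hypothetical values)."""
    return {
        # Metadata: the component's name and description.
        "name": name,
        "description": description,
        # Interface: the component's inputs and outputs.
        "inputs": [{"name": "ray_name", "type": "String"}],
        "outputs": [],
        # Implementation: container image, command, and command-line arguments.
        "implementation": {
            "container": {
                "image": image,
                "command": ["python", "create_cluster.py"],  # hypothetical script
                "args": ["--ray_name", {"inputValue": "ray_name"}],
            }
        },
    }


if __name__ == "__main__":
    spec = make_component_spec(
        name="create-ray-cluster",
        description="Creates a Ray cluster",
        image="example.registry/kfp-components:latest",  # hypothetical image
    )
    # JSON is also valid YAML, so this output could be saved as a spec file.
    print(json.dumps(spec, indent=2))
```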

Component specifications are provided here:

Building the docker image

To build the component Docker image, first set the details of the Docker registry as environment variables:

export DOCKER_SERVER=<> # for example us.icr.io 
export DOCKER_USERNAME=iamapikey
export DOCKER_EMAIL=iamapikey
export DOCKER_PASSWORD=<PASSWORD>

Because the Docker image uses libraries from the Python Artifactory, also set the Artifactory credentials as environment variables:

export ARTIFACTORY_USER=<artifactory-user>
export ARTIFACTORY_API_KEY=<artifactory-key>
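
Before building, it can help to confirm that all of the variables above are set. The following is a small optional sketch, not part of the build itself; the helper name `missing_vars` is hypothetical, and the list of required variables is taken from the export commands above.

```python
import os
from typing import List, Mapping

# Variables named in the export commands above.
REQUIRED_VARS = (
    "DOCKER_SERVER",
    "DOCKER_USERNAME",
    "DOCKER_EMAIL",
    "DOCKER_PASSWORD",
    "ARTIFACTORY_USER",
    "ARTIFACTORY_API_KEY",
)


def missing_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All registry and Artifactory variables are set.")
```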

Then build the image:

make build
make publish