
Kubeflow Spark

Orchestrate Spark Jobs using Kubeflow Pipelines, a modern Machine Learning orchestration framework, and poll for their status. Read the related blog post.

Requirements

  1. Kubernetes cluster (1.17+)
  2. Kubeflow pipelines (1.7.0+)
  3. Spark Operator (1.1.0+)
  4. Python (3.6+)
  5. kubectl
  6. helm3

Getting started

Run make all to set everything up and skip to step 6, or follow the steps individually:

  1. Start your local cluster:
./scripts/start-minikube.sh
  2. Install Kubeflow Pipelines:
./scripts/install-kubeflow.sh
  3. Install the Spark Operator:
./scripts/install-spark-operator.sh
  4. Create the Spark Service Account and add permissions:
./scripts/add-spark-rbac.sh
  5. Make the Kubeflow UI reachable:
  • a. (Optional) Add a Kubeflow UI Ingress:
./scripts/add-kubeflow-ui-ingress.sh
  • b. (Optional) Forward the service port, e.g.:
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8005:80
  6. Create the Kubeflow pipeline definition file (a sketch of what this script might contain follows this list):
python kubeflow_pipeline.py
  7. Navigate to the Pipelines UI and upload the newly created pipeline from the file spark_job_pipeline.yaml.

  8. Trigger a pipeline run. Make sure to set spark-sa as the Service Account for the execution.

  9. Enjoy your orchestrated Spark job execution!
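
The kubeflow_pipeline.py script is what ties the steps together: the pipeline submits a SparkApplication custom resource and then polls its status until the job finishes. A minimal sketch using the kfp v1 SDK is shown below; the SparkApplication manifest (image, main class, jar path, Spark version, namespace) is illustrative, loosely modeled on the Spark Operator's spark-pi example, and only spark-sa corresponds to the Service Account created in step 4. The actual script in this repository may differ.

import kfp
import kfp.dsl as dsl

# Illustrative SparkApplication manifest. The image, jar path, and versions
# are assumptions modeled on the Spark Operator's spark-pi example, not
# taken from this repository; spark-sa is the Service Account from step 4.
SPARK_APP = {
    "apiVersion": "sparkoperator.k8s.io/v1beta2",
    "kind": "SparkApplication",
    "metadata": {"name": "spark-pi", "namespace": "kubeflow"},
    "spec": {
        "type": "Scala",
        "mode": "cluster",
        "image": "gcr.io/spark-operator/spark:v3.1.1",
        "mainClass": "org.apache.spark.examples.SparkPi",
        "mainApplicationFile": "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar",
        "sparkVersion": "3.1.1",
        "restartPolicy": {"type": "Never"},
        "driver": {"cores": 1, "memory": "512m", "serviceAccount": "spark-sa"},
        "executor": {"cores": 1, "instances": 1, "memory": "512m"},
    },
}

@dsl.pipeline(
    name="spark-job-pipeline",
    description="Submit a Spark job and wait for it to finish",
)
def spark_job_pipeline():
    # ResourceOp creates the custom resource and keeps the step pending
    # until the resource's status matches the success or failure condition.
    dsl.ResourceOp(
        name="submit-spark-job",
        k8s_resource=SPARK_APP,
        action="create",
        success_condition="status.applicationState.state == COMPLETED",
        failure_condition="status.applicationState.state == FAILED",
    )

if __name__ == "__main__":
    # Compile the pipeline into the YAML file uploaded in step 7.
    kfp.compiler.Compiler().compile(spark_job_pipeline, "spark_job_pipeline.yaml")

The success_condition and failure_condition expressions are how the pipeline polls the Spark job: the step remains pending until the SparkApplication's status.applicationState.state reaches COMPLETED or FAILED. Compiling the pipeline produces spark_job_pipeline.yaml, the file uploaded in step 7.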
