open-datastudio/zeppelin

Apache Zeppelin on Staroid ⭐

Apache Zeppelin on Staroid.

Key features 🚀

  • Latest version

    Access the latest Apache Zeppelin features and improvements in a single click. No need to install or maintain Zeppelin to keep it up to date.

    The Markdown, Shell, Spark, Python, and JDBC interpreters are included.

  • Kubernetes mode (scalable)

    Zeppelin on Kubernetes is enabled by default. Each interpreter runs in its own container.

  • Spark 3.0

    Comes with Spark 3.0.

  • Zero maintenance Spark Cluster

    Spark on Kubernetes is configured out of the box, so you don't need to configure or manage a Spark cluster. Just set the number of executors you want in the notebook and enjoy distributed computing without the headache.

    %spark.conf
    spark.executor.instances 3
    
    %spark
    // Run Spark code here. 3 executor instances are created automatically.
    
  • Spark UI access

    Open the Spark UI from the notebook to get more insight.

  • File manager

    You can upload and download files and access them in the /data directory from the interpreter.

  • Customize it yourself

    Fork this repository and launch your customized version on Staroid. No complex setup required. Just connect your forked GitHub repository and push commits.
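
The file manager feature above exposes uploaded files in the /data directory. As a minimal sketch, assuming the paragraph runs in a Python interpreter (for example `%python`) and that uploaded files appear as regular files under /data, a helper like the following lists what is available. The name `list_data_files` is made up for illustration and is not part of Zeppelin or Staroid.

```python
from pathlib import Path


def list_data_files(data_dir="/data"):
    """Return the sorted relative paths of all regular files under data_dir.

    Returns an empty list when the directory does not exist, so the
    paragraph does not fail before anything has been uploaded.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    # Print one file per line, e.g. to check that an upload arrived.
    for name in list_data_files():
        print(name)
```

The directory is a parameter so the same helper can be pointed at a local folder when testing outside Staroid.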

Spark configuration

The driver and executors can be configured using the %spark.conf interpreter.

%spark.conf
spark.driver.cores                                            2
spark.driver.memory                                           8g
spark.executor.cores                                          4
spark.executor.memory                                         16g
spark.executor.instances                                      3
spark.kubernetes.executor.label.pod.staroid.com/isolation     dedicated
spark.kubernetes.executor.label.pod.staroid.com/instance-type standard-4
spark.kubernetes.executor.label.pod.staroid.com/spot          true
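
To get a feel for how much a configuration like the one above requests in total, the arithmetic can be sketched in Python. This is only an estimate under stated assumptions: the helper names (`parse_spark_conf`, `memory_to_mib`, `total_request`) are hypothetical, only the `m` and `g` memory suffixes are handled, and the extra memory overhead that Spark adds to each pod on Kubernetes is ignored.

```python
import re

# The example configuration from this section, as paragraph text.
EXAMPLE_CONF = """\
%spark.conf
spark.driver.cores                                            2
spark.driver.memory                                           8g
spark.executor.cores                                          4
spark.executor.memory                                         16g
spark.executor.instances                                      3
spark.kubernetes.executor.label.pod.staroid.com/isolation     dedicated
spark.kubernetes.executor.label.pod.staroid.com/instance-type standard-4
spark.kubernetes.executor.label.pod.staroid.com/spot          true
"""

_UNIT_TO_MIB = {"m": 1, "g": 1024}


def parse_spark_conf(text):
    """Parse paragraph text into a dict of property name -> value string."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '%spark.conf' interpreter directive.
        if not line or line.startswith("%"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def memory_to_mib(value):
    """Convert a value such as '8g' or '512m' to MiB (simplified)."""
    match = re.fullmatch(r"(\d+)([mg])", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported memory value: {value!r}")
    return int(match.group(1)) * _UNIT_TO_MIB[match.group(2)]


def total_request(conf):
    """Return (total cores, total memory in MiB) for driver plus executors."""
    instances = int(conf["spark.executor.instances"])
    cores = int(conf["spark.driver.cores"]) + instances * int(conf["spark.executor.cores"])
    memory = memory_to_mib(conf["spark.driver.memory"]) + instances * memory_to_mib(
        conf["spark.executor.memory"]
    )
    return cores, memory


if __name__ == "__main__":
    cores, memory_mib = total_request(parse_spark_conf(EXAMPLE_CONF))
    print(f"{cores} cores, {memory_mib // 1024} GiB")
```

For the example above this works out to 2 + 3 × 4 = 14 cores and 8 + 3 × 16 = 56 GiB of memory, before any per-pod overhead.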

TestDrive

The TestDrive namespace has a smaller quota, so use fewer cores and less memory.

%spark.conf
spark.driver.cores                                            1
spark.driver.memory                                           1g
spark.executor.cores                                          1
spark.executor.memory                                         1g
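
If you switch between a regular namespace and TestDrive often, one option is to keep the settings as data and render the paragraph text from them. The sketch below is illustrative only: `render_spark_conf` and the `TESTDRIVE` dictionary are hypothetical names, and the values are simply the ones shown above.

```python
# TestDrive settings copied from the example above.
TESTDRIVE = {
    "spark.driver.cores": "1",
    "spark.driver.memory": "1g",
    "spark.executor.cores": "1",
    "spark.executor.memory": "1g",
}


def render_spark_conf(settings):
    """Render a dict of Spark properties as '%spark.conf' paragraph text.

    Property names are padded so that the values line up in one column.
    """
    width = max((len(key) for key in settings), default=0)
    lines = ["%spark.conf"]
    lines += [f"{key.ljust(width)} {value}" for key, value in settings.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste the printed text into a notebook paragraph.
    print(render_spark_conf(TESTDRIVE))
```

The rendered text still has to be pasted into a notebook paragraph and run before the Spark interpreter starts.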

Branch

| Branch | Zeppelin version |
|---|---|
| master-snapshot | latest master |

Development

Feedback and pull requests are welcome. This repository only edits the .staroid directory. For changes to any other files, please make a pull request to the upstream Apache Zeppelin repository directly.

Check out the .staroid directory of the master-snapshot branch to learn how Zeppelin on Staroid is built.

| Contents | Description |
|---|---|
| staroid.yaml | staroid config file |
| skaffold.yaml | skaffold config file |
| conf | Zeppelin configuration files to override |
| docker | Dockerfile to build images |
| k8s | Kubernetes resource manifests |

Run locally with prebuilt image

Building all images takes 30 to 40 minutes because Zeppelin itself takes a long time to build. A prebuilt profile is included in skaffold.yaml so that changes which do not require rebuilding Zeppelin can be tested quickly in a local minikube environment.

skaffold dev -f .staroid/skaffold.yaml -p prebuilt,minikube --port-forward
