
Zeppelin for Spark + Hadoop (optional Hive)

This Dockerfile only builds Zeppelin, which depends on the other containers in "docker-spark-bde2020", including Hadoop, Spark, Hive, etc.

Pull Image

You can pull a pre-built image from https://hub.docker.com/r/openkbs/docker-spark-bde2020-zeppelin/

docker pull openkbs/docker-spark-bde2020-zeppelin
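
Once pulled, you can confirm the image is available locally and, if you only want to peek at the Zeppelin UI, start it on its own. This is just a sketch: the 8080:8080 mapping assumes Zeppelin's default web port, and a standalone run will not have the Hadoop/Spark backends described below.

docker images openkbs/docker-spark-bde2020-zeppelin
docker run -d --name zeppelin-standalone -p 8080:8080 openkbs/docker-spark-bde2020-zeppelin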

Build (if you want to build your own)

To build, run:

./build.sh
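
If you prefer to build and tag the image by hand instead of using the script, a roughly equivalent command is sketched below; the tag mirrors the Docker Hub repository above, and the exact name used by build.sh is an assumption.

docker build -t openkbs/docker-spark-bde2020-zeppelin .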

Run - "Zeppelin" Only

docker-compose -f docker-compose-hive.yml up -d zeppelin
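
After the service starts, you can confirm the container is running and follow its logs; the "zeppelin" service name matches the docker-compose-hive.yml entry used above.

docker-compose -f docker-compose-hive.yml ps zeppelin
docker-compose -f docker-compose-hive.yml logs -f zeppelin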

Run - The entire suite - Hadoop + Spark + (Hive) + Zeppelin + SparkNotebook + Hue

There are two options for running the entire suite of "docker-spark-bde2020":

  • start-hadoop-spark-workbench.sh (no Hive support)
  • start-hadoop-spark-workbench-with-hive.sh (with Hive support)

For example, to start the entire "docker-spark-bde2020" suite and Zeppelin with Hive support:

./start-hadoop-spark-workbench-with-hive.sh

Or, to start the entire "docker-spark-bde2020" suite and Zeppelin without Hive support:

./start-hadoop-spark-workbench.sh
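
After either script finishes, you can check that the containers are up and tear the suite down when done. The sketch below assumes the Hive variant and its docker-compose-hive.yml; substitute docker-compose.yml for the non-Hive case, and note the start scripts may involve additional compose files.

docker-compose -f docker-compose-hive.yml ps
docker-compose -f docker-compose-hive.yml down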

Reference to BDE2020 projects

To see how this container works with the entire big-data-europe/docker-hadoop-spark-workbench, explore the "./example-docker-spark-bde2020" directory, which contains the full suite build.

Docs

For example usage, see docker-compose.yml and the SANSA-Notebooks repository.

See Also

See the big-data-europe/docker-spark README.