PierreKieffer/docker-spark-yarn-cluster

Docker Hadoop YARN cluster for Spark 2.4.1

Provides a multi-node Hadoop cluster in Docker, with Spark 2.4.1 running on YARN.

Usage

Build

make build

Run

make start

Stop

make stop

Connect to Master Node

make connect
 ---- MASTER NODE ---- 
root@cluster-master:/#

Run Spark applications on the cluster

Once connected to the master node:

spark-shell

spark-shell --master yarn --deploy-mode client

spark-submit

spark-submit --master yarn --deploy-mode [client or cluster] --num-executors 2 --executor-memory 4G --executor-cores 4 --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.1.jar
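
The `[client or cluster]` placeholder above has to be replaced with one of the two deploy modes before the command will run. As a minimal sketch, the helper below fills in the mode, rejects anything else, and prints the resulting command. `build_submit_cmd` is a hypothetical name, not something this repository provides, and the sketch assumes `SPARK_HOME` is set, as it is expected to be on the master node.

```shell
# Hypothetical helper (not part of the repository): prints the SparkPi
# spark-submit command from this README for a given deploy mode.
# Assumes SPARK_HOME is set in the environment.
build_submit_cmd() {
  mode="${1:-}"
  case "$mode" in
    client|cluster) ;;  # the only two values --deploy-mode accepts
    *)
      echo "deploy mode must be 'client' or 'cluster'" >&2
      return 1
      ;;
  esac
  # echo joins its arguments with single spaces, giving one command line.
  echo "spark-submit --master yarn --deploy-mode $mode" \
       "--num-executors 2 --executor-memory 4G --executor-cores 4" \
       "--class org.apache.spark.examples.SparkPi" \
       "${SPARK_HOME}/examples/jars/spark-examples_2.11-2.4.1.jar"
}
```

On the master node, `build_submit_cmd cluster` prints the command so it can be checked before it is run. The executor settings are copied unchanged from the example above and may need lowering if the containers have less memory or fewer cores.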

Web UI

  • Get the master node IP:
make master-ip
 ---- MASTER NODE IP ---- 
Master node ip : 172.20.0.4
  • Hadoop cluster web UI: master-node-ip:8088
  • Spark web UI: master-node-ip:8080
  • HDFS web UI: master-node-ip:50070
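
The three addresses above can be built from the IP that `make master-ip` reports. The helper below is a small sketch of that step: `print_ui_urls` is a hypothetical name, not part of this repository, and the `http://` scheme is an assumption, since the README lists only host and port.

```shell
# Hypothetical helper (not part of the repository): prints the web UI
# addresses listed in this README for a given master node IP.
# The http:// scheme is an assumption.
print_ui_urls() {
  ip="${1:-}"
  if [ -z "$ip" ]; then
    echo "usage: print_ui_urls <master-node-ip>" >&2
    return 1
  fi
  echo "Hadoop cluster: http://$ip:8088"
  echo "Spark: http://$ip:8080"
  echo "HDFS: http://$ip:50070"
}
```

With the sample address shown above, `print_ui_urls 172.20.0.4` prints one URL per UI, ready to open in a browser on the Docker host.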
