mrsrinivas/spark-bench

spark-bench

Build

    $ git clone https://github.com/mrsrinivas/spark-bench.git
    $ cd spark-bench
    $ mvn install
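
Before moving on, it can help to confirm that the build produced the jar that the Run step expects. The following is a minimal sketch, not part of the project: `check_jar` is a hypothetical helper, and the default path is assumed from the `spark2-submit` command in the Run section.

```shell
# Assumed artifact path, taken from the spark2-submit command in the Run section.
JAR="${JAR:-./target/spark-bench-1.0-fat.jar}"

# Hypothetical helper: succeed only if the given file exists.
check_jar() {
    if [ -f "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1 (re-run mvn install)" >&2
        return 1
    fi
}

# Report the result without aborting the shell when the jar is absent.
check_jar "$JAR" || echo "build the project before submitting"
```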

Run

Run the DataGen Spark application on a YARN cluster:

    $ nohup spark2-submit \
        --master yarn \
        --executor-cores 2 \
        --num-executors 30 \
        --driver-memory 2g \
        --executor-memory 4g \
        --class com.mrsrinivas.app.DataGen \
        ./target/spark-bench-1.0-fat.jar \
        100G \
        30 \
        file:///scratch/username/datagen_in > spark-submit.log &
    
    [1] 11069
    nohup: ignoring input and redirecting stderr to stdout

    $ tail -f spark-submit.log
        
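
Because the job runs in the background, its exit status is not shown when it finishes. One way to capture it is sketched below, assuming that `spark2-submit`'s exit status reflects the outcome of the application and that you are still in the shell session that started the job. `report_job` is a hypothetical helper, not something the project provides.

```shell
# Hypothetical helper: wait for a background job started from this shell
# and report whether it exited with status 0.
report_job() {
    job_pid="$1"
    job_status=0
    # wait returns the exit status of the job; keep it without aborting on failure.
    wait "$job_pid" || job_status=$?
    if [ "$job_status" -eq 0 ]; then
        echo "job $job_pid succeeded"
    else
        echo "job $job_pid failed with exit status $job_status"
    fi
    return "$job_status"
}

# Usage (run right after the nohup ... & command, where $! is its PID):
#   report_job "$!"
```

Call the function directly rather than inside `$(...)`, since `wait` only works on children of the current shell.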

Once the job succeeds, the output directory should contain the following subdirectories:

    $ cd /scratch/username/datagen_in
    $ ls
    employees	stage-metrics
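
The same check can be scripted. This is a minimal sketch under the assumption that the two subdirectories listed above are the only ones that matter; `check_datagen_output` is a hypothetical name, not part of the project.

```shell
# Hypothetical helper: verify that a DataGen output directory contains
# the employees and stage-metrics subdirectories.
check_datagen_output() {
    out_dir="$1"
    result=0
    for sub in employees stage-metrics; do
        if [ -d "$out_dir/$sub" ]; then
            echo "ok: $sub"
        else
            echo "missing: $sub"
            result=1
        fi
    done
    return "$result"
}

# Usage, with the output path from the Run step:
#   check_datagen_output /scratch/username/datagen_in
```

The function returns a non-zero status if either subdirectory is absent, so it can be chained with `&&` in a larger script.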