nickjer/singularity-rstudio-spark

Singularity Apache Spark w/ RStudio Server

A Singularity image for Apache Spark with the sparklyr package installed. It is built on top of the base Singularity image nickjer/singularity-rstudio so that you can launch an RStudio Server and connect more easily to an Apache Spark cluster running in Standalone Mode.

This is still a work in progress.

Build

You can build a local Singularity image named singularity-rstudio-spark.simg with:

sudo singularity build singularity-rstudio-spark.simg Singularity

Deploy

Instead of building the image yourself, you can download the pre-built image from Singularity Hub with:

singularity pull --name singularity-rstudio-spark.simg shub://nickjer/singularity-rstudio-spark

Run

To run Spark in Standalone Mode, first launch a "master" process. The master prints its own spark://HOST:PORT URL, which you then use to connect "workers" to it.

Spark Master

You can launch a "master" process as a Singularity app with:

singularity run --app spark-master singularity-rstudio-spark.simg
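If you redirect the master's output to a log file, the spark://HOST:PORT URL can be picked out of it with a small helper. The following is only a sketch: the function name extract_master_url, the host name node001, and the sample log line are made up for illustration, and the exact wording of the master's output may differ on your system.

```shell
# Hypothetical helper (not part of this image): read log text on stdin
# and print the first spark://HOST:PORT URL found in it.
extract_master_url() {
  grep -oE 'spark://[A-Za-z0-9._-]+:[0-9]+' | head -n 1
}

# Illustrative log line only; in practice, feed in the saved output of
# the spark-master app instead of this made-up sample.
sample_log='INFO Master: Starting Spark master at spark://node001:7077'

MASTER_URL=$(printf '%s\n' "$sample_log" | extract_master_url)
echo "$MASTER_URL"
```

In real use you would replace the sample line with the file the master's output was written to, for example by running the helper with that file on stdin.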

Spark Worker

You can launch a "worker" process as a Singularity app with:

singularity run --app spark-worker singularity-rstudio-spark.simg
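The command above does not show how the worker learns the master's address. One possibility, which is an assumption and is not confirmed by this README, is that the spark-worker app forwards any arguments placed after the image name to Spark's worker process, which accepts the master URL as an argument. Under that assumption, the command line could be assembled as in the sketch below; it only prints the command so you can inspect it before running it.

```shell
# Placeholder URL; substitute the spark://HOST:PORT printed by your master.
MASTER_URL="${MASTER_URL:-spark://HOST:PORT}"

# ASSUMPTION: arguments after the image name are passed through to the
# Spark worker. Check the image's Singularity recipe to confirm this.
WORKER_CMD="singularity run --app spark-worker singularity-rstudio-spark.simg $MASTER_URL"

# Print the command instead of running it, so it can be reviewed first.
echo "$WORKER_CMD"
```

Once the printed command looks right for your setup, it can be run directly on each node that should host a worker.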

RStudio Server

See nickjer/singularity-rstudio for more information on how to run rserver from within this Singularity image.

R and Rscript

See nickjer/singularity-r for more information on how to run R and Rscript from within this Singularity image.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/nickjer/singularity-rstudio-spark.

License

The code is available as open source under the terms of the MIT License.
