Singularity image for Apache Spark with the sparklyr package installed. It is built on top of the base Singularity image nickjer/singularity-rstudio so that you can launch an RStudio Server and connect more easily to an Apache Spark cluster running in Standalone Mode.
This is still a work in progress.
You can build a local Singularity image named singularity-rstudio-spark.simg with:
sudo singularity build singularity-rstudio-spark.simg Singularity
Instead of building it yourself, you can download the pre-built image from Singularity Hub with:
singularity pull --name singularity-rstudio-spark.simg shub://nickjer/singularity-rstudio-spark
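If you script your setup, you may want to skip the download when the image is already on disk. The following is a minimal sketch, not part of this repository: the function name ensure_image and the IMAGE variable are hypothetical, and it assumes the image lives in the current working directory.

```shell
# Hypothetical helper (not shipped with this image): pull the pre-built
# image from Singularity Hub only if it is not already in the current directory.
IMAGE="singularity-rstudio-spark.simg"

ensure_image() {
  if [ -f "$IMAGE" ]; then
    # Image already present, nothing to download.
    echo "using existing $IMAGE"
  else
    # Same pull command as shown above.
    singularity pull --name "$IMAGE" shub://nickjer/singularity-rstudio-spark
  fi
}
```

Call ensure_image before any singularity run command that expects the image to exist.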
You can launch Spark in Standalone Mode by first launching a "master" process, which prints its own spark://HOST:PORT URL. You can then use that URL to connect "workers" to it.
You can launch a "master" process as a Singularity app with:
singularity run --app spark-master singularity-rstudio-spark.simg
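If you capture the master's output to a file, the spark://HOST:PORT URL can be picked out with a small filter. This is only a sketch under assumptions: the function name extract_master_url and the file name master.log are hypothetical, and it assumes the URL appears in the captured output as a whitespace-delimited token.

```shell
# Hypothetical helper: print the first spark://HOST:PORT URL found on
# standard input, or nothing if no such URL is present.
extract_master_url() {
  grep -o 'spark://[^[:space:]]*' | head -n 1
}

# Example use, assuming the master's output was saved to master.log:
#   extract_master_url < master.log
```

The helper only reads text, so it can be tried on any saved log without a running cluster.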
You can launch a "worker" process as a Singularity app with:
singularity run --app spark-worker singularity-rstudio-spark.simg
See nickjer/singularity-rstudio for more information on how to run rserver from within this Singularity image.
See nickjer/singularity-r for more information on how to run R and Rscript from within this Singularity image.
Bug reports and pull requests are welcome on GitHub at https://github.com/nickjer/singularity-rstudio-spark.
The code is available as open source under the terms of the MIT License.