# Installation

  1. Install Scala 2.11.2 (http://www.scala-lang.org/download/) and be sure to add the scala/bin folder to your PATH.
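
   For example, in a Bash shell (the path below is a placeholder; point it at wherever you unpacked Scala):

    # /path/to/scala is a placeholder for your Scala installation folder
    export PATH=$PATH:/path/to/scala/bin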

  2. Install Maven (2.0.x or later) from https://maven.apache.org/download.cgi and be sure to add Maven to your PATH.

  3. Download SBT v0.13.5 or later from http://www.scala-sbt.org/download.html.
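
   To confirm which SBT version is picked up, you can run the following from inside an SBT project (for example, the SciSpark folder downloaded in step 6):

    sbt sbtVersion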

  4. Install Spark 2.0.0, which depends on Scala 2.11, from http://spark.apache.org/downloads.html.
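
   A minimal sketch of the extract step, assuming the pre-built package spark-2.0.0-bin-hadoop2.7.tgz (the exact filename depends on which package you choose on the downloads page):

    # filename is an example; use the package you actually downloaded
    tar -xzf spark-2.0.0-bin-hadoop2.7.tgz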

  5. Set the SPARK_HOME environment variable to the Spark installation folder, e.g. SPARK_HOME = /path/to/installation.
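
   In a Bash shell this could look like (the path is a placeholder for your Spark folder):

    # /path/to/installation is a placeholder for your Spark installation folder
    export SPARK_HOME=/path/to/installation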

  6. Download the latest version of SciSpark from https://github.com/SciSpark/SciSpark
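
   One way to do this is to clone the repository (assuming git is installed):

    git clone https://github.com/SciSpark/SciSpark.git
    cd SciSpark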

  7. Within your SciSpark folder, run sbt clean assembly

  8. Find your SciSpark jar (SciSpark.jar or similarly named) and note its path, which will look like /path_to_SciSpark/target/scala-2.11/SciSpark.jar. To build SciSpark for different Spark and Scala version combinations, see the NOTE at the bottom.

  9. Download and untar Zeppelin 0.5.6 at https://zeppelin.incubator.apache.org/download.html
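
   A minimal sketch, assuming the binary package is named zeppelin-0.5.6-incubating-bin-all.tgz (check the actual filename on the download page):

    # filename is an example; use the archive you actually downloaded
    tar -xzf zeppelin-0.5.6-incubating-bin-all.tgz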

  10. Find zeppelin-env.sh.template in Zeppelin's conf folder and create zeppelin-env.sh with the following command:

    cp zeppelin-env.sh.template zeppelin-env.sh
    
  11. Point your configuration to your SciSpark jar file by adding the following to zeppelin-env.sh:

    export ZEPPELIN_JAVA_OPTS="-Dspark.jars=/path/to/SciSpark.jar"
    export SPARK_SUBMIT_OPTIONS="--jars /path/to/SciSpark.jar"
    
  12. Start Zeppelin:

    bin/zeppelin-daemon.sh start
    
  13. Open Zeppelin in your browser at http://localhost:8080/# and create a new note. Paste the following into the first cell:

    //SciSpark imports
    import org.dia.Parsers
    import org.dia.core.{ SciSparkContext, SciTensor }
    import org.dia.algorithms.mcs.MCSOps
    import org.dia.urlgenerators.RandomDatesGenerator
    
  14. Run this note. If it works, your configuration is set up correctly.

  15. Now, we want to change the skin of the notebook to a SciSpark theme. To do this, download a zip file of the Zeppelin web repo from https://github.com/SciSpark/scispark_zeppelin_web. Then go to your Zeppelin installation and replace all folders under webapps/webapp/ with the folders of the same name under the web repo's src folder, as sketched below.
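
   A minimal sketch of that replacement step, assuming the web repo was unzipped to /path/to/scispark_zeppelin_web and Zeppelin lives in /path/to/zeppelin (both paths are placeholders):

    # both paths are placeholders; point them at your unzipped web repo and your Zeppelin installation
    cp -r /path/to/scispark_zeppelin_web/src/* /path/to/zeppelin/webapps/webapp/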

Possible pitfalls:

Your browser may cache some of the web files, resulting in a page that does not display the SciSpark skin correctly. If you suspect this is the case, force a reload that bypasses the cache with Command + Shift + R (on macOS).

NOTE: SciSpark can be built for multiple Scala and Spark versions. Currently, the following combinations have been tested and are working:

spark=1.6.0 scala=2.10.6
spark=2.0.0 scala=2.10.6
spark=2.0.0 scala=2.11.2

By default, sbt clean assembly builds SciSpark for Spark 2.0.0 and Scala 2.11.2 (the most recent tested combination). If you need to build the SciSpark jar for older versions, you can specify the parameters like so:

sbt -Dspark.version=1.6.0 -Dscala.version=2.10.6 clean assembly

NB: Sometimes the build fails because of existing files in ~/.ivy2/cache. The usual culprit of such an error is nd4j. An example of such an error in the log is:

(*:update) sbt.ResolveException: download failed: org.nd4j#nd4j-native;0.5.0!nd4j-native.jar

If this happens, try the following instructions.

cd ~/.ivy2/cache
rm -r org.nd4j/

Then retry the build. See the SBT documentation for more on this.

If you need to build Apache Spark for Scala 2.10, run the following commands from your Spark folder:

./dev/change-scala-version.sh 2.10
mvn -Pyarn -Phadoop-2.4 -Dscala-2.10 -DskipTests clean package