Thursday, September 8, 2016

Install Spark on Mac

Lightly edited from http://genomegeek.blogspot.com/2014/11/how-to-install-apache-spark-on-mac-os-x.html


  • Install Java
    - Download Oracle Java SE Development Kit 7 or 8 at Oracle JDK downloads page.
    - Double click on .dmg file to start the installation
    - Open up the terminal.
    - Type java -version; it should display something like the following:

    java version "1.7.0_71" 
    Java(TM) SE Runtime Environment (build 1.7.0_71-b14) 
    Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)  
  • Set JAVA_HOME 
export JAVA_HOME=$(/usr/libexec/java_home) 
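
The version check above can also be scripted. The following is a minimal sketch, not part of the original steps: java_major is a hypothetical helper that pulls the major version out of a string such as "1.7.0_71", so you can confirm that a 7 or 8 JDK is the one on the PATH.

```shell
# Hypothetical helper: extract the major version from a version string
# such as "1.7.0_71" (legacy 1.x scheme) or "11.0.2" (newer scheme).
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 ;;
  esac
}

# Only query the JDK if java is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; its first line looks like:
  #   java version "1.7.0_71"
  version=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
  echo "Java major version: $(java_major "$version")"
fi
```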

  • Install Homebrew 
  • Install Scala
brew install scala 
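  • Set SCALA_HOME to the Scala install directory, not the scala binary (the path below assumes Homebrew's default layout; Homebrew already links scala into /usr/local/bin)
export SCALA_HOME=/usr/local/opt/scala/libexec
export PATH=$PATH:$SCALA_HOME/bin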
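
The export lines in these steps only affect the current terminal session. One way to keep them is to add them to a shell startup file. The sketch below assumes bash and ~/.bash_profile; append_once is a hypothetical helper, not something Homebrew or Spark provides, and it avoids adding the same line twice if you rerun it.

```shell
# Hypothetical helper: append a line to a file unless it is already there.
append_once() {
  line=$1
  file=$2
  touch "$file"
  # -F: fixed string, -x: match the whole line, -q: no output
  grep -Fxq -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Assumption: bash reads ~/.bash_profile at login; use another file for other shells.
# Example call (left commented out so nothing is modified by accident):
#   append_once 'export JAVA_HOME=$(/usr/libexec/java_home)' "$HOME/.bash_profile"
```

Open a new terminal (or source the file) for the change to take effect.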

  • Install Spark 
sudo chown -R $(whoami) /usr/local
brew install apache-spark 
  • Set SPARK_HOME and add Spark's Python libraries to PYTHONPATH (replace 2.0.0 with the version Homebrew installed)
export SPARK_HOME=/usr/local/Cellar/apache-spark/2.0.0
export PYTHONPATH=$SPARK_HOME/libexec/python:$SPARK_HOME/libexec/python/build:$PYTHONPATH
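
The version number in SPARK_HOME goes stale whenever Homebrew upgrades Spark. As a rough sketch, the directory can be looked up instead of hardcoded; latest_version_dir is a hypothetical helper, and it assumes the Cellar subdirectories are named as plain x.y.z versions.

```shell
# Hypothetical helper: print the highest x.y.z version directory under $1.
latest_version_dir() {
  [ -d "$1" ] || return 0
  # Sort numerically on each dot-separated field so 1.10.1 ranks above 1.6.2.
  ls "$1" | sort -t. -k1,1n -k2,2n -k3,3n | tail -n 1
}

cellar=/usr/local/Cellar/apache-spark   # same path as in the step above
version=$(latest_version_dir "$cellar")
if [ -n "$version" ]; then
  export SPARK_HOME="$cellar/$version"
  # Prepend Spark's Python sources; keep any existing PYTHONPATH entries.
  export PYTHONPATH="$SPARK_HOME/libexec/python${PYTHONPATH:+:$PYTHONPATH}"
fi
```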

  • Start Spark (run the commands below from $SPARK_HOME/libexec)
For the Scala shell: 
./bin/spark-shell
For the Python shell: 
./bin/pyspark 
  • Run Examples 
Calculate Pi: 
./bin/run-example org.apache.spark.examples.SparkPi 
MLlib Correlations example: 
./bin/run-example org.apache.spark.examples.mllib.Correlations 
MLlib Linear Regression example: 
./bin/spark-submit \
  --class org.apache.spark.examples.mllib.LinearRegression \
  examples/target/scala-*/spark-*.jar data/mllib/sample_linear_regression_data.txt
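
The jar path in the last command depends on how Spark was installed, so the glob may not match anything. The sketch below searches for the examples jar instead. find_examples_jar is a hypothetical helper, and the spark-examples*.jar name pattern is an assumption about how the jar is named in your install.

```shell
# Hypothetical helper: print the first Spark examples jar found under $1/examples.
find_examples_jar() {
  [ -d "$1/examples" ] || return 0
  find "$1/examples" -name 'spark-examples*.jar' | head -n 1
}

# Fall back to the path used earlier in this post if SPARK_HOME is not set.
spark_dir="${SPARK_HOME:-/usr/local/Cellar/apache-spark/2.0.0}/libexec"
jar=$(find_examples_jar "$spark_dir")
if [ -n "$jar" ] && [ -x "$spark_dir/bin/spark-submit" ]; then
  "$spark_dir/bin/spark-submit" \
    --class org.apache.spark.examples.mllib.LinearRegression \
    "$jar" "$spark_dir/data/mllib/sample_linear_regression_data.txt"
else
  echo "Spark examples jar not found under $spark_dir/examples" >&2
fi
```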

