First things first: you've got to have the latest Java.
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java8-installer
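Before moving on, it's worth confirming the JDK actually took. The exact build will vary; mine reports the same 1.8.0_66 you'll see again in the Scala banner below:

$ java -version
java version "1.8.0_66"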
Next, install Scala. The latest version as of today is 2.11.7, but I ran into a brick wall with that version, so I'm using 2.10.6 (see the notes below). Then grab git and the Spark 1.6.0 sources, and build:
$ wget http://www.scala-lang.org/files/archive/scala-2.10.6.deb
$ sudo dpkg -i scala-2.10.6.deb
$ scala
Welcome to Scala version 2.10.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_66).
Type in expressions to have them evaluated.
Type :help for more information.

scala> :q

$ sudo apt-get install git
$ wget http://d3kbcqa49mib13.cloudfront.net/spark-1.6.0.tgz
$ tar xvf spark-1.6.0.tgz
$ rm spark-1.6.0.tgz
$ cd spark-1.6.0/
$ build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.4 -DskipTests clean package
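That Maven build takes a good while. Once it's done, a quick smoke test straight from the spark-1.6.0 directory: start the shell and run a trivial distributed job. This is a minimal sketch; sc is the SparkContext that spark-shell creates for you:

$ bin/spark-shell
scala> sc.parallelize(1 to 1000).filter(_ % 2 == 0).count()
res0: Long = 500
scala> :q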
Notes:
:q quits the Scala shell.
Spark does not yet support its JDBC component for Scala 2.11.
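If you do need Scala 2.11 and can live without the JDBC component, the Spark build documentation describes a helper script that switches the build over. Roughly (untested here, since I stayed on 2.10):

$ ./dev/change-scala-version.sh 2.11
$ build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.4 -Dscala-2.11 -DskipTests clean package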
With the build done, run the bundled SparkPi example to sanity-check everything:

$ ./run-example SparkPi
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/02/13 06:56:35 INFO SparkContext: Running Spark version 1.6.0
16/02/13 06:56:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/13 06:56:35 WARN Utils: Your hostname, .... resolves to a loopback address: 127.0.1.1; using 192.168.1.140 instead (on interface wlan0)
16/02/13 06:56:35 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/02/13 06:56:35 INFO SecurityManager: Changing view acls to: ...
16/02/13 06:56:35 INFO SecurityManager: Changing modify acls to: ....
16/02/13 06:56:35 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(...); users with modify permissions: Set(...)
16/02/13 06:56:36 INFO Utils: Successfully started service 'sparkDriver' on port 34966.
16/02/13 06:56:36 INFO Slf4jLogger: Slf4jLogger started
16/02/13 06:56:36 INFO Remoting: Starting remoting
16/02/13 06:56:36 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.1.140:49553]
16/02/13 06:56:36 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 49553.
16/02/13 06:56:36 INFO SparkEnv: Registering MapOutputTracker
16/02/13 06:56:36 INFO SparkEnv: Registering BlockManagerMaster
16/02/13 06:56:36 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-1e81dc84-91ff-4503-ab3e-7fa8adbac78e
16/02/13 06:56:36 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
16/02/13 06:56:36 INFO SparkEnv: Registering OutputCommitCoordinator
16/02/13 06:56:36 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/02/13 06:56:36 INFO SparkUI: Started SparkUI at http://192.168.1.140:4040
16/02/13 06:56:36 INFO HttpFileServer: HTTP File server directory is /tmp/spark-29f5d77e-a0ea-4986-95fb-6d3b9104c18f/httpd-6774316e-0973-43c7-b58d-3cc0a1991f95
16/02/13 06:56:36 INFO HttpServer: Starting HTTP Server
16/02/13 06:56:36 INFO Utils: Successfully started service 'HTTP file server' on port 42625.
16/02/13 06:56:37 INFO SparkContext: Added JAR file:/home/.../spark/spark-1.6.0/examples/target/scala-2.10/spark-examples-1.6.0-hadoop2.6.4.jar at http://192.168.1.140:42625/jars/spark-examples-1.6.0-hadoop2.6.4.jar with timestamp 1455368197141
16/02/13 06:56:37 INFO Executor: Starting executor ID driver on host localhost
16/02/13 06:56:37 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 58239.
16/02/13 06:56:37 INFO NettyBlockTransferService: Server created on 58239
16/02/13 06:56:37 INFO BlockManagerMaster: Trying to register BlockManager
16/02/13 06:56:37 INFO BlockManagerMasterEndpoint: Registering block manager localhost:58239 with 511.1 MB RAM, BlockManagerId(driver, localhost, 58239)
16/02/13 06:56:37 INFO BlockManagerMaster: Registered BlockManager
16/02/13 06:56:37 INFO SparkContext: Starting job: reduce at SparkPi.scala:36
16/02/13 06:56:37 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:36) with 2 output partitions
16/02/13 06:56:37 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:36)
16/02/13 06:56:37 INFO DAGScheduler: Parents of final stage: List()
16/02/13 06:56:37 INFO DAGScheduler: Missing parents: List()
16/02/13 06:56:37 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:32), which has no missing parents
16/02/13 06:56:38 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1888.0 B, free 1888.0 B)
16/02/13 06:56:38 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1202.0 B, free 3.0 KB)
16/02/13 06:56:38 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:58239 (size: 1202.0 B, free: 511.1 MB)
16/02/13 06:56:38 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
16/02/13 06:56:38 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:32)
16/02/13 06:56:38 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
16/02/13 06:56:38 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 2156 bytes)
16/02/13 06:56:38 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 2156 bytes)
16/02/13 06:56:38 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/02/13 06:56:38 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
16/02/13 06:56:38 INFO Executor: Fetching http://192.168.1.140:42625/jars/spark-examples-1.6.0-hadoop2.6.4.jar with timestamp 1455368197141
16/02/13 06:56:38 INFO Utils: Fetching http://192.168.1.140:42625/jars/spark-examples-1.6.0-hadoop2.6.4.jar to /tmp/spark-29f5d77e-a0ea-4986-95fb-6d3b9104c18f/userFiles-8a089d63-89cb-4bf4-a116-b1dcfb15e6c1/fetchFileTemp3377846363451941090.tmp
16/02/13 06:56:38 INFO Executor: Adding file:/tmp/spark-29f5d77e-a0ea-4986-95fb-6d3b9104c18f/userFiles-8a089d63-89cb-4bf4-a116-b1dcfb15e6c1/spark-examples-1.6.0-hadoop2.6.4.jar to class loader
16/02/13 06:56:38 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1031 bytes result sent to driver
16/02/13 06:56:38 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1031 bytes result sent to driver
16/02/13 06:56:38 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 762 ms on localhost (1/2)
16/02/13 06:56:38 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 736 ms on localhost (2/2)
16/02/13 06:56:38 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/02/13 06:56:38 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:36) finished in 0.777 s
16/02/13 06:56:38 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:36, took 1.003757 s
Pi is roughly 3.13756 <<-- There it is!
16/02/13 06:56:38 INFO SparkUI: Stopped Spark web UI at http://192.168.1.140:4040
16/02/13 06:56:38 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/02/13 06:56:38 INFO MemoryStore: MemoryStore cleared
16/02/13 06:56:38 INFO BlockManager: BlockManager stopped
16/02/13 06:56:38 INFO BlockManagerMaster: BlockManagerMaster stopped
16/02/13 06:56:38 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/02/13 06:56:38 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/02/13 06:56:38 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/02/13 06:56:38 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/02/13 06:56:39 INFO SparkContext: Successfully stopped SparkContext
16/02/13 06:56:39 INFO ShutdownHookManager: Shutdown hook called
16/02/13 06:56:39 INFO ShutdownHookManager: Deleting directory /tmp/spark-29f5d77e-a0ea-4986-95fb-6d3b9104c18f
16/02/13 06:56:39 INFO ShutdownHookManager: Deleting directory /tmp/spark-29f5d77e-a0ea-4986-95fb-6d3b9104c18f/httpd-6774316e-0973-43c7-b58d-3cc0a1991f95
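Two asides on that wall of output. The WARN about the loopback address is harmless on a single machine, and the log itself names the fix: set SPARK_LOCAL_IP. And if the INFO chatter gets old, Spark ships a log4j template in conf/ that you can copy and dial down to WARN; something along these lines:

$ cp conf/log4j.properties.template conf/log4j.properties
$ sed -i 's/log4j.rootCategory=INFO/log4j.rootCategory=WARN/' conf/log4j.properties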
Finally, install SBT. The most current version is 0.13.9.
$ echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 642AC823
$ sudo apt-get update
$ sudo apt-get install sbt
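To confirm SBT is ready for Spark work, here is a minimal build.sbt for a standalone job. This is a sketch: the project name is made up, scalaVersion has to stay on 2.10.x to match the Spark build above, and spark-core is marked provided because spark-submit supplies it at runtime:

name := "spark-sandbox"   // hypothetical project name
version := "0.1.0"
scalaVersion := "2.10.6"  // must match the Scala version Spark was built with

// %% appends the Scala binary version, so this resolves to spark-core_2.10
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0" % "provided"

Running sbt package in that project then produces a jar you can hand to bin/spark-submit.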