Installing and Testing sbt

1. Download

wget https://github.com/sbt/sbt/releases/download/v0.13.15/sbt-0.13.15.tgz

2. Install

mkdir -p /root/scala/sbt
tar -zxvf sbt-0.13.15.tgz -C /root/scala/sbt

3. In the /root/scala/sbt directory, create a launcher script named sbt:

vim sbt

#!/bin/bash
# JVM options for the sbt launcher; note that -XX:MaxPermSize is ignored
# on Java 8+ (it produces the warning seen in step 6)
SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=1024M"
# Forward all command-line arguments to the launcher jar
java $SBT_OPTS -jar /root/scala/sbt/bin/sbt-launch.jar "$@"
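The trailing "$@" forwards every argument given to the wrapper on to the launcher jar with quoting preserved. A minimal stand-alone sketch of the pattern, with printf standing in for the java invocation:

```shell
# Demonstrate how "$@" forwards arguments intact, including ones containing spaces.
forward() {
    # stands in for: java $SBT_OPTS -jar sbt-launch.jar "$@"
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}
forward compile "set name := demo"
# prints:
# [compile]
# [set name := demo]
```

Had the script used an unquoted $@ or $*, the second argument would have been split into four separate words.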

4. Make the script executable

$ chmod u+x sbt 
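chmod u+x adds the execute bit for the file's owner only; test -x is a quick way to confirm it took effect. A throwaway sketch (the file name here is a stand-in, not the real wrapper):

```shell
# Create a stand-in file, make it owner-executable, and verify.
touch sbt-demo           # stand-in for the sbt wrapper script
chmod u+x sbt-demo
test -x sbt-demo && echo "sbt-demo is executable"
rm -f sbt-demo
```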

5. Configure environment variables

[root@host ~]# vim  ~/.bashrc 

# Append the following line at the end of the file, then save and exit
export PATH=/root/scala/sbt/:$PATH

$ source ~/.bashrc
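Prepending the directory to PATH makes this sbt shadow any other copy on the system, because command lookup searches PATH entries left to right. A self-contained sketch (the directory name is illustrative):

```shell
# Prepend a directory to PATH; the first entry is searched first.
demo_dir=/opt/demo-sbt            # illustrative; the tutorial uses /root/scala/sbt/
PATH="$demo_dir:$PATH"
echo "$PATH" | cut -d: -f1
# prints: /opt/demo-sbt
```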
If you are behind an HTTP proxy, add the proxy settings to sbt's configuration file at ./sbt/conf/sbtconfig.txt:

-Dhttp.proxyHost=proxy.zte.com.cn
-Dhttp.proxyPort=80

6. Test

[root@host sbt]# sbt sbt-version
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=1024M; support was removed in 8.0
WARN: No sbt.version set in project/build.properties, base directory: /root/scala/sbt
[warn] Executing in batch mode.
[warn] For better performance, hit [ENTER] to switch to interactive mode, or
[warn] consider launching sbt without any commands, or explicitly passing 'shell'
[info] Set current project to sbt (in build file:/root/scala/sbt/)
[info] 0.13.15

7. Write a Scala application

cd    # go to the user's home directory

mkdir sparkapp    # create the application's root directory

mkdir -p sparkapp/src/main/scala    # create the required source directory structure
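sbt expects exactly this src/main/scala layout under the project root. The commands above can be checked in a scratch directory:

```shell
# Build the sbt-standard layout in a temporary directory and list it.
root=$(mktemp -d)
mkdir -p "$root/sparkapp/src/main/scala"
( cd "$root" && find sparkapp -type d | sort )
# prints:
# sparkapp
# sparkapp/src
# sparkapp/src/main
# sparkapp/src/main/scala
rm -rf "$root"
```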

Create the source file with vim sparkapp/src/main/scala/SimpleApp.scala:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    // Input file path; resolved against the default filesystem (HDFS in step 9)
    val logFile = "/tmp/20171024/20171024.txt"
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    // Read the file as an RDD of lines (2 partitions) and cache it
    val logData = sc.textFile(logFile, 2).cache()
    // Print every line; with multiple partitions the order is not guaranteed
    logData.foreach(println)
  }
}
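Before the job in step 9 can run, the input file must exist; that run reads it from HDFS. The sketch below creates a local sample file of whitespace-separated numbers like the one used here (the contents and the temporary location are illustrative; on a real cluster you would upload it with hdfs dfs -put):

```shell
# Create a sample input file like the one SimpleApp reads.
# To place it on HDFS as in step 9, something like:
#   hdfs dfs -mkdir -p /tmp/20171024 && hdfs dfs -put 20171024.txt /tmp/20171024/
dir=$(mktemp -d)                  # local scratch dir; the job reads /tmp/20171024
printf '10 20 30 50 80\n100 60 90 60 60\n' > "$dir/20171024.txt"
cat "$dir/20171024.txt"
rm -rf "$dir"
```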

8. Package the Scala program with sbt

The program depends on the Spark API, so we need to compile and package it with sbt. In sparkapp, create a new file simple.sbt (vim sparkapp/simple.sbt) with the following content, which declares the application's metadata and its dependency on Spark:

name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"

simple.sbt must specify the Spark and Scala versions: scalaVersion sets the Scala version, and the spark-core dependency sets the Spark version. Both version numbers appear in the banner printed when the Spark shell starts. Note that the %% operator appends the Scala binary version to the artifact name (producing spark-core_2.11 here), so the two versions must match the Spark build you run against.

We can then package the whole application into a JAR as follows:

[root@host ~]# cd sparkapp
[root@host sparkapp]# /root/scala/sbt/sbt package
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=1024M; support was removed in 8.0
[warn] Executing in batch mode.
[warn] For better performance, hit [ENTER] to switch to interactive mode, or
[warn] consider launching sbt without any commands, or explicitly passing 'shell'
[info] Loading project definition from /root/sparkapp/project
[info] Set current project to Simple Project (in build file:/root/sparkapp/)
[info] Compiling 1 Scala source to /root/sparkapp/target/scala-2.11/classes...
[info] Packaging /root/sparkapp/target/scala-2.11/simple-project_2.11-1.0.jar ...
[info] Done packaging.
[success] Total time: 4 s, completed Nov 6, 2017 11:23:49 AM

9. Run the program with spark-submit (the line of numbers in the log below is the program's output)

[root@host sparkapp]# $SPARK_HOME/bin/spark-submit --class "SimpleApp" /root/sparkapp/target/scala-2.11/simple-project_2.11-1.0.jar
17/11/06 11:24:00 INFO spark.SparkContext: Running Spark version 2.2.0
.................................................
17/11/06 11:24:03 INFO rdd.HadoopRDD: Input split: hdfs://localhost:9000/tmp/20171024/20171024.txt:23+23
17/11/06 11:24:03 INFO rdd.HadoopRDD: Input split: hdfs://localhost:9000/tmp/20171024/20171024.txt:0+23
17/11/06 11:24:04 INFO memory.MemoryStore: Block rdd_1_1 stored as values in memory (estimated size 16.0 B, free 366.0 MB)
17/11/06 11:24:04 INFO memory.MemoryStore: Block rdd_1_0 stored as values in memory (estimated size 160.0 B, free 366.0 MB)
17/11/06 11:24:04 INFO storage.BlockManagerInfo: Added rdd_1_1 in memory on 192.168.120.140:56693 (size: 16.0 B, free: 366.3 MB)
17/11/06 11:24:04 INFO storage.BlockManagerInfo: Added rdd_1_0 in memory on 192.168.120.140:56693 (size: 160.0 B, free: 366.3 MB)
10 20 30 50 80 100 60 90 60 60 31 80 70 51 50
.......................................
17/11/06 11:24:04 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-dcdfa23e-2972-49d0-9e1d-9643d5f25eb4