Contents
  1. Install Git
  2. Configure the JDK
  3. Install Eclipse
  4. Install sbt
  5. Install Maven
  6. Configure Hadoop
  7. Install HBase
  8. Install Python

Development is generally more convenient on Linux, but Windows offers more tools and better day-to-day usability, so setting up a development environment on Windows is well worth doing. These are some lessons collected from my own work: with the environment configured on Windows you can run Python, Spark, and other programs. The flow may be a bit loose in places, since I mostly wrote things down as they came to mind and accumulated them bit by bit.

Install Git

Download Git, then generate your SSH key on Windows; it lives in the directory /c/Users/<your computer name>/.ssh:

id_rsa  id_rsa.pub  known_hosts  known_hosts.old

Next, copy the key string from id_rsa.pub into GitLab. If a repository's host changes, clear the stale host key with the following command:

ssh-keygen -f "/***/****/.ssh/known_hosts" -R <repository hostname>

For hostname resolution on Windows, there is a hosts file at /c/Windows/System32/drivers/etc/hosts; write one entry per line, with the IP and the hostname separated by a space. Note that editing it requires administrator rights; if you don't have them, elevate your user's privileges and grant that user write permission on the file.
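For example, an entry mapping a hostname to an IP looks like this (both values here are made up for illustration):

192.168.1.100 gitlab.example.com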

Configure the JDK

No development environment is complete without the JDK. To configure it on Windows:

1. Download the JDK; this guide uses jdk1.8.0_101, available from the official JDK site.

2. Set the environment variables:
JAVA_HOME = D:\programs\Java\jdk1.8.0_101
CLASSPATH = %JAVA_HOME%\lib;%JAVA_HOME%\lib\tools.jar
Path = D:\programs\Java\jdk1.8.0_101\bin

3. Test in cmd with java -version.

On Linux, configure the environment variables in /etc/profile:

export JAVA_HOME=/home/min/programs/jdk/jdk1.8.0_74
export PATH=$JAVA_HOME/bin:$PATH
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=./:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
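
After editing /etc/profile, reload it and confirm the JDK is picked up (standard commands, shown here as a quick check):

source /etc/profile
java -version
javac -version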

Install Eclipse

Download Eclipse IDE for Java Developers Version: Neon Release (4.6.0) from the Eclipse website, then install the plugins:

Help -> Install New Software
add: scala  http://download.scala-ide.org/sdk/lithium/e44/scala211/stable/site
add: python http://pydev.org/updates

Then wait patiently for the installation to finish.

Install sbt

On Windows, first download the sbt package, then configure it as follows:

1. sbt/conf/repo.properties lists the repositories sbt resolves open-source packages from:
[repositories]
local
//comp-maven: http://repo.data.1verge.net/nexus/content/groups/public/
// store_cn: http://maven.oschina.net/content/groups/public/
//store_mir: http://mirrors.ibiblio.org/maven2/
// store_0: http://maven.net.cn/content/groups/public/
store_1: http://repo.typesafe.com/typesafe/ivy-releases/
//store_2: http://repo2.maven.org/maven2/
sbt-releases-repo: http://repo.typesafe.com/typesafe/ivy-releases/, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
sbt-plugins-repo: http://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/, [organization]/[module]/(scala_[scalaVersion]/)(sbt_[sbtVersion]/)[revision]/[type]s/[artifact](-[classifier]).[ext]
maven-central: http://repo1.maven.org/maven2/
2. sbt/conf/sbtconfig.txt sets the JVM options and where sbt stores downloaded packages:
# Set the java args to high
-Xmx512M
-XX:MaxPermSize=256m
-XX:ReservedCodeCacheSize=128m

# Set the extra SBT options
-Dsbt.log.format=true
-Dsbt.ivy.home=D:/programs/sbt/.ivy2
-Dsbt.global.base=D:/programs/sbt/.sbt
-Dsbt.repository.config=D:/programs/sbt/conf/repo.properties

3. Environment variable: add D:\programs\sbt\bin to PATH.

On Linux, create a launcher script at /bin/sbt with the following contents:

SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256M -Dsbt.override.build.repos=true"
java $SBT_OPTS -jar `dirname $0`/sbt-launch.jar "$@"
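
Make the script executable and keep sbt-launch.jar in the same directory (the script locates it via dirname $0); a quick way to check that it works:

chmod +x /bin/sbt
sbt about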

Write the repositories sbt should download packages from into $HOME/.sbt/repositories:

[repositories]
local
my-flinkspector:https://dl.bintray.com/ottogroup/maven/
my-sbt-releases:https://repository.apache.org/content/repositories/releases/
my-sbt-public:https://repository.apache.org/content/repositories/public/
my-maven-public:http://repo.maven.apache.org/maven2/
my-maven-proxy-releases: http://maven.nlpcn.org/

Install Maven

Download the Maven package, then configure the environment variables:

On Linux, in /etc/profile:
export MAVEN_HOME=/home/min/programs/maven/maven3.0.4
export PATH=$MAVEN_HOME/bin:$PATH

On Windows:
PATH D:\programs\maven-3.0.4\bin
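
To confirm the installation, print the Maven version (a standard check):

mvn -v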

Configure Hadoop

On Windows, the easiest route is to try third-party prebuilt Hadoop binaries; download a package such as:

hadoop-common-2.2.0-bin/bin:
hadoop, hadoop.cmd, hadoop.dll, hadoop.exp, hadoop.lib, hadoop.pdb, winutils.exe, etc.

Set the environment variables:
HADOOP_HOME = E:\workspace\native-hadoop-bin\hadoop-common-2.2.0-bin
Path = %HADOOP_HOME%\bin
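
To sanity-check the native binaries, you can run winutils.exe directly from cmd (running it with no arguments should print its usage summary; the path below assumes the HADOOP_HOME above):

%HADOOP_HOME%\bin\winutils.exe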

Steps to configure Hadoop on Linux:

1. Make sure you can ssh into your own machine. Setting up passwordless login is optional; without it you merely have to type your password each time. A sketch of setting it up follows.
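
A quick sketch of passwordless login to localhost (standard OpenSSH commands):

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost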

2. Download hadoop-2.2.0.

3. Configure the following files, all under hadoop-2.2.0/etc/hadoop:

core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/min/programs/hadoop/hadoop-2.2.0/tmp</value>
  </property>
</configuration>

hadoop-env.sh

# The java implementation to use.
export JAVA_HOME=/home/min/programs/jdk/jdk1.8.0_74

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/min/programs/hadoop/hadoop-2.2.0/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/min/programs/hadoop/hadoop-2.2.0/dfs/data</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

4. Start Hadoop:

cd /***/hadoop-2.2.0/bin

./hadoop namenode -format

cd /***/hadoop-2.2.0/sbin

./start-all.sh
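
Once start-all.sh finishes, jps (shipped with the JDK) should list the Hadoop daemons, typically something like NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager:

jps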

Install HBase

1. Download hbase-1.0.3.

2. Configure:

hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>/home/min/programs/hbase-1.0.3/hbaseData</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>127.0.0.1</value>
  </property>
  <property>
    <name>hbase.rest.port</name>
    <value>8080</value>
  </property>
  <property>
    <name>hbase.rest.readonly</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rest.authentication.type</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hbase.rest.authentication.kerberos.principal</name>
    <value>HTTP/_HOST@HADOOP.LOCALDOMAIN</value>
  </property>
  <property>
    <name>hbase.rest.authentication.kerberos.keytab</name>
    <value>$KEYTAB</value>
  </property>
</configuration>


hbase-env.sh

# The java implementation to use. Java 1.7+ required.
export JAVA_HOME=/home/min/programs/jdk/jdk1.8.0_74

3. Start HBase:

cd /**/hbase-1.0.3/bin
./start-hbase.sh
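
To verify it came up, open the HBase shell and ask for the cluster status (status is a built-in shell command):

./hbase shell
status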

Install Python

Installing Python on Windows only takes downloading the installer from the official site and setting the environment variable:

Path, D:\programs\python35

Extra Python packages can all be installed with pip:

cd D:\programs\python35\Scripts

pip install requests
pip install beautifulsoup4
pip install jieba
pip install py4j -i https://pypi.douban.com/simple
pip install hmmlearn

Copy the pyspark directory under \spark-1.6.0-bin-hadoop2.6\python into python35\Lib\site-packages\.
To run Spark Python programs on Windows, you need to point SPARK_HOME at the local Spark directory (using a raw string so the backslashes are not treated as escapes):
import os
os.environ["SPARK_HOME"] = r"E:\programs\spark\spark-2.0.0-bin-hadoop2.6"
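
As a minimal smoke test, the following sketch (assuming the SPARK_HOME path above and a matching pyspark importable from site-packages, with py4j installed as earlier) runs a tiny local job:

import os
# Same as above: point at the local Spark installation
os.environ["SPARK_HOME"] = r"E:\programs\spark\spark-2.0.0-bin-hadoop2.6"

from pyspark import SparkContext

# Run Spark locally using all available cores
sc = SparkContext("local[*]", "smoke-test")
# Sum the numbers 0..9; prints 45 if Spark is working
print(sc.parallelize(range(10)).sum())
sc.stop()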