Hadoop Environment Setup
Content source: 拉勾教育 Java 高薪训练营 (LaGou Education Java training camp)
Cluster planning
Hadoop, HBase, and Hive will be installed on three virtual machines: 192.168.3.24, 192.168.3.7, and 192.168.3.8.
hosts configuration
/etc/hosts
192.168.3.24 centos1
192.168.3.7 centos2
192.168.3.8 centos3
Environment variables
First install the JDK, then configure the environment variables:
export JAVA_HOME=/opt/soft/jdk1.8.0_45
export PATH=$PATH:$JAVA_HOME/bin:/opt/soft/apache-hive-2.3.6-bin/bin:/opt/soft/hadoop-2.9.2/bin
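A quick sanity check of the environment, assuming the exports above were appended to /etc/profile:
source /etc/profile
java -version
hadoop version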
Hadoop environment
Install Hadoop
The configuration files live under /opt/soft/hadoop-2.9.2/etc/hadoop and must be identical on all three machines.
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/hadoop/tmp</value>
<description>Abase for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://centos1:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/usr/local/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/local/hadoop/hdfs/data</value>
</property>
</configuration>
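The local directories referenced above should exist on every node; creating them up front is a safe precaution (Hadoop can create some of them itself):
mkdir -p /usr/local/hadoop/tmp /usr/local/hadoop/hdfs/name /usr/local/hadoop/hdfs/data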
hadoop-env.sh
Append the following line at the end of the file:
export JAVA_HOME=/opt/soft/jdk1.8.0_45
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- Legacy JobTracker address; ignored when mapreduce.framework.name is yarn, kept here as in the original setup -->
<property>
<name>mapred.job.tracker</name>
<value>centos1:9001</value>
</property>
</configuration>
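Note: the Hadoop 2.x distribution ships only mapred-site.xml.template, so the file may need to be created first:
cp /opt/soft/hadoop-2.9.2/etc/hadoop/mapred-site.xml.template /opt/soft/hadoop-2.9.2/etc/hadoop/mapred-site.xml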
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>centos1</value>
</property>
</configuration>
slaves
centos1
centos2
centos3
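Note that start-all.sh (used below) starts the daemons over SSH, so passwordless SSH from centos1 to all three machines (including itself) should be set up first. A minimal sketch, run on centos1:
ssh-keygen -t rsa        # accept the defaults
ssh-copy-id centos1
ssh-copy-id centos2
ssh-copy-id centos3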
Start Hadoop
Initialize
Run the following on the master node, centos1:
/opt/soft/hadoop-2.9.2/bin/hdfs namenode -format
Start
/opt/soft/hadoop-2.9.2/sbin/start-all.sh
jps
Expected processes on each machine (check with jps):
centos1: NameNode, SecondaryNameNode, ResourceManager (and, since centos1 is also listed in slaves, a DataNode and NodeManager as well)
centos2: DataNode, NodeManager
centos3: DataNode, NodeManager
Verification
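To verify, open the NameNode web UI at http://centos1:50070 and the ResourceManager UI at http://centos1:8088 (the Hadoop 2.x default ports), or check the cluster report from the command line:
/opt/soft/hadoop-2.9.2/bin/hdfs dfsadmin -report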
Hive environment
Hive depends on MySQL, so make sure MySQL is installed first.
Install Hive
Configure hive-env.sh
JAVA_HOME=/opt/soft/jdk1.8.0_45
HADOOP_HOME=/opt/soft/hadoop-2.9.2
HIVE_HOME=/opt/soft/apache-hive-2.3.6-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf
Configure hive-site.xml
Note: if you want to enable the visual Hive web interface, do not casually modify this configuration file!
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://centos1:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>6342180</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>true</value>
</property>
<property>
<name>datanucleus.autoCreateTables</name>
<value>true</value>
</property>
<property>
<name>datanucleus.autoCreateColumns</name>
<value>true</value>
</property>
<!-- Location of the Hive warehouse on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.hwi.listen.host</name>
<value>centos1</value>
<description>This is the host address the Hive Web Interface will listen on</description>
</property>
<property>
<name>hive.hwi.listen.port</name>
<value>9999</value>
<description>This is the port the Hive Web Interface will listen on</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>centos1</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.http.port</name>
<value>10001</value>
</property>
<property>
<name>hive.server2.thrift.http.path</name>
<value>cliservice</value>
</property>
<!-- HiveServer2 web UI -->
<property>
<name>hive.server2.webui.host</name>
<value>centos1</value>
</property>
<property>
<name>hive.server2.webui.port</name>
<value>10002</value>
</property>
<property>
<name>hive.scratch.dir.permission</name>
<value>755</value>
</property>
</configuration>
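Before initializing the metastore schema, the MySQL JDBC driver must be on Hive's classpath; copy the connector jar into Hive's lib directory (the jar name below is an example, use whichever version is installed):
cp mysql-connector-java-5.1.47.jar /opt/soft/apache-hive-2.3.6-bin/lib/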
Initialize
/opt/soft/apache-hive-2.3.6-bin/bin/schematool -dbType mysql -initSchema
Start
/opt/soft/apache-hive-2.3.6-bin/bin/hive --service metastore
/opt/soft/apache-hive-2.3.6-bin/bin/hiveserver2
The first command may not be needed (when hive.metastore.uris is not configured, HiveServer2 uses an embedded metastore); still to be tested.
Verification
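A quick check with beeline, assuming HiveServer2 is listening on the port configured above and root is allowed to connect:
/opt/soft/apache-hive-2.3.6-bin/bin/beeline -u jdbc:hive2://centos1:10000 -n root -e "show databases;"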
Installing a ZooKeeper cluster with Docker
docker-compose.yaml
Note: ZOO_MY_ID must be different on each of the three machines.
version: "2"
networks:
  zk:
services:
  zookeeper1:
    image: zookeeper:3.4
    container_name: zk1.cloud
    network_mode: host
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"
    volumes:
      - /opt/soft/zookeeper/conf:/conf
      - /opt/soft/zookeeper/datalog:/datalog
    environment:
      ZOO_MY_ID: 1
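For the other two machines, following the pattern above (the naming is an assumption): on centos2 change ZOO_MY_ID to 2 and the container name to something like zk2.cloud, and on centos3 use ZOO_MY_ID: 3.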
zoo.cfg
This file lives in /opt/soft/zookeeper/conf and is identical on all three machines.
With Docker one usually just maps in a config file; mapping ZooKeeper's cluster ports through a Docker network is cumbersome, so the docker-compose.yaml above uses host networking instead (which also means the ports mapping is effectively ignored), making the cluster easier to wire up.
clientPort=2181
dataDir=/data
dataLogDir=/datalog
tickTime=2000
initLimit=5
syncLimit=2
autopurge.snapRetainCount=3
autopurge.purgeInterval=0
maxClientCnxns=60
server.1=centos1:2888:3888
server.2=centos2:2888:3888
server.3=centos3:2888:3888
Common operations
cd /opt/soft/zookeeper
Start
docker-compose up -d
Stop
docker-compose down
Verification
docker exec -it zk1.cloud zkServer.sh status
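In a healthy ensemble, one node should report Mode: leader and the other two Mode: follower.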
Install HBase
First, copy the Hadoop configuration files into HBase's conf directory:
cp /opt/soft/hadoop-2.9.2/etc/hadoop/hdfs-site.xml /opt/soft/hbase/hbase-1.3.1/conf
cp /opt/soft/hadoop-2.9.2/etc/hadoop/core-site.xml /opt/soft/hbase/hbase-1.3.1/conf
Edit hbase-env.sh
export JAVA_HOME=/opt/soft/jdk1.8.0_45
export HBASE_MANAGES_ZK=false
Edit the regionservers file
centos1
centos2
centos3
Edit hbase-site.xml
<configuration>
<!-- Path under which HBase stores its data on HDFS -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://centos1:9000/hbase</value>
</property>
<!-- Run HBase in fully distributed mode -->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- ZooKeeper quorum addresses, comma-separated -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>centos1:2181,centos2:2181,centos3:2181</value>
</property>
</configuration>
Distribute HBase
scp -r /opt/soft/hbase centos2:/opt/soft
scp -r /opt/soft/hbase centos3:/opt/soft
Start HBase
sh /opt/soft/hbase/hbase-1.3.1/bin/start-hbase.sh
Verification
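After startup, jps should show HMaster on centos1 and HRegionServer on all three nodes; the master web UI should be reachable at http://centos1:16010 (the HBase 1.x default port). The HBase shell gives a quick sanity check:
/opt/soft/hbase/hbase-1.3.1/bin/hbase shell
status    # run inside the shell; it should list the active master and three region servers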
Closing remarks: this article is a summary and review of part of the Hadoop module from my studies at the 拉勾 high-salary training camp. (PS: in this camp I not only learned the material but also got to know some excellent people: teacher 木槿, who combines beauty and wisdom, and our class assistant, gentle yet strict and responsible.)
And while I'm at it, a dating ad for myself: 卢学霸, male, interested in women. (The remaining 10,000 words are omitted.)