Hadoop environment setup


Article content source: Lagou Education Java High-Salary Training Camp

Cluster planning

Hadoop, HBase, and Hive will be installed on three virtual machines: 192.168.3.24, 192.168.3.7, and 192.168.3.8.

hosts configuration

/etc/hosts

192.168.3.24 centos1
192.168.3.7 centos2
192.168.3.8 centos3
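Rather than editing by hand on each VM, the entries can be generated once and appended to /etc/hosts on every node (a minimal sketch; the scratch file name hosts.fragment is an arbitrary choice, not from the original article):

```shell
# Write the cluster's name/IP mappings to a scratch file, then append it
# to /etc/hosts on each machine, e.g. with: sudo tee -a /etc/hosts < hosts.fragment
cat > hosts.fragment <<'EOF'
192.168.3.24 centos1
192.168.3.7 centos2
192.168.3.8 centos3
EOF
cat hosts.fragment
```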

Environment variables

First, install the JDK and configure the environment variables.

export JAVA_HOME=/opt/soft/jdk1.8.0_45
export PATH=$PATH:$JAVA_HOME/bin:/opt/soft/apache-hive-2.3.6-bin/bin:/opt/soft/hadoop-2.9.2/bin

Hadoop environment

Install Hadoop

The configuration files are under /opt/soft/hadoop-2.9.2/etc/hadoop; all three machines must be given identical copies.

core-site.xml

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://centos1:9000</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/hdfs/data</value>
  </property>
</configuration>

hadoop-env.sh

Add the following line at the end of the file:

export JAVA_HOME=/opt/soft/jdk1.8.0_45

mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>centos1:9001</value>
  </property>
</configuration>

yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>centos1</value>
  </property>
</configuration>

slaves

centos1
centos2
centos3
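Since all three machines must carry identical copies of etc/hadoop, it is easiest to edit the files on centos1 and push the directory out with scp (a sketch, assuming the same install path on every node and SSH access between them):

```shell
# Push the edited Hadoop config directory from centos1 to the other nodes.
CONF_DIR=/opt/soft/hadoop-2.9.2/etc/hadoop
for h in centos2 centos3; do
  scp -r "$CONF_DIR" "$h:/opt/soft/hadoop-2.9.2/etc/" || echo "copy to $h failed"
done
```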

Start Hadoop

Initialization

Run the following on the master, centos1:

/opt/soft/hadoop-2.9.2/bin/hdfs namenode -format

Start

/opt/soft/hadoop-2.9.2/sbin/start-all.sh

jps

Expected processes on each machine:

centos1: NameNode, SecondaryNameNode, ResourceManager (and, because centos1 is also listed in slaves, DataNode and NodeManager)

centos2: DataNode, NodeManager

centos3: DataNode, NodeManager

Verification
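A few checks from centos1 once start-all.sh has finished (a sketch; the ports are the Hadoop 2.x defaults, and /smoke-test is just a throwaway directory name chosen here):

```shell
# Smoke-test the cluster: the report should list three live DataNodes,
# and a small write proves HDFS accepts data.
HADOOP_HOME=/opt/soft/hadoop-2.9.2
"$HADOOP_HOME/bin/hdfs" dfsadmin -report || echo "hdfs not reachable"
"$HADOOP_HOME/bin/hdfs" dfs -mkdir -p /smoke-test || echo "mkdir failed"
"$HADOOP_HOME/bin/hdfs" dfs -ls / || echo "ls failed"
# Web UIs: NameNode at http://centos1:50070, ResourceManager at http://centos1:8088
```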

Hive environment

Hive depends on MySQL, so make sure MySQL is installed first.

Install Hive

Configure hive-env.sh

JAVA_HOME=/opt/soft/jdk1.8.0_45
HADOOP_HOME=/opt/soft/hadoop-2.9.2
HIVE_HOME=/opt/soft/apache-hive-2.3.6-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf

Configure hive-site.xml

Note: if you intend to enable the Hive web interface, do not modify this configuration file carelessly!

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://centos1:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>6342180</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoCreateTables</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoCreateColumns</name>
    <value>true</value>
  </property>
  <!-- Location of the Hive warehouse on HDFS -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/usr/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>centos1</value>
    <description>This is the host address the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>This is the port the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>centos1</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.http.port</name>
    <value>10001</value>
  </property>
  <property>
    <name>hive.server2.thrift.http.path</name>
    <value>cliservice</value>
  </property>
  <!-- HiveServer2 web UI -->
  <property>
    <name>hive.server2.webui.host</name>
    <value>centos1</value>
  </property>
  <property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
  </property>
  <property>
    <name>hive.scratch.dir.permission</name>
    <value>755</value>
  </property>
</configuration>

Initialization

/opt/soft/apache-hive-2.3.6-bin/bin/schematool -dbType mysql -initSchema

Start

/opt/soft/apache-hive-2.3.6-bin/bin/hive --service metastore
/opt/soft/apache-hive-2.3.6-bin/bin/hiveserver2

The first command may not be necessary; this remains to be tested.
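In practice both services are usually backgrounded with nohup so they keep running after logout (a sketch; the log locations under /tmp are arbitrary choices, not from the original article):

```shell
# Start the metastore and hiveserver2 in the background, logging to /tmp.
HIVE_HOME=/opt/soft/apache-hive-2.3.6-bin
nohup "$HIVE_HOME/bin/hive" --service metastore > /tmp/hive-metastore.log 2>&1 &
nohup "$HIVE_HOME/bin/hiveserver2" > /tmp/hiveserver2.log 2>&1 &
```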

Verification
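One way to verify is to connect with beeline, which ships with Hive, against the Thrift port configured in hive-site.xml (a sketch; credentials depend on how authentication is set up):

```shell
# Connect to hiveserver2 on centos1:10000 and run a trivial statement.
HIVE_HOME=/opt/soft/apache-hive-2.3.6-bin
"$HIVE_HOME/bin/beeline" -u jdbc:hive2://centos1:10000 -e 'show databases;' \
  || echo "hiveserver2 not reachable"
```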

ZooKeeper cluster with Docker

docker-compose.yaml

Note that ZOO_MY_ID must be different on each of the three machines.

version: "2"
networks:
  zk:
services:
  zookeeper1:
    image: zookeeper:3.4
    container_name: zk1.cloud
    network_mode: host
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"
    volumes:
      - /opt/soft/zookeeper/conf:/conf
      - /opt/soft/zookeeper/datalog:/datalog
    environment:
      ZOO_MY_ID: 1

zoo.cfg

This file lives in /opt/soft/zookeeper/conf and is identical on all three machines.

With Docker you normally just map in a configuration file; mapping ZooKeeper's ports through a Docker network is too much trouble, so the docker-compose.yaml above uses the host network, which makes clustering straightforward.

clientPort=2181
dataDir=/data
dataLogDir=/datalog
tickTime=2000
initLimit=5
syncLimit=2
autopurge.snapRetainCount=3
autopurge.purgeInterval=0
maxClientCnxns=60
server.1=centos1:2888:3888
server.2=centos2:2888:3888
server.3=centos3:2888:3888

Common operations

cd /opt/soft/zookeeper

Start

docker-compose up -d

Stop

docker-compose down

Verification

docker exec -it zk1.cloud zkServer.sh status

Install HBase

First, copy over the Hadoop configuration files:

cp /opt/soft/hadoop-2.9.2/etc/hadoop/hdfs-site.xml /opt/soft/hbase/hbase-1.3.1/conf
cp /opt/soft/hadoop-2.9.2/etc/hadoop/core-site.xml /opt/soft/hbase/hbase-1.3.1/conf

Edit hbase-env.sh

export JAVA_HOME=/opt/soft/jdk1.8.0_45
export HBASE_MANAGES_ZK=false

Edit the regionservers file

centos1
centos2
centos3

Edit hbase-site.xml

<configuration>
  <!-- HBase storage path on HDFS -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://centos1:9000/hbase</value>
  </property>
  <!-- Run HBase in fully distributed mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- ZooKeeper quorum addresses, comma-separated -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>centos1:2181,centos2:2181,centos3:2181</value>
  </property>
</configuration>

Distribute HBase

scp -r /opt/soft/hbase centos2:/opt/soft
scp -r /opt/soft/hbase centos3:/opt/soft

Start HBase

sh /opt/soft/hbase/hbase-1.3.1/bin/start-hbase.sh

Verification
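After start-hbase.sh, jps should show HMaster on centos1 and HRegionServer on each node listed in regionservers. An end-to-end check can be done from the HBase shell (a sketch; the table name 'smoke' is an arbitrary choice):

```shell
# Create, write, and scan a throwaway table to prove HBase is serving.
HBASE_HOME=/opt/soft/hbase/hbase-1.3.1
"$HBASE_HOME/bin/hbase" shell <<'EOF' || echo "hbase shell not available"
status
create 'smoke', 'cf'
put 'smoke', 'r1', 'cf:c1', 'v1'
scan 'smoke'
EOF
# The HMaster web UI is at http://centos1:16010 by default.
```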

Closing remarks: this article is a partial summary and review of the Hadoop module from the Lagou high-salary training camp. (PS: in this camp I not only learned things but also met some excellent people: teacher Mujin, who combines beauty and brains, and our class manager, gentle yet strict and responsible.)

And a little personal dating ad on the side: Lu Xueba, male, interested in women. (The remaining 10,000 words are omitted.)

