greenplum安装参考
OS:rhel7.5 x86_64
DB:GreenPlum db 6.x
下载源码包安装
gp download
安装前参考最佳实践对操作系统进行优化!!
greenplum or hgdw Best practices
1 pretasks
修改主机名
cat /etc/hosts127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
# gp configure
192.168.80.146 gp1
192.168.80.147 gp2
192.168.80.148 gp3
192.168.80.149 gp4
创建操作系统用户
使用root用户登录主节点主机系统,在主节点创建数据库用gpadmin户,切换到新用户运行运行数据仓库安装包。
groupadd -g 1001 gpadminuseradd -u 1001 -g gpadmin gpadmin; echo gpadmin | passwd -f --stdin gpadmin
2 install
安装数据仓库软件
gp4.x/gp5.x 以前安装分为三部分
1. 安装master(一般是个bin的可执行文件,安装,并可以指定安装目录)
2. gpseginstall 安装各个seg
3. gp群参数校验
4. gpinitsystem 集群初始化
gp6.2 开始不提供zip 格式压缩包,仅提供rpm包,安装分为两步
1. 安装master(rpm -ivh / yum install -y),不可以指定安装目录,默认安装到/usr/local/
2. gp6 没有 gpseginstall工具。所以要么自己打包master 安装好的gp目录并传到seg上,要么各个节点单独yum 安装。
1.每个节点主机,单独yum
2.打包主节点的安装目录,并分发给seg主机。
3. 集群性能校验
4. gpinitsystem 集群初始化
先只在主节点安装软件
rpm -ivh xxx.rpm
环境变量
添加下面初始化变量到gpadmin用户.bash_profile文件,使立即生效:source .bash_profile:
cat ~/.bash_profilesource /opt/gp/greenplum/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/hgdata/master/hgseg-1
export PGPORT=2345
export PGUSER=hgadmin
export PGDATABASE=gpdb
greenplum_path.sh — 该文件包含数据仓库的基础环境变量。默认自动生成
创建hostfile
创建几个host文件,用于后续使用gpssh,gpscp 等脚本host参数文件
hostfile_exkeys 内容是集群所有主机名或ip,用于ssh互信配置
hostfile_all 内容是集群所有主机名或ip,包含master,segment,standby等
hostfile_seg 内容是所有 segment主机名或ip
cat hostfile_exkeyshgdw1
hgdw2
hgdw3
hgdw4
hgdw5
hgdw6
cat hostfile_all
hgdw1
hgdw2
hgdw3
hgdw4
hgdw5
hgdw6
cat hostfile_seg
hgdw2
hgdw3
hgdw4
hgdw5
hgdw6
配置节点互信
配置互信
root用户和gpadin用户都需要做
1. 主机生成密钥
ssh-keygen -t rsa
2. 将公钥复制到各个节点机器的authorized_keys文件中
ssh-copy-id -i id_rsa.pub root@hgdw1
ssh-copy-id -i id_rsa.pub root@hgdw3
ssh-copy-id -i id_rsa.pub root@hgdw3
ssh-copy-id -i id_rsa.pub root@hgdw4
ssh-copy-id -i id_rsa.pub root@hgdw4
ssh-copy-id -i id_rsa.pub root@hgdw6
3. 使用gpssh-exkeys 工具,打通n-n的免密登陆
root和gpadmin用户都执行一遍
gpssh-exkeys -f /home/gpadmin/hostfile_exkeys
hgssh-exkeys -f /home/hgadmin/hostfile_exkeys
[STEP 1 of 5] create local ID and authorize on local host
... /root/.ssh/id_rsa file exists ... key generation skipped
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] retrieving credentials from remote hosts
... send to hgdw2
... send to hgdw3
... send to hgdw4
... send to hgdw5
... send to hgdw6
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with hgdw2
... finished key exchange with hgdw3
... finished key exchange with hgdw4
... finished key exchange with hgdw5
... finished key exchange with hgdw6
[INFO] completed successfully
验证
hgssh -f /home/hgadmin/hostfile_all -e "date"
[hgdw1] date
[hgdw1] Wed Apr 1 15:47:36 CST 2020
[hgdw3] date
[hgdw3] Wed Apr 1 15:47:36 CST 2020
[hgdw4] date
[hgdw4] Wed Apr 1 15:47:35 CST 2020
[hgdw6] date
[hgdw6] Wed Apr 1 15:47:35 CST 2020
[hgdw5] date
[hgdw5] Wed Apr 1 15:47:36 CST 2020
[hgdw2] date
[hgdw2] Wed Apr 1 15:47:36 CST 2020
同步个节点环境变量
gpscp -f /home/hgadmin/hostfile_seg /home/hgadmin/.bash_profile hgadmin@=:/home/hgadmin/.bash_profile
在所有计算节点上安装配置数据仓库
rootsource /usr/local/hgdw/hgdw_path.sh
hgssh -f /home/hgadmin/hostfile_seg -e "mkdir -p /opt/software"
hgssh -f /home/hgadmin/hostfile_seg -e "chmod 777 /opt/software"
-greenpllum
gpscp -f /home/hgadmin/hostfile_seg /opt/software/greenplum-db-6.5.0-rhel7-x86_64.rpm hgadmin@=:/opt/software/
su - root
source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f /home/hgadmin/hostfile_seg -e "yum install apr -y"
gpssh -f /home/hgadmin/hostfile_seg -e "yum install apr-util -y"
gpssh -f /home/hgadmin/hostfile_seg -e "rpm -ivh /opt/software/greenplum-db-6.5.0-rhel7-x86_64.rpm"
gpssh -f /home/hgadmin/hostfile_seg -e "chown hgadmin:hgadmin /usr/local/greenplum* -R"
-hgdw
gpscp -f /home/hgadmin/hostfile_seg /opt/software/hgdw-3.0.zip hgadmin@=:/opt/software/
hgssh -f /home/hgadmin/hostfile_seg -e "unzip -d /usr/local /opt/software/hgdw-3.0.zip"
hgssh -f /home/hgadmin/hostfile_seg -e "chown hgadmin:hgadmin /usr/local/hgdw -R"
创建数据存储区域
创建或者选择一个用于主节点的数据存储区域。该目录需要有足够的磁盘空间并确保归属于gpadmin用户和组。
hgadmin
创建master 数据目录
mkdir -p /data/hgdata/master
创建segment 数据目录
本次计划每个主机安装2个 segment
hgssh -f /home/hgadmin/hostfile_seg -e "mkdir -p /data/hgdata/hgsegp"
hgssh -f /home/hgadmin/hostfile_seg -e "mkdir -p /data/hgdata/hgsegm"
同步系统时钟
在计算节点主机上,NTP应该配置主节点主机作为主时间源,而备用主节点主机作为备选时源。在主节点和备用主节点上配置NTP到用户首选的时间服务器。
配置NTP
安装相应软件包:yum install ntp -y
检测是否安装成功:
rpm -qa | grep ntp
ntpdate-4.2.6p5-25.el7.centos.x86_64
ntp-4.2.6p5-25.el7.centos.x86_64
开机自启动
systemctl start ntpd.service
systemctl enable ntpd.service
systemctl enable ntpdate.service
查看状态
systemctl status ntpd.service
主节点配置
vi /etc/ntp.conf
server 127.127.1.0 prefer
127.127.1.0 stratum 8
$ ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
*LOCAL(0) .LOCL. 5 l 43 64 377 0.000 0.000 0.000
备用主节点
vi /etc/ntp.conf
server gp1 prefer
计算节点
vi /etc/ntp.conf
server gp1 prefer
测试
ntpq -p
检查系统环境
瀚高数据仓库提供命令用以检查系统的配置和性能,这些命令可以在瀚高数据仓库安装的$GPHOME/bin目录下找到。初始化瀚高数据仓库系统之前,应执行硬件性能检查。
瀚高数据仓库提供的gpcheckperf命令可用来在瀚高数据仓库集群主机上检查硬件和系统级gpcheckperf在指定的主机上启动一个会话并执行下面的性能测试:
检查网络性能
检查磁盘I/O性能
检查内存带宽
在使用gpcheckperf之前,必须已经在所有相关需要做性能测试的主机之间建立了互信。如果还没有做,可以使用gpssh-exkeys命令来建立互信。gpcheckperf会调用gpssh和gpscp,所以这些命令必须已经存在在$PATH中。
使用gpadmin用户1.检查网络性能
gpcheckperf -f /home/gpadmin/hostfile_gpcheckperfnet -r N -d /tmp > subnet1.out
/opt/gp/greenplum//bin/gpcheckperf -f /home/gpadmin/hostfile_gpcheckperfnet -r N -d /tmp
-------------------
-- NETPERF TEST
-------------------
====================
== RESULT
====================
Netperf bisection bandwidth test
gp1 -> gp2 = 81.110000
gp3 -> gp4 = 80.940000
gp2 -> gp1 = 87.330000
gp4 -> gp3 = 87.120000
Summary:
sum = 336.50 MB/sec
min = 80.94 MB/sec
max = 87.33 MB/sec
avg = 84.12 MB/sec
median = 87.12 MB/sec
2.检查磁盘I/O和内存带宽
使用gpadmin用户登录主节点主机。
创建一个名为hostfile_gpcheckperfio的文件,包含所有计算节点主机名,每个名称一行。不要包含主节点主机名。
gpcheckperf -f /home/gpadmin/hostfile_gpcheckperfio -r ds -d /data/segmentp -d /data/segmentm > io.out
[gpadmin@gp1 ~]$ cat io.out
/opt/gp/greenplum//bin/gpcheckperf -f /home/gpadmin/hostfile_gpcheckperfio -r ds -d /data/segmentp -d /data/segmentm
--------------------
-- DISK WRITE TEST
--------------------
--------------------
-- DISK READ TEST
--------------------
--------------------
-- STREAM TEST
--------------------
====================
== RESULT
====================
disk write avg time (sec): 18.36
disk write tot bytes: 6273368064
disk write tot bandwidth (MB/s): 334.08
disk write min bandwidth (MB/s): 140.84 [gp4]
disk write max bandwidth (MB/s): 193.24 [gp3]
disk read avg time (sec): 8.63
disk read tot bytes: 6273368064
disk read tot bandwidth (MB/s): 761.29
disk read min bandwidth (MB/s): 266.85 [gp3]
disk read max bandwidth (MB/s): 494.44 [gp4]
stream tot bandwidth (MB/s): 17422.21
stream min bandwidth (MB/s): 6680.49 [gp4]
stream max bandwidth (MB/s): 10741.71 [gp3]
初始化数据仓库
gpinitsystem,能够在主节点和每个计算节点实例上按照正确的顺序完成初始化和启动操作。
在数据仓库系统初始化并启动之后,就可以像使用常规的PostgreSQL数据库一样,通过连接到瀚高数据仓库主节点来创建和管理数据库。
创建数据仓库配置文件
数据仓库配置文件会告诉gpinitsystem命令按照什么样的方式配置系统。在软件的安装目录的$GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config文件可以作为配置的例子参考??
1. 以gpadmin用户登录:
$ su - gpadmin
2. 拷贝一个gpinitsystem_config示例文件。例如
$cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config
/home/gpadmin/gpinitsystem_config
3. 打开刚拷贝的文件并编辑。
DATA_DIRECTORY参数指定每个计算节点主机配置多少个计算节点实例。如果在host文件中为每个计算节点主机列出了多个网口,这些实例将平均分布到所有列出的网口上。
下面是一个gpinitsystem_config文件的例子:
ARRAY_NAME="HighGo DataWarehouse "
SEG_PREFIX=gpseg
PORT_BASE=40000
declare -a DATA_DIRECTORY=(/data1/primary /data1/primary
/data1/primary /data2/primary /data2/primary /data2/primary)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENT=8
ENCODING=UNICODE
4. 作为可选项,可以配置镜像计算节点实例,取消文件中的注释并根据环境情况配置参数。下面是gpinitsystem_config文件中可选配置镜像的例子:
MIRROR_PORT_BASE=50000
REPLICATION_PORT_BASE=41000
MIRROR_REPLICATION_PORT_BASE=51000
declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror
/data1/mirror /data2/mirror /data2/mirror /data2/mirror)
注意:可以在初始化时值配置主计算节点实例,而在之后使用gpaddmirrors命令部署镜像计算节点实例。
5. 保存关闭该文件。
su - gpadmincp /usr/local/hgdw/docs/cli_help/hgconfigs/hginitsystem_config /home/hgadmin/hginitsystem_config
cat hginitsystem_config
# FILE NAME: gpinitsystem_config
# Configuration file needed by the gpinitsystem
################################################
#### REQUIRED PARAMETERS
################################################
#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="HGDW"
#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg
#### Base number by which primary segment port numbers
#### are calculated.
PORT_BASE=6000
#### File system location(s) where primary segment data directories
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data/hgdata/hgsegp)
#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=hgdw1
#### File system location where the master data directory
#### will be created.
MASTER_DIRECTORY=/data/hgdata/master
#### Port number for the master instance.
MASTER_PORT=5432
#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh
#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=18
#### Default server-side character set encoding.
ENCODING=UNICODE
################################################
#### OPTIONAL MIRROR PARAMETERS
################################################
#### Base number by which mirror segment port numbers
#### are calculated.
MIRROR_PORT_BASE=7000
#### File system location(s) where mirror segment data directories
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the
#### DATA_DIRECTORY parameter.
declare -a MIRROR_DATA_DIRECTORY=(/data/hgdata/hgsegm)
################################################
#### OTHER OPTIONAL PARAMETERS
################################################
#### Create a database of this name after initialization.
DATABASE_NAME=hgdw
#### Specify the location of the host address file here instead of
#### with the the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
初始化数据库
运行gpinitsystem命令将根据指定的配置文件创建一个瀚高数据仓库系统。
运行初始化命令
hginitsystem -c /home/hgadmin/hginitsystem_config -h /home/hgadmin/hostfile_seg -b 10GB -m 2000
命令将自动检查安装信息,确认可以连接到每个主机,可以访问配置文件中指定的每个目录。如果检查都通过了,将会提示确认配置信息。例如:
=> Continue with Greenplum creation? Yy/Nn
输入y以开始初始化。
该命令将初始化主节点实例和系统中的每个计算节点实例。每个计算节点实例的安装是并行的。 计算节点实例的数量决定这个过程使用的时间。
在成功安装初始化结束时, 命令将会启动瀚高数据仓库系统,并可以看到如下提示信息:
=> Database successfully started
hginitsystem -c /home/hgadmin/hginitsystem_config -h /home/hgadmin/hostfile_seg -b 10GB -m 2000
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking configuration parameters, please wait...
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[INFO]:-Reading HGDW configuration file /home/hgadmin/hginitsystem_config
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[INFO]:-Locale has not been set in /home/hgadmin/hginitsystem_config, will set to default value
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[INFO]:-Locale set to en_US.utf8
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[FATAL]:-Expected file " /home/hgadmin/hostfile_seg" not found Script Exiting!
[hgadmin@hgdw1 master]$ hginitsystem -c /home/hgadmin/hginitsystem_config -h /home/hgadmin/hostfile_seg -b 10GB -m 2000
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking configuration parameters, please wait...
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Reading HGDW configuration file /home/hgadmin/hginitsystem_config
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Locale has not been set in /home/hgadmin/hginitsystem_config, will set to default value
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Locale set to en_US.utf8
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking configuration parameters, Completed
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Commencing multi-home checks, please wait...
.....
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Configuring build for standard array
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Commencing multi-home checks, Completed
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Building primary segment instance array, please wait...
.....
20200401:18:47:10:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Building group mirror array type , please wait...
.....
20200401:18:47:12:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking Master host
20200401:18:47:12:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking new segment hosts, please wait...
..........
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking new segment hosts, Completed
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW Database Creation Parameters
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:---------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master Configuration
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:---------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master instance name = HGDW
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master hostname = hgdw1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master port = 5432
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master instance dir = /data/hgdata/master/gpseg-1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master LOCALE = en_US.utf8
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW segment prefix = gpseg
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master Database = hgdw
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master connections = 2000
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master buffers = 10GB
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Segment connections = 6000
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Segment buffers = 10GB
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checkpoint segments = 18
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Encoding = UNICODE
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Postgres param file = Off
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Initdb to be used = /usr/local/hgdw/bin/initdb
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-GP_LIBRARY_PATH is = /usr/local/hgdw/lib
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HEAP_CHECKSUM is = on
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HBA_HOSTNAMES is = 0
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Ulimit check = Passed
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Array host connect type = Single hostname per node
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master IP address [1] = ::1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master IP address [2] = 172.17.105.139
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master IP address [3] = 192.168.122.1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master IP address [4] = fe80::f816:3eff:fe21:d1e6
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Standby Master = Not Configured
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Number of primary segments = 1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total Database segments = 5
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Trusted shell = ssh
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Number segment hosts = 5
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Mirror port base = 7000
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Number of mirror segments = 1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Mirroring config = ON
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Mirroring type = Group
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:----------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW Primary Segment Configuration
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:----------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw2 /data/hgdata/hgsegp/gpseg0 6000 2 0
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw3 /data/hgdata/hgsegp/gpseg1 6000 3 1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw4 /data/hgdata/hgsegp/gpseg2 6000 4 2
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw5 /data/hgdata/hgsegp/gpseg3 6000 5 3
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw6 /data/hgdata/hgsegp/gpseg4 6000 6 4
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:---------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW Mirror Segment Configuration
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:---------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw3 /data/hgdata/hgsegm/gpseg0 7000 7 0
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw4 /data/hgdata/hgsegm/gpseg1 7000 8 1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw5 /data/hgdata/hgsegm/gpseg2 7000 9 2
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw6 /data/hgdata/hgsegm/gpseg3 7000 10 3
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw2 /data/hgdata/hgsegm/gpseg4 7000 11 4
Continue with HGDW creation Yy|Nn (default=N):
> y
20200401:18:47:43:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Building the Master instance database, please wait...
20200401:18:50:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Starting the Master in admin mode
20200401:18:50:52:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Commencing parallel build of primary segment instances
20200401:18:50:52:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Spawning parallel processes batch [1], please wait...
.....
20200401:18:50:52:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
...........................................................................................................................................................................................
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Parallel process exit status
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as completed = 5
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as killed = 0
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as failed = 0
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Deleting distributed backout files
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Removing back out file
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-No errors generated from parallel processes
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Restarting the HGDW instance in production mode
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Starting hgstop with args: -a -l /home/hgadmin/gpAdminLogs -m -d /data/hgdata/master/gpseg-1
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Gathering information and validating the environment...
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Obtaining HGDW Master catalog information
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Obtaining Segment details from master...
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-HGDW Version: "postgres (HGDW Database) 3.0"
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Commencing Master instance shutdown with mode="smart"
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Master segment instance directory=/data/hgdata/master/gpseg-1
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Stopping master segment and waiting for user connections to finish ...
server shutting down
20200401:18:54:02:032368 hgstop:hgdw1:hgadmin-[INFO]:-Attempting forceful termination of any leftover master process
20200401:18:54:02:032368 hgstop:hgdw1:hgadmin-[INFO]:-Terminating processes for segment /data/hgdata/master/gpseg-1
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-Starting hgstart with args: -a -l /home/hgadmin/gpAdminLogs -d /data/hgdata/master/gpseg-1
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-Gathering information and validating the environment...
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-HGDW Binary Version: "postgres (HGDW Database) 3.0"
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-HGDW Catalog Version: "301908232"
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-Starting Master instance in admin mode
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Obtaining HGDW Master catalog information
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Obtaining Segment details from master...
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Setting new master era
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Master Started...
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Shutting down master
20200401:18:54:12:032394 hgstart:hgdw1:hgadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
........
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-Process results...
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-----------------------------------------------------
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:- Successful segment starts = 5
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:- Failed segment starts = 0
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-----------------------------------------------------
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-Successfully started 5 of 5 segment instances
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-----------------------------------------------------
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-Starting Master instance hgdw1 directory /data/hgdata/master/gpseg-1
20200401:18:54:29:032394 hgstart:hgdw1:hgadmin-[INFO]:-Command pg_ctl reports Master hgdw1 instance active
20200401:18:54:29:032394 hgstart:hgdw1:hgadmin-[INFO]:-Connecting to dbname="template1" connect_timeout=15
20200401:18:54:29:032394 hgstart:hgdw1:hgadmin-[INFO]:-No standby master configured. skipping...
20200401:18:54:29:032394 hgstart:hgdw1:hgadmin-[INFO]:-Database successfully started
20200401:18:54:29:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Completed restart of HGDW instance in production mode
20200401:18:54:30:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Commencing parallel build of mirror segment instances
20200401:18:54:30:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Spawning parallel processes batch [1], please wait...
.....
20200401:18:54:30:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
............
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Parallel process exit status
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as completed = 5
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as killed = 0
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as failed = 0
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Scanning utility log file for any warning messages
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[WARN]:-*******************************************************
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[WARN]:-were generated during the array creation
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Please review contents of log file
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-/home/hgadmin/gpAdminLogs/hginitsystem_20200401.log
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-To determine level of criticality
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-These messages could be from a previous run of the utility
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-that was called today!
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[WARN]:-*******************************************************
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW Database instance successfully created
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-------------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-To complete the environment configuration, please
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-update hgadmin .bashrc file with the following
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-1. Ensure that the hgdw_path.sh file is sourced
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/data/hgdata/master/gpseg-1"
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:- to access the HGDW scripts for this instance:
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:- or, use -d /data/hgdata/master/gpseg-1 option for the HGDW scripts
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:- Example gpstate -d /data/hgdata/master/gpseg-1
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Script log file = /home/hgadmin/gpAdminLogs/hginitsystem_20200401.log
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-To initialize a Standby Master Segment for this HGDW instance
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Review options for gpinitstandby
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-------------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-The Master /data/hgdata/master/gpseg-1/pg_hba.conf post gpinitsystem
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-has been configured to allow all hosts within this new
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-new array must be explicitly added to this file
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Refer to the HGDW Admin support guide which is
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-located in the /usr/local/hgdw/docs directory
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-------------------------------------------------------
[hgadmin@hgdw1 master]$
失败回退
安装中途失败,提示使用 bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_* 回退,执行该脚本即可
删除重建
安装完成,出于种种原因,若需要集群删除重装,使用 gpdeletesystem 工具使用命令
gpdeletesystem -d /data/hgdata/master/gpseg-1 -f
-d 后面跟 MASTER_DATA_DIRECTORY(master 的数据目录),会清除master,segment所有的数据目录。
-f force, 终止所有进程,强制删除。
环境变量确认
必须在瀚高数据仓库主节点(包括备用主节点)设置相应的环境变量。在$GPHOME目录下的greenplum_path.sh文件提供了诸多瀚高数据仓库相关的环境变量。可以在gpadmin用户启动脚本(比如.bashrc)中加载这个文件。
瀚高数据仓库的管理命令还需要设置MASTER_DATA_DIRECTORY环境变量。这个变量应该指向gpinitsystem命令初始化时指定的主节点数据目录位置。
vi ~/.bash_profilesource /opt/gp/greenplum/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=hgdw
#export LD_LIBRARY_PATH=$GPHOME/lib:$GPHOME/ext/python/lib
如果有备用主节点,应该将环境变量的文件拷贝到备用主节点。例如:
scp ~/.bash_profile gp2:/home/gpadmin
3 常用命令
查看状态gpstate
快速打开
gpstart
快速关闭
gpstop -M fast
不关闭数据库的情况下生效某些参数
stop -u 生效
4 poststep
优化数据库参数,参考相关文档
greenplum or hgdw Best practices(Turing)
5 其他配置
允许客户端连接
瀚高数据仓库第一次初始化之后,只允许gpadmin用户从本地进行连接(gpinitsystem时指定的系统用户)。如果希望其他用户或者客户端机器也能够访问瀚高数据仓库,必须设置数据库的访问权限。
hgdw=# show listen_addresses ; listen_addresses
------------------
*
(1 row)
listen_addresses默认就是*
修改pg_hba.conf文件
cat pg_hba.conf |grep -v "#"
host all gpadmin 192.168.80.147/32 trust
host replication gpadmin 192.168.80.147/32 trust
host all gpadmin 192.168.122.1/32 trust
host replication gpadmin 192.168.122.1/32 trust
local all gpadmin ident
host all gpadmin 127.0.0.1/28 trust
host all gpadmin 192.168.80.146/32 trust
host all gpadmin 192.168.122.1/32 trust
host all gpadmin 192.168.80.147/32 trust
host all gpadmin 192.168.122.1/32 trust
host all gpadmin ::1/128 trust
host all gpadmin fe80::a00:27ff:fe7a:276e/128 trust
local replication gpadmin ident
host replication gpadmin samenet trust
# gp configure
host all all 192.168.80.0/24 md5
创建数据库并加载数据
验证安装成功之后,用户就可以创建数据库并加载数据了。关于创建数据库、模式、表或其他数据库对象的更多信息,请查看瀚高数据仓库管理员手册。
test_product.csv
0001,T 恤 , 衣服 ,1000,500,2017-09-20
0002, 打孔器 , 办公用品 ,500,320,2017-09-11
0003, 运动 T 恤 , 衣服 ,4000,2800,
0004, 菜刀 , 厨房用具 ,3000,2800,2017-09-20
0005, 高压锅 , 厨房用具 ,6800,5000,2017-01-15
0006, 叉子 , 厨房用具 ,500,,2017-09-20
0007, 切菜板 , 厨房用具 ,880,790,2017-04-28
0008, 圆珠笔 , 办公用品 ,100,,2017-11-11
nohup gpfdist -d /home/gpadmin/exttable -p 8888 -l /home/gpadmin/exttable/gpfdist.log 2>&1 &
create external table test_ext(
p_id int,
p_name text,
p_type text,
sale_price int,
pruchase_price int,
regist_date date
)
location ("gpfdist://192.168.80.146:8888/test_product.csv")
format "CSV" (delimiter as "," null as "")
;
hgdw=# select * from test_ext;
p_id | p_name | p_type | sale_price | pruchase_price | regist_date
------+-----------+-----------+------------+----------------+-------------
1 | T恤 | 衣服 | 1000 | 500 | 2017-09-20
2 | 打孔器 | 办公用品 | 500 | 320 | 2017-09-11
3 | 运动T恤 | 衣服 | 4000 | 2800 |
4 | 菜刀 | 厨房用具 | 3000 | 2800 | 2017-09-20
5 | 高压锅 | 厨房用具 | 6800 | 5000 | 2017-01-15
6 | 叉子 | 厨房用具 | 500 | | 2017-09-20
7 | 切菜板 | 厨房用具 | 880 | 790 | 2017-04-28
8 | 圆珠笔 | 办公用品 | 100 | | 2017-11-11
(8 rows)
create table test_product as select * from test_ext;
Created By Chris on 20200327
以上是 greenplum安装参考 的全部内容, 来源链接: utcz.com/z/533626.html