Environment requirements

  • CentOS 7
  • MySQL 5.7
  • Apache httpd
  • JDK 1.8 (use a recent update release; older builds have caused bugs)
  • mysql-connector-java 5.1.26 or later

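System requirements

/usr: at least 5 GB

/var: at least 5 GB. In practice 5 GB is not enough unless the monitor log directory is moved elsewhere; 50 GB is recommended.

/opt: about 20 GB

CDH-DB: the database service needs at least 5 GB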
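
The partition sizes above can be checked before installing. The following is a minimal sketch, assuming a POSIX `df`; the function name `check_free_gb` is made up for this example and is not part of any Cloudera tooling.

```shell
# check_free_gb PATH MIN_GB
# Succeeds when the filesystem holding PATH has at least MIN_GB gigabytes free.
check_free_gb() {
    path=$1
    min_gb=$2
    # df -P prints one POSIX-format line per filesystem; column 4 is the free space in KB.
    free_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    free_kb=${free_kb:-0}
    free_gb=$((free_kb / 1024 / 1024))
    if [ "$free_gb" -ge "$min_gb" ]; then
        echo "OK   $path: ${free_gb} GB free (need ${min_gb} GB)"
    else
        echo "LOW  $path: ${free_gb} GB free (need ${min_gb} GB)"
        return 1
    fi
}
```

For the sizes recommended here, it could be called as `check_free_gb /usr 5`, `check_free_gb /var 50` and `check_free_gb /opt 20` on each node.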

Database requirements

The database must be MySQL 5.7, the character set must be UTF-8 with the utf8_general_ci collation, and the MySQL-shared-compat or MySQL-shared package must be installed.

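Java requirements

At the moment only JDK 1.8 appears to be supported; choose a build with a high update number. Avoid JDK 8u271, 8u281 and 8u291 (affected by JDK-8245417 and JDK-8256818), as well as 8u40, 8u45 and 8u60.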
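
A quick way to compare an installed JDK against the update numbers listed above is sketched below. The function name `jdk_update_blocked` is an assumption for illustration, and the check treats every build of a listed update as affected, which may be stricter than necessary.

```shell
# jdk_update_blocked VERSION
# VERSION is the string printed by `java -version`, for example 1.8.0_212.
# Succeeds (exit 0) when the update number is one of those to avoid.
jdk_update_blocked() {
    update=${1##*_}       # 1.8.0_271 -> 271
    update=${update%%-*}  # drop a build suffix such as -b10
    case "$update" in
        40|45|60|271|281|291) return 0 ;;
        *) return 1 ;;
    esac
}
```

The version string can be taken from `java -version 2>&1 | awk -F'"' '/version/ {print $2}'` and passed to the function.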

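System configuration

  • Stop the firewall
  • Prevent the firewall from starting at boot
  • Set the hostnames and configure SSH access between the nodes
  • Configure NTP so that all machines keep the same time
  • Have a yum repository available (a mounted DVD image or a repository you serve yourself)
  • Disable swap space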

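Packages to download

Cloudera has closed the free download channel for CDH, so these are packages that were downloaded in advance:

Leave a comment if you need these packages.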

CDH Manager

cloudera-manager-server-db-2-6.3.1-1466458.el7.x86_64.rpm
cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
enterprise-debuginfo-6.3.1-1466458.el7.x86_64.rpm
oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
allkeys.asc

CDH parcel

manifest.json
CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha256
CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1
CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel

The CDH packages can go in any directory, for example /data/cloudera-repos.

JDK 8u212

- jdk-8u212-linux-x64.tar.gz
+ jdk-8u301-linux-x64.rpm

MySQL

mysql-5.7.34-1.el7.x86_64.rpm-bundle.tar

MySQL :: Download MySQL Community Server (Archived Versions)

mysql-connector-java

mysql-connector-java-5.1.47.tar.gz

Pre-installation steps

Disable the firewall

systemctl disable firewalld
systemctl stop firewalld

Disable SELinux

vim /etc/sysconfig/selinux

- SELINUX=enforcing
+ SELINUX=disabled

Reboot the system after making the change.

Set the hostname (run the matching command on each node)

hostnamectl set-hostname master
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2
hostnamectl set-hostname slave3
hostnamectl set-hostname slave4

Configure hosts

vim /etc/hosts

10.10.10.10 master
10.10.10.11 slave1

Synchronize the time

Run this once on every server; after the clocks have been corrected, go on to configure the NTP service.

ntpdate -u master

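Set swappiness

Cloudera officially recommends setting vm.swappiness to the minimum value of 1. I usually turn the swap partition off entirely, but here it is better to follow the official recommendation.

vim /etc/sysctl.conf

vm.swappiness=1

This takes effect after a reboot (or immediately with sysctl -p). The value can also be changed temporarily:

sudo sysctl -w vm.swappiness=1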
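
Editing /etc/sysctl.conf by hand on every node is easy to get wrong, so the change can also be scripted. This is a minimal sketch, assuming GNU sed (as on CentOS 7); `set_conf_key` is a hypothetical helper name, and running it twice leaves a single entry.

```shell
# set_conf_key FILE KEY VALUE
# Replaces an existing "KEY = ..." line in FILE, or appends one if the key is absent.
set_conf_key() {
    file=$1
    key=$2
    value=$3
    # Escape dots so the key is matched literally by grep and sed.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^[[:space:]]*${pattern}[[:space:]]*=" "$file" 2>/dev/null; then
        sed -i "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key}=${value}|" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}
```

On a node it would be used as `set_conf_key /etc/sysctl.conf vm.swappiness 1`, followed by `sysctl -p` to load the file.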

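Install the JDK and configure JAVA_HOME

mkdir -p /usr/local/java
tar -zxvf jdk-8u212-linux-x64.tar.gz -C /usr/local/java/

# If OpenJDK is already installed on this machine
alternatives --install /usr/bin/java java /usr/local/java/jdk1.8.0_212/bin/java 3
alternatives --install /usr/bin/javac javac /usr/local/java/jdk1.8.0_212/bin/javac 3
# Choose the entry number that was just added
alternatives --config java

I installed in three environments in total, and one of them hit a JDK error. If you see this error, download the latest official rpm package and use alternatives to switch to that JDK.

To be safe, installing from the rpm is recommended.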

Install MySQL

Extract the bundle

tar -xvf mysql-5.7.34-1.el7.x86_64.rpm-bundle.tar

Install the RPMs

rpm -ivh mysql-community-libs-5.7.34-1.el7.x86_64.rpm --nodeps --force
rpm -ivh mysql-community-devel-5.7.34-1.el7.x86_64.rpm --nodeps --force
rpm -ivh mysql-community-client-5.7.34-1.el7.x86_64.rpm --nodeps --force
rpm -ivh mysql-community-server-5.7.34-1.el7.x86_64.rpm --nodeps --force

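Start the MySQL service

# Initialize MySQL
mysqld --initialize --user=mysql
# Set ownership of the data directory
chown mysql:mysql -R /var/lib/mysql
# Start MySQL
service mysqld restart

If the service fails to start here, the cause is usually file permissions; double-check the chown step above.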

Look up the default root password

grep 'temporary password' /var/log/mysqld.log

Change the root password

mysql -uroot -p # enter the temporary password printed above
# Set a new password
SET PASSWORD = PASSWORD('123456');
Query OK, 0 rows affected, 1 warning (0.00 sec)
flush privileges;

# Allow root to connect from any host
use mysql;
update user set host='%' where user='root';
flush privileges;

Enable start at boot

systemctl enable mysqld.service

Cloudera's recommended MySQL configuration (my.cnf)

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
skip-ssl
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES

System tuning

Disable transparent huge pages

Temporary (until the next reboot)

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Permanent

echo 'echo never > /sys/kernel/mm/transparent_hugepage/defrag' >> /etc/rc.d/rc.local
echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.d/rc.local
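
To confirm that the setting took effect, the active value can be read back from the same files; the kernel marks it with square brackets, for example `always madvise [never]`. The helper below is a small sketch with a made-up name, `thp_setting`. Note also that on CentOS 7 /etc/rc.d/rc.local only runs at boot if it is executable (`chmod +x /etc/rc.d/rc.local`).

```shell
# thp_setting FILE
# Prints the active value (the one in square brackets) from a THP control file,
# whose content looks like: always madvise [never]
thp_setting() {
    sed -n 's/.*\[\([a-z+]*\)\].*/\1/p' "$1"
}
```

Calling `thp_setting /sys/kernel/mm/transparent_hugepage/enabled` and the same for `defrag` should print `never` on a correctly configured host.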

Raise the file handle limit

For the current session

ulimit -n 65535

Default after a reboot

vim /etc/security/limits.conf

* soft nofile 65536
* hard nofile 131072
# Number of processes
* soft nproc 65535
* hard nproc 65535

Kernel parameters (in /etc/sysctl.conf)

fs.file-max=1000000
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_syn_backlog = 16384
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_syncookies = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65000
net.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
vm.swappiness=1
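
Because vm.swappiness is set both in the swappiness step and at the end of this list, pasting the whole list into /etc/sysctl.conf can leave the same key assigned twice. The sketch below reports such duplicates; `duplicate_conf_keys` is a hypothetical helper name, not a standard tool.

```shell
# duplicate_conf_keys FILE
# Prints every key that is assigned more than once in a key=value file,
# ignoring comments and blank lines.
duplicate_conf_keys() {
    awk -F= '
        /^[ \t]*(#|$)/ { next }               # skip comments and blank lines
        {
            key = $1
            gsub(/[ \t]/, "", key)            # strip spaces around the key
            if (++seen[key] == 2) print key   # report each duplicate once
        }
    ' "$1"
}
```

Running `duplicate_conf_keys /etc/sysctl.conf` before `sysctl -p` shows which entries to merge.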

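Install httpd

Because the environment has no internet access, a yum repository for CDH has to be set up locally.

yum -y install httpd createrepo

Start the httpd service and enable it at boot

systemctl start httpd
systemctl enable httpd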

Generate the RPM metadata

cd /data/cloudera-repos/
ll
-rw-r--r-- 1 root root 10483568 Jul 21 11:51 cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 root root 1203832464 Jul 21 11:51 cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 root root 11488 Jul 21 11:51 cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 root root 10996 Jul 21 11:51 cloudera-manager-server-db-2-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 root root 14209868 Jul 21 11:51 enterprise-debuginfo-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 root root 184988341 Jul 21 11:51 oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
# Generate the metadata
createrepo .
# Set permissions
chmod 777 -R /data/cloudera-repos

Then move the cloudera-repos directory into httpd's html directory:

mv /data/cloudera-repos /var/www/html/

Opening http://10.10.100.197/cloudera-repos/ in a browser shows the following:

[PARENTDIR] Parent Directory -
[ ] cloudera-manager-age..> 2021-07-21 11:52 10M
[ ] cloudera-manager-dae..> 2021-07-21 11:52 1.1G
[ ] cloudera-manager-ser..> 2021-07-21 11:52 11K
[ ] cloudera-manager-ser..> 2021-07-21 11:52 11K
[ ] enterprise-debuginfo..> 2021-07-21 11:52 14M
[ ] oracle-j2sdk1.8-1.8...> 2021-07-21 11:52 176M
[DIR] repodata/ 2021-07-21 11:52 -

Next, create the CDH 6 repo file (this is needed on every node):

cd /etc/yum.repos.d
vim cloudera-manager.repo

with the following content:

[cloudera-manager]
name=Cloudera Manager 6.3.1
baseurl=http://10.10.100.197/cloudera-repos/
gpgcheck=0
enabled=1
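
Since the same repo file is needed on every node, it can be generated rather than typed. This is only a sketch; `render_cm_repo` is a made-up function name and the base URL is passed in as an argument.

```shell
# render_cm_repo BASE_URL
# Prints a yum repo definition for the Cloudera Manager packages served at BASE_URL.
render_cm_repo() {
    cat <<EOF
[cloudera-manager]
name=Cloudera Manager
baseurl=$1
gpgcheck=0
enabled=1
EOF
}
```

For example, `render_cm_repo http://10.10.100.197/cloudera-repos/ > /etc/yum.repos.d/cloudera-manager.repo` writes the file locally, and the result can then be copied to the other nodes with scp.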

Save and exit, then run yum clean all && yum makecache:

(3/21): cloudera-manager/filelists_db | 118 kB 00:00:00
(4/21): cloudera-manager/primary_db | 8.7 kB 00:00:00
(5/21): cloudera-manager/other_db | 1.0 kB 00:00:00

When no yum repository is available

Download the CentOS DVD image and mount it on the server.

mount -o loop /data/repos/CentOS-7-x86_64-DVD-1810.iso /mnt

For local use only

cd /etc/yum.repos.d/
vim centos7.repo
# Enter the following content
[centos7-iso]
name=centos7-iso
baseurl=file:///mnt/
enabled=1
gpgcheck=0

A yum repository for the local network

cd /mnt
python -m SimpleHTTPServer <port>

Alternatively, install the httpd service and mount the data under /var/www/html.

Then create the repo file:

[Local]
name=Local Repository
baseurl=http://10.10.100.197
gpgcheck=0
enabled=1

If it is not in the repository

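Install CDH Manager

Install Cloudera Manager Server on the CM Server host

# Install Cloudera Manager (server node only)
yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server

This pulls in some dependencies automatically; if no yum repository is configured, it may fail.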

[root@master data]# yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirrors.tuna.tsinghua.edu.cn
* extras: mirrors.tuna.tsinghua.edu.cn
* updates: mirrors.bfsu.edu.cn
Resolving Dependencies
--> Running transaction check
---> Package cloudera-manager-agent.x86_64 0:6.3.1-1466458.el7 will be installed
--> Processing Dependency: python-psycopg2 for package: cloudera-manager-agent-6.3.1-1466458.el7.x86_64
--> Processing Dependency: mod_ssl for package: cloudera-manager-agent-6.3.1-1466458.el7.x86_64
--> Processing Dependency: MySQL-python for package: cloudera-manager-agent-6.3.1-1466458.el7.x86_64
--> Processing Dependency: /lib/lsb/init-functions for package: cloudera-manager-agent-6.3.1-1466458.el7.x86_64
--> Processing Dependency: libpq.so.5()(64bit) for package: cloudera-manager-agent-6.3.1-1466458.el7.x86_64
---> Package cloudera-manager-daemons.x86_64 0:6.3.1-1466458.el7 will be installed
---> Package cloudera-manager-server.x86_64 0:6.3.1-1466458.el7 will be installed
--> Running transaction check
---> Package MySQL-python.x86_64 0:1.2.5-1.el7 will be installed
---> Package mod_ssl.x86_64 1:2.4.6-97.el7.centos will be installed
---> Package postgresql-libs.x86_64 0:9.2.24-7.el7_9 will be installed
---> Package python-psycopg2.x86_64 0:2.5.1-4.el7 will be installed
---> Package redhat-lsb-core.x86_64 0:4.1-27.el7.centos.1 will be installed
--> Processing Dependency: redhat-lsb-submod-security(x86-64) = 4.1-27.el7.centos.1 for package: redhat-lsb-core-4.1-27.el7.centos.1.x86_64
--> Processing Dependency: spax for package: redhat-lsb-core-4.1-27.el7.centos.1.x86_64
--> Running transaction check
---> Package redhat-lsb-submod-security.x86_64 0:4.1-27.el7.centos.1 will be installed
---> Package spax.x86_64 0:1.5.2-13.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

============================================================================================================================================================================================================
Package Arch Version Repository Size
============================================================================================================================================================================================================
Installing:
cloudera-manager-agent x86_64 6.3.1-1466458.el7 cloudera-manager 10 M
cloudera-manager-daemons x86_64 6.3.1-1466458.el7 cloudera-manager 1.1 G
cloudera-manager-server x86_64 6.3.1-1466458.el7 cloudera-manager 11 k
Installing for dependencies:
MySQL-python x86_64 1.2.5-1.el7 base 90 k
mod_ssl x86_64 1:2.4.6-97.el7.centos updates 114 k
postgresql-libs x86_64 9.2.24-7.el7_9 updates 235 k
python-psycopg2 x86_64 2.5.1-4.el7 base 132 k
redhat-lsb-core x86_64 4.1-27.el7.centos.1 base 38 k
redhat-lsb-submod-security x86_64 4.1-27.el7.centos.1 base 15 k
spax x86_64 1.5.2-13.el7 base 260 k

Transaction Summary
============================================================================================================================================================================================================
Install 3 Packages (+7 Dependent packages)

Total download size: 1.1 G
Installed size: 1.4 G
Is this ok [y/d/N]: y
Downloading packages:
(1/10): cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm | 10 MB 00:00:00
(2/10): cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm | 11 kB 00:00:00
(3/10): MySQL-python-1.2.5-1.el7.x86_64.rpm | 90 kB 00:00:00
(4/10): redhat-lsb-core-4.1-27.el7.centos.1.x86_64.rpm | 38 kB 00:00:00
(5/10): redhat-lsb-submod-security-4.1-27.el7.centos.1.x86_64.rpm | 15 kB 00:00:00
(6/10): python-psycopg2-2.5.1-4.el7.x86_64.rpm | 132 kB 00:00:00
(7/10): spax-1.5.2-13.el7.x86_64.rpm | 260 kB 00:00:00
(8/10): cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm | 1.1 GB 00:00:18
(9/10): mod_ssl-2.4.6-97.el7.centos.x86_64.rpm | 114 kB 00:00:17
(10/10): postgresql-libs-9.2.24-7.el7_9.x86_64.rpm | 235 kB 00:00:17
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 64 MB/s | 1.1 GB 00:00:18
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : postgresql-libs-9.2.24-7.el7_9.x86_64 1/10
Installing : cloudera-manager-daemons-6.3.1-1466458.el7.x86_64 2/10
Installing : python-psycopg2-2.5.1-4.el7.x86_64 3/10
Installing : MySQL-python-1.2.5-1.el7.x86_64 4/10
Installing : spax-1.5.2-13.el7.x86_64 5/10
Installing : 1:mod_ssl-2.4.6-97.el7.centos.x86_64 6/10
Installing : redhat-lsb-submod-security-4.1-27.el7.centos.1.x86_64 7/10
Installing : redhat-lsb-core-4.1-27.el7.centos.1.x86_64 8/10
Installing : cloudera-manager-agent-6.3.1-1466458.el7.x86_64 9/10
Created symlink from /etc/systemd/system/multi-user.target.wants/cloudera-scm-agent.service to /usr/lib/systemd/system/cloudera-scm-agent.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/supervisord.service to /usr/lib/systemd/system/supervisord.service.
Installing : cloudera-manager-server-6.3.1-1466458.el7.x86_64 10/10
Created symlink from /etc/systemd/system/multi-user.target.wants/cloudera-scm-server.service to /usr/lib/systemd/system/cloudera-scm-server.service.
Verifying : redhat-lsb-submod-security-4.1-27.el7.centos.1.x86_64 1/10
Verifying : cloudera-manager-daemons-6.3.1-1466458.el7.x86_64 2/10
Verifying : 1:mod_ssl-2.4.6-97.el7.centos.x86_64 3/10
Verifying : python-psycopg2-2.5.1-4.el7.x86_64 4/10
Verifying : cloudera-manager-server-6.3.1-1466458.el7.x86_64 5/10
Verifying : cloudera-manager-agent-6.3.1-1466458.el7.x86_64 6/10
Verifying : spax-1.5.2-13.el7.x86_64 7/10
Verifying : redhat-lsb-core-4.1-27.el7.centos.1.x86_64 8/10
Verifying : MySQL-python-1.2.5-1.el7.x86_64 9/10
Verifying : postgresql-libs-9.2.24-7.el7_9.x86_64 10/10

Installed:
cloudera-manager-agent.x86_64 0:6.3.1-1466458.el7 cloudera-manager-daemons.x86_64 0:6.3.1-1466458.el7 cloudera-manager-server.x86_64 0:6.3.1-1466458.el7

Dependency Installed:
MySQL-python.x86_64 0:1.2.5-1.el7 mod_ssl.x86_64 1:2.4.6-97.el7.centos postgresql-libs.x86_64 0:9.2.24-7.el7_9 python-psycopg2.x86_64 0:2.5.1-4.el7
redhat-lsb-core.x86_64 0:4.1-27.el7.centos.1 redhat-lsb-submod-security.x86_64 0:4.1-27.el7.centos.1 spax.x86_64 0:1.5.2-13.el7

Complete!

Install the agent on the other nodes

yum install cloudera-manager-daemons cloudera-manager-agent

After the installation, edit /etc/cloudera-scm-agent/config.ini

vim /etc/cloudera-scm-agent/config.ini
- server_host=localhost
+ server_host=10.10.100.197

Point every agent at the address of the cloudera-manager-server host.
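
With several agent nodes the same edit can be applied non-interactively. The following sketch assumes GNU sed and that config.ini contains a line starting with `server_host=`, as shown above; `set_server_host` is a hypothetical helper name.

```shell
# set_server_host FILE HOST
# Rewrites the server_host line of an agent config.ini so it points at HOST.
set_server_host() {
    sed -i "s/^server_host=.*/server_host=$2/" "$1"
}
```

On each agent node this would be run as `set_server_host /etc/cloudera-scm-agent/config.ini 10.10.100.197`, or the underlying sed command can be sent over ssh in a loop across the slave hostnames.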

Configure the local Parcel repository

Once Cloudera Manager Server is installed, change to the local Parcel repository directory:

cd /opt/cloudera/parcel-repo

Copy the downloaded CDH parcel files into this directory, then rename the .sha1 file to .sha:

cp ~/cdh/cdh-m/* .
mv CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha

Then change the owner of the files:

chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
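
A corrupted or partially copied parcel typically only shows up later in the web wizard, so it can be worth comparing the parcel against its .sha file first. This is a minimal sketch that assumes the .sha file holds the SHA-1 hash as its first field (it was renamed from .sha1 above); `parcel_sha_matches` is a made-up name.

```shell
# parcel_sha_matches PARCEL
# Succeeds when the SHA-1 of PARCEL equals the hash stored in PARCEL.sha.
parcel_sha_matches() {
    expected=$(awk '{ print $1; exit }' "$1.sha")
    actual=$(sha1sum "$1" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}
```

Usage: `parcel_sha_matches /opt/cloudera/parcel-repo/CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel && echo "parcel OK"`.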

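Configure the JDBC driver

It is best to do this on every server. The driver can go in /usr/share/java or directly into CDH's lib directory.

Extract mysql-connector-java-5.1.47.tar.gz and copy mysql-connector-java-5.1.47-bin.jar to the /usr/share/java directory.

# Create the directory if it does not exist
mkdir -p /usr/share/java
cp mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar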

Create the databases CDH needs

Service                              Database    User
Cloudera Manager Server              scm         scm
Activity Monitor                     amon        amon
Reports Manager                      rman        rman
Hue                                  hue         hue
Hive Metastore Server                metastore   hive
Sentry Server                        sentry      sentry
Cloudera Navigator Audit Server      nav         nav
Cloudera Navigator Metadata Server   navms       navms
Oozie                                oozie       oozie

The creation script is as follows:

# scm
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';

# amon
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';

# rman
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';

# hue
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';

# hive
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';

# sentry
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';

# nav
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';

# navms
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';

# oozie
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';

# flush
FLUSH PRIVILEGES;
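
The statements above follow one pattern per database, so they can also be generated from the table of database and user names. The sketch below prints the same SQL; `gen_cdh_sql` is a hypothetical function name, and as in the script above each password is simply the user name.

```shell
# gen_cdh_sql
# Prints CREATE DATABASE / GRANT statements for each "database:user" pair.
# As in the script above, the password is set to the user name.
gen_cdh_sql() {
    for pair in scm:scm amon:amon rman:rman hue:hue metastore:hive \
                sentry:sentry nav:nav navms:navms oozie:oozie; do
        db=${pair%%:*}
        user=${pair##*:}
        echo "CREATE DATABASE $db DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;"
        echo "GRANT ALL ON $db.* TO '$user'@'%' IDENTIFIED BY '$user';"
    done
    echo "FLUSH PRIVILEGES;"
}
```

The output can be reviewed first and then piped into the client, for example `gen_cdh_sql | mysql -uroot -p`.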

Set up the Cloudera Manager database

Cloudera Manager Server ships with a script that configures its database.

  • MySQL and the CM Server are on the same host
    Run: /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
  • MySQL and the CM Server are on different hosts
    Run: /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h <mysql-host-ip> --scm-host <cm-server-ip> scm scm

[root@master parcel-repo]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h 10.10.100.197 --scm-host 10.10.100.197 scm scm
Enter SCM password:
JAVA_HOME=/usr/local/java/jdk1.8.0_212
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/local/java/jdk1.8.0_212/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
Wed Jul 21 13:46:01 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!

Install the CDH nodes (agents)

Start the Cloudera Manager Server service

systemctl start cloudera-scm-server

Then wait for Cloudera Manager Server to start, which can take a while. The startup can be followed with tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log.

Once the line INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server. appears in the log, the service has started and the Cloudera Manager web UI can be opened in a browser.

2021-07-21 13:48:55,430 INFO WebServerImpl:org.eclipse.jetty.server.Server: jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: c4550056e785fb5665914545889f21dc136ad9e6; jvm 1.8.0_222-b10
2021-07-21 13:48:55,446 INFO WebServerImpl:org.eclipse.jetty.server.AbstractConnector: Started ServerConnector@64610985{HTTP/1.1,[http/1.1]}{0.0.0.0:7180}
2021-07-21 13:48:55,447 INFO WebServerImpl:org.eclipse.jetty.server.Server: Started @60970ms
2021-07-21 13:48:55,447 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
2021-07-21 13:48:55,642 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
java.util.concurrent.ExecutionException: java.net.ConnectException: connection timed out: archive.cloudera.com/151.101.72.167:443
at com.ning.http.client.providers.netty.future.NettyResponseFuture.abort(NettyResponseFuture.java:231)
at com.ning.http.client.providers.netty.request.NettyConnectListener.onFutureFailure(NettyConnectListener.java:137)
at com.ning.http.client.providers.netty.request.NettyConnectListener.operationComplete(NettyConnectListener.java:145)
at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:409)
at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:400)
at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:362)
at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:142)
at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: connection timed out: archive.cloudera.com/151.101.72.167:443
at com.ning.http.client.providers.netty.request.NettyConnectListener.onFutureFailure(NettyConnectListener.java:133)
... 13 more
Caused by: org.jboss.netty.channel.ConnectTimeoutException: connection timed out: archive.cloudera.com/151.101.72.167:443
at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:139)
... 8 more
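
Instead of watching the log by eye, a script can wait for the "Started Jetty server." line before moving on. This is a minimal polling sketch; `wait_for_log` is a made-up helper, and it will also match a line left over from an earlier start if the log has not been rotated.

```shell
# wait_for_log FILE PATTERN TIMEOUT_SECONDS
# Polls FILE once per second until a line contains PATTERN (fixed string);
# fails if TIMEOUT_SECONDS pass first.
wait_for_log() {
    waited=0
    until grep -qF "$2" "$1" 2>/dev/null; do
        [ "$waited" -ge "$3" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}
```

Usage: `wait_for_log /var/log/cloudera-scm-server/cloudera-scm-server.log "Started Jetty server." 300 && echo "CM is up"`.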

Start cloudera-scm-agent on all nodes

systemctl start cloudera-scm-agent
# or
service cloudera-scm-agent start

Then open http://10.10.100.197:7180/cmf/login; the default username and password are both admin.

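Welcome screen

[Image: cdh-manager-welcome]

Click the [Continue] button at the bottom right of the page to go to the next step.

Accept the license terms

[Image: cdh-manager-accept-license]

Tick the license checkbox and click Continue.

Select the edition (pricing)

[Image: cdh-manager-select-edition]

Either the free edition or the trial works. If you do not plan to pay later, choose the free edition; otherwise you will keep getting reminders that the trial has expired.

Cluster installation welcome screen

[Image: cdh-manager-cluster-welcome]

Click Continue.

Enter the cluster name

[Image: cdh-manager-cluster-basics]

Select hosts

This step searches for and selects the hosts on which the CDH cluster will be installed. Enter the hostname of each node in the input box next to the hostname label, separated by commas, click Search, and tick the nodes to install CDH on in the result list:

[Image: cdh-manager-specify-hosts]

If the agents are already installed and running on the nodes, select them on the Currently Managed Hosts tab and continue.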

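Specify the repository

[Image: cdh-manager-repo-other-software]

Choose Custom Repository and enter the URL of the Cloudera Manager yum repository set up with httpd earlier: http://10.10.100.197/cloudera-repos/

CDH and other software

If the earlier "Configure the local Parcel repository" step was done correctly, [Use Parcels] is selected automatically and the CDH version is loaded. Check it and click [Continue].

With an automatic installation, host selection can fail at this step because /etc/cloudera-scm-agent/config.ini has not been modified. This tutorial installs the agents manually, so the problem does not occur.

JDK

[Image: cdh-manager-install-jdk]

The JDK is already installed, so just click Continue at this step.

SSH login configuration

[Image: cdh-manager-ssh]

All machines need to have the same root password here. Set [Number of Simultaneous Installations] to a value that suits the cluster.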

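Install Agents

At this step the agents are installed on the nodes automatically; it finishes after a short wait:

[Image: cdh-manager-install-agents]

A progress bar appears:

[Image: cdh-manager-agent-installing]

With a manual installation this step can be ignored.

Problems that may occur at this step:

host command not found

yum install bind-utils -y

If reverse DNS resolution errors appear

mv /usr/bin/host /usr/bin/host.bak

The installation hangs while acquiring the installation lock

rm -f /tmp/.scm_prepare_node.lock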

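Install Parcels

This step is also automatic. The speed of the distribution phase depends mainly on the network, so just wait:

[Image: cdh-manager-install-parcels]

[Image: cdn-manager-parcels-installing]

Host inspection

[Image: cdh-manager-inspect-cluster]

A problem you may run into:

Transparent Huge Page Compaction is enabled and can cause significant performance problems. Run "echo never > /sys/kernel/mm/transparent_hugepage/defrag" and "echo never > /sys/kernel/mm/transparent_hugepage/enabled" to disable this, and then add the same command to an init script such as /etc/rc.local so it will be set on system reboot. The following hosts are affected:

Solution:

echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled

Click Continue to finish installing and deploying CDH Manager. A wizard for installing services appears next; installing only ZooKeeper is enough here, or the step can be skipped.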

[Image: cdh-manager-run-the-first-time]

Final result

[Image: cdh-manager-install-success]

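Enable login for the service users

Working as root by default leads to permission errors. In that case the hdfs and hive users need to be given a login shell.

vim /etc/passwd

- hdfs:x:517:516:Hadoop HDFS:/var/lib/hadoop-hdfs:/sbin/nologin
+ hdfs:x:517:516:Hadoop HDFS:/var/lib/hadoop-hdfs:/bin/bash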
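
Rather than editing /etc/passwd by hand, the same change can be made with the standard `usermod -s /bin/bash hdfs` command. The small helper below, with the made-up name `login_shell`, is a sketch for checking the result on each node.

```shell
# login_shell USER [PASSWD_FILE]
# Prints the login shell recorded for USER (the 7th colon-separated field).
# PASSWD_FILE defaults to /etc/passwd.
login_shell() {
    awk -F: -v user="$1" '$1 == user { print $7 }' "${2:-/etc/passwd}"
}
```

After the change, `login_shell hdfs` and `login_shell hive` should both print `/bin/bash`.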

Other notes

Fixing garbled Chinese comments in the output of desc: change the character set of the Hive metadata stored in MySQL.

use metastore;

1) Change the character set of column comments

alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;

2) Change the character set of table comments

alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;

3) Change the partition parameters so that partition keys can be written in Chinese

alter table PARTITION_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;

alter table PARTITION_KEYS modify column PKEY_COMMENT varchar(4000) character set utf8;

4) Change the character set of index comments

alter table INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;