Online Deployment of a CDH 6.1.1 Cluster on CentOS 7 (Part 1)

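1. About CDH

CDH is one of the many Hadoop distributions. It is maintained by Cloudera and built on stable releases of Apache Hadoop; the full name is Cloudera's Distribution, including Apache Hadoop. CDH makes it easy to host and manage a range of services: it provides Hadoop's core scalable storage (HDFS) and distributed computing (MapReduce), together with a web UI for management and monitoring.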

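2. Deploying CDH

2.1. Node planning

Alibaba Cloud ECS 172.19.159.5 8 GB RAM server && agent
Alibaba Cloud ECS 172.19.159.4 8 GB RAM agent
Alibaba Cloud ECS 172.19.159.6 8 GB RAM agent
All nodes run CentOS 7.5 64-bit
All nodes are on an Alibaba Cloud VPC; the addresses above are internal addresses on the same subnet
Each system disk is 50 GB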

2.2. Common environment setup (cdh1 cdh2 cdh3)

Edit hosts
vi /etc/hosts

172.19.159.5    cdh1
172.19.159.4    cdh2
172.19.159.6    cdh3   
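
Because the same three entries have to be added on every node, an idempotent append avoids duplicate lines if the step is run twice. The following is a minimal sketch, not part of the original procedure: `add_host_entry` is a hypothetical helper name, and the file path is passed in explicitly so the function can be tried on a scratch copy before it touches /etc/hosts.

```shell
#!/bin/bash
# add_host_entry FILE IP NAME  (hypothetical helper, not part of any tool)
# Appends "IP    NAME" to FILE unless NAME is already mapped there.
add_host_entry() {
    local file=$1 ip=$2 name=$3
    # Look for NAME as a whole field in any column after the address column.
    if awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s    %s\n' "$ip" "$name" >> "$file"
}
```

Run as root on each node, this would be called as `add_host_entry /etc/hosts 172.19.159.5 cdh1`, and likewise for cdh2 and cdh3.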

Disable IPv6
Alibaba Cloud servers already have it disabled by default; the file looks like this
cat /etc/modprobe.d/disable_ipv6.conf

alias net-pf-10 off
options ipv6 disable=1

On a self-built VM, configure it as follows
vi /etc/modprobe.d/dist.conf

alias net-pf-10 off
alias ipv6 off   

Create a regular user
useradd hadoop
passwd hadoop

Grant the regular user sudo privileges
vi /etc/sudoers

hadoop ALL=(root)NOPASSWD:ALL

Disable the firewall
Alibaba Cloud servers have firewalld and SELinux disabled by default

Set the open-file limits
vi /etc/security/limits.conf

* soft nofile 65535
* hard nofile 65535
* soft nproc 32000
* hard nproc 32000

Change the yum source
Not needed here; on a self-built VM you can switch to the 163 mirror

Add the cloudera-manager.repo source
vi /etc/yum.repos.d/cloudera-manager.repo

[cloudera-manager]
# Packages for Cloudera Manager, Version 6, on RedHat or CentOS 7 x86_64
name=Cloudera Manager
baseurl=https://archive.cloudera.com/cm6/6.1.1/redhat7/yum/
gpgkey=https://archive.cloudera.com/cm6/6.1.1/redhat7/yum/RPM-GPG-KEY-cloudera
gpgcheck=1
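
The version string appears twice in the repo file, so it is easy to update one URL and forget the other. As a sketch only, the file can be generated from a single version argument. `write_cm_repo` is a hypothetical helper; the URL layout is copied from the repo file above, and the sketch does not check whether the archive is reachable from your network.

```shell
#!/bin/bash
# write_cm_repo FILE VERSION  (hypothetical helper)
# Writes a Cloudera Manager 6 yum repo definition for VERSION into FILE.
write_cm_repo() {
    local file=$1 version=$2
    local base="https://archive.cloudera.com/cm6/${version}/redhat7/yum/"
    cat > "$file" <<EOF
[cloudera-manager]
name=Cloudera Manager ${version}
baseurl=${base}
gpgkey=${base}RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
}
```

For this article's setup the call would be `write_cm_repo /etc/yum.repos.d/cloudera-manager.repo 6.1.1`, followed by `yum makecache` to confirm that yum can read the new source.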

Install the JDK
yum install oracle-j2sdk1.8 -y
vi /etc/profile

export JAVA_HOME=/usr/java/jdk1.8.0_141-cloudera
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH 
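
The JAVA_HOME path above hard-codes the build number (`jdk1.8.0_141-cloudera`), which may differ depending on which oracle-j2sdk1.8 package yum actually installed. A small sketch for looking the directory up instead of guessing it; `find_cloudera_jdk` is a hypothetical helper, and the `jdk1.8*-cloudera` naming pattern is an assumption based on the path shown above.

```shell
#!/bin/bash
# find_cloudera_jdk [BASE_DIR]  (hypothetical helper)
# Prints the lexicographically last jdk1.8*-cloudera directory under BASE_DIR
# (default /usr/java); returns non-zero if none is found.
find_cloudera_jdk() {
    local base=${1:-/usr/java}
    local dir
    dir=$(ls -d "$base"/jdk1.8*-cloudera 2>/dev/null | sort | tail -n 1)
    [ -n "$dir" ] && printf '%s\n' "$dir"
}
```

Running `find_cloudera_jdk` once after the yum install prints the directory to paste into the `export JAVA_HOME=` line in /etc/profile.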

Disk optimization (swappiness)
Alibaba Cloud has this tuned by default
On a self-built VM, do the following
echo "vm.swappiness=0" >> /etc/sysctl.conf
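
Two caveats about the `echo ... >>` approach: it adds a second line every time it is run, and the new value only takes effect after `sysctl -p` or a reboot. The sketch below keeps exactly one assignment per key. `set_conf_key` is a hypothetical helper name, and it assumes a simple `key=value` file such as /etc/sysctl.conf.

```shell
#!/bin/bash
# set_conf_key FILE KEY VALUE  (hypothetical helper)
# Ensures FILE ends up with exactly one "KEY=VALUE" line for KEY.
set_conf_key() {
    local file=$1 key=$2 value=$3 tmp
    tmp=$(mktemp) || return 1
    # Copy every line except existing assignments of KEY
    # (with or without spaces around "=").
    if [ -f "$file" ]; then
        awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' "$file" > "$tmp"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the original file keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

As root: `set_conf_key /etc/sysctl.conf vm.swappiness 0 && sysctl -p`.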

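2.3. Time synchronization and passwordless SSH (cdh1 cdh2 cdh3)

NTP service
Alibaba Cloud sets up time synchronization by default
Passwordless SSH
Using cdh1 as an example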
ssh-keygen
ssh-copy-id cdh2
ssh-copy-id cdh3
Repeat the same steps on cdh2 and cdh3 so that every node can reach the others without a password
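
Since the same key exchange is repeated on each node with a different set of targets, it can be written once and parameterized by the local node name. This is a sketch under assumptions: `peers_of` and `copy_key_to_peers` are hypothetical helpers, an RSA key at `~/.ssh/id_rsa` with an empty passphrase is assumed, and `ssh-copy-id` will still prompt for each peer's password the first time.

```shell
#!/bin/bash
# peers_of SELF HOST...  (hypothetical helper)
# Prints every HOST except SELF, one per line.
peers_of() {
    local self=$1 host
    shift
    for host in "$@"; do
        [ "$host" = "$self" ] || printf '%s\n' "$host"
    done
}

# copy_key_to_peers SELF HOST...  (hypothetical helper)
# Generates a key pair if none exists, then copies the public key to each peer.
copy_key_to_peers() {
    local peer
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N '' -f "$HOME/.ssh/id_rsa"
    for peer in $(peers_of "$@"); do
        ssh-copy-id "$peer"
    done
}
```

On cdh1 the call is `copy_key_to_peers cdh1 cdh1 cdh2 cdh3`; on cdh2 only the first argument changes.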

2.4. Installing MySQL 5.7 (cdh1)

Download the MySQL repository package
wget https://dev.mysql.com/get/mysql80-community-release-el7-1.noarch.rpm \
&& rpm --import /etc/pki/rpm-gpg/RPM* \
&& rpm -Uvh mysql80-community-release-el7-1.noarch.rpm
Add the MySQL repository
vi /etc/yum.repos.d/mysql-community.repo

# Enable to use MySQL 5.5
[mysql55-community]
name=MySQL 5.5 Community Server
baseurl=http://repo.mysql.com/yum/mysql-5.5-community/el/7/$basearch/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

# Enable to use MySQL 5.6
[mysql56-community]
name=MySQL 5.6 Community Server
baseurl=http://repo.mysql.com/yum/mysql-5.6-community/el/7/$basearch/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

# Enable to use MySQL 5.7
[mysql57-community]
name=MySQL 5.7 Community Server
baseurl=http://repo.mysql.com/yum/mysql-5.7-community/el/7/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

[mysql80-community]
name=MySQL 8.0 Community Server
baseurl=http://repo.mysql.com/yum/mysql-8.0-community/el/7/$basearch/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

[mysql-connectors-community]
name=MySQL Connectors Community
baseurl=http://repo.mysql.com/yum/mysql-connectors-community/el/7/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

[mysql-tools-community]
name=MySQL Tools Community
baseurl=http://repo.mysql.com/yum/mysql-tools-community/el/7/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

[mysql-tools-preview]
name=MySQL Tools Preview
baseurl=http://repo.mysql.com/yum/mysql-tools-preview/el/7/$basearch/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

[mysql-cluster-7.5-community]
name=MySQL Cluster 7.5 Community
baseurl=http://repo.mysql.com/yum/mysql-cluster-7.5-community/el/7/$basearch/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

[mysql-cluster-7.6-community]
name=MySQL Cluster 7.6 Community
baseurl=http://repo.mysql.com/yum/mysql-cluster-7.6-community/el/7/$basearch/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-m

Install MySQL 5.7
yum -y install mysql-community-server
systemctl start mysqld
MySQL 5.7 generates a temporary root password during initialization; it can be found in the MySQL log
tail -n 200 /var/log/mysqld.log | grep password | awk -F ':' '{print $4}'
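mysql -uroot -p'<temporary password>'
Note that the password is likely to contain special characters, so quote it (or escape those characters) for the shell, for example
mysql -uroot -p'+),NGiOWm3jq'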
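
Splitting the log line on `:` with awk leaves a leading space on the result and truncates any password that itself contains a colon. A sketch that strips the fixed prefix instead; `extract_temp_password` is a hypothetical helper, and it assumes the MySQL 5.7 log line ends with `root@localhost: <password>`.

```shell
#!/bin/bash
# extract_temp_password [LOGFILE]  (hypothetical helper)
# Prints the most recent temporary root password recorded in LOGFILE.
extract_temp_password() {
    local log=${1:-/var/log/mysqld.log}
    # Keep only the last match and strip everything up to "root@localhost: ".
    grep 'temporary password' "$log" | tail -n 1 | sed 's/^.*root@localhost: //'
}
```

Because the result is used inside double quotes, no manual escaping is needed: `mysql -uroot -p"$(extract_temp_password)"`.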
Change the local root password
After logging in you must change the password of root@localhost; otherwise you cannot create any databases
set password for root@localhost = password('Qazwsx!23');
flush privileges;
Log in with the new password to verify that the change took effect
Create a superuser for remote connections
grant all on *.* to 'root'@'%' identified by 'Qazwsx!23';
flush privileges;

Create the CDH databases
cat dbcreate.sh

#!/bin/bash
# Create one database and one same-named user per Cloudera service.

dbs=(scm amon rman hue metastore sentry nav navms oozie)
for db in "${dbs[@]}"
do
# Each user is granted privileges on its own database only.
mysql -uroot -p'Qazwsx!23' -e "create database $db DEFAULT CHARSET utf8 COLLATE utf8_general_ci;GRANT ALL ON $db.* TO '$db'@'%' identified by 'Qazwsx!23';"
done
mysql -uroot -p'Qazwsx!23' -e "flush privileges;"
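
Passing the password with `-p` on the command line exposes it in the process list and in shell history. One alternative is a client option file that only its owner can read. A minimal sketch: `write_mysql_client_cnf` and the file path in the usage note are hypothetical, while `[client]` sections and `--defaults-extra-file` are standard MySQL client features.

```shell
#!/bin/bash
# write_mysql_client_cnf FILE USER PASSWORD  (hypothetical helper)
# Writes a [client] option file readable only by its owner.
write_mysql_client_cnf() {
    local file=$1 user=$2 password=$3
    # Create (or truncate) the file with owner-only permissions before
    # the secret is written into it.
    ( umask 077 && : > "$file" ) || return 1
    chmod 600 "$file"
    printf '[client]\nuser=%s\npassword="%s"\n' "$user" "$password" > "$file"
}
```

After `write_mysql_client_cnf /root/.cdh-my.cnf root 'Qazwsx!23'`, the script's calls can become `mysql --defaults-extra-file=/root/.cdh-my.cnf -e "..."`; note that `--defaults-extra-file` has to be the first option on the command line.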

2.5. Deploying Cloudera Manager (cdh1)

Install the server
yum install cloudera-manager-daemons cloudera-manager-server -y
Install MySQL Connector/J
mkdir -p /usr/share/java
tar zxvf mysql-connector-java-5.1.46.tar.gz -C /usr/share/java
cd /usr/share/java/mysql-connector-java-5.1.46
mv mysql-connector-java-5.1.46-bin.jar mysql-connector-java.jar
mv mysql-connector-java.jar ../
Start the CDH server
/opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
systemctl start cloudera-scm-server
Check whether it started successfully
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
If the following line appears

2020-02-05 23:21:01,567 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty serve

the server has started successfully
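
Watching `tail -f` by hand works for a one-off install; in a script the same check can be a polling loop with a timeout. A sketch only: `wait_for_log_line` is a hypothetical helper, and it matches any occurrence in the file, so a line left over from an earlier start would also satisfy it.

```shell
#!/bin/bash
# wait_for_log_line FILE PATTERN [TIMEOUT_SECONDS]  (hypothetical helper)
# Returns 0 as soon as PATTERN appears in FILE, 1 if the timeout elapses first.
wait_for_log_line() {
    local file=$1 pattern=$2 timeout=${3:-300} waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if grep -q "$pattern" "$file" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

For example: `wait_for_log_line /var/log/cloudera-scm-server/cloudera-scm-server.log 'Started Jetty' 600 && echo "server is up"`.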
Open 106.15.67.195:7180/ in a browser
[Screenshot: 02cloud1]

2.6. Deploying the CDH agents (cdh1 cdh2 cdh3)

On cdh1
yum -y install cloudera-manager-agent
On cdh2 and cdh3
yum install cloudera-manager-agent cloudera-manager-daemons -y
Edit the configuration file
vi /etc/cloudera-scm-agent/config.ini

server_host=cdh1   

systemctl start cloudera-scm-server   # cdh1 only; the server is not installed on cdh2 and cdh3
systemctl start cloudera-scm-agent
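
The `server_host` edit has to be made identically on all three nodes, so a non-interactive version is handy. A sketch, assuming config.ini already contains a `server_host=` line; `set_server_host` is a hypothetical helper and returns non-zero if no such line was found.

```shell
#!/bin/bash
# set_server_host FILE HOST  (hypothetical helper)
# Points the agent at HOST by rewriting the existing server_host line in FILE.
set_server_host() {
    local file=$1 host=$2 tmp
    tmp=$(mktemp) || return 1
    # Rewrite only the server_host line; every other line is copied unchanged.
    sed "s/^server_host=.*/server_host=${host}/" "$file" > "$tmp" || return 1
    cat "$tmp" > "$file" && rm -f "$tmp"
    # Succeed only if the file now really contains the expected line.
    grep -q "^server_host=${host}\$" "$file"
}
```

On each node: `set_server_host /etc/cloudera-scm-agent/config.ini cdh1 && systemctl restart cloudera-scm-agent`.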

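2.7. Installing and configuring CDH components

Add the cluster hosts
[Screenshot: 02cloud2]
Install the parcels
[Screenshot: 02cloud3]
Inspection
Two warnings are reported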

Transparent Huge Page Compaction is enabled and can cause significant performance problems. Run "echo never > /sys/kernel/mm/transparent_hugepage/defrag" and "echo never > /sys/kernel/mm/transparent_hugepage/enabled" to disable this, and then add the same command to an init script such as /etc/rc.local so it will be set on system reboot. The following hosts are affected:
View Details
cdh[1-3]

As prompted, disable transparent huge pages to avoid performance problems, and add the two commands to the system startup script
echo "echo never > /sys/kernel/mm/transparent_hugepage/defrag" >> /etc/rc.local
echo "echo never > /sys/kernel/mm/transparent_hugepage/enabled" >> /etc/rc.local
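
Appending to /etc/rc.local only affects the next boot, and on CentOS 7 the target file /etc/rc.d/rc.local is not executable by default, so it needs `chmod +x /etc/rc.d/rc.local` before the commands will run at startup. To apply the setting to the running system, also execute the two `echo never > ...` commands from the warning directly. The kernel reports the active choice in square brackets (for example `always madvise [never]`), which the following sketch extracts; `thp_active` is a hypothetical helper name.

```shell
#!/bin/bash
# thp_active FILE  (hypothetical helper)
# Prints the value shown in square brackets, e.g. "never" for
# a file containing "always madvise [never]".
thp_active() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}
```

`thp_active /sys/kernel/mm/transparent_hugepage/enabled` and `thp_active /sys/kernel/mm/transparent_hugepage/defrag` should both print `never` before re-running the host inspection.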

Starting with CDH 6, PostgreSQL-backed Hue requires the Psycopg2 version to be at least 2.5.4, see the documentation for more information. This warning can be ignored if hosts will not run CDH 6, or will not run Hue with PostgreSQL, or is not RHEL 6 compatible. The following hosts have an incompatible Psycopg2 version of '2.5.1':
View Details
cdh[1-3]
Fix: this warning can be ignored; to clear it anyway, run
yum -y install python-pip && pip install psycopg2==2.7.5 --ignore-installed

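Cluster setup
Choose custom services
Select the following services
[Screenshot: 02cloud4]
Role assignment: keep the defaults
Database configuration
[Screenshot: 02cloud5]
Cluster review: keep the defaults
First run
[Screenshot: 02cloud6]
[Screenshot: 02cloud7]
Installation complete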

2.8. Problems

A few problems came up during installation; they are recorded here

Under-Replicated Blocks

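Our test cluster has only two DataNodes, but dfs.replication still has its default value of 3 and has not been changed since the cluster was built. Every file therefore asks for 3 replicas while only 2 can actually be written, so the replication factor needs to be lowered to 2.
Fix:
Run the following on cdh1
su hdfs (the hdfs user must first be given a login shell; edit /etc/passwd)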
hadoop fs -ls
hadoop fs -chmod 777 /user
hadoop fs -setrep -R -w 2 /
TIP: with three or more DataNodes this problem does not occur in the first place
[Screenshot: 02cloud8]
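
Two hedged additions to the fix above. First, running the commands as `sudo -u hdfs hadoop fs -setrep -R -w 2 /` avoids editing /etc/passwd, because sudo does not go through the hdfs user's login shell. Second, `-setrep` only changes files that already exist; files written later still follow the dfs.replication setting. To confirm that the warning has cleared, the summary printed by `hdfs fsck /` can be parsed. The sketch below assumes the summary contains a line of the form `Under-replicated blocks:<tab>N (P %)`; `under_replicated_count` is a hypothetical helper.

```shell
#!/bin/bash
# under_replicated_count  (hypothetical helper)
# Reads `hdfs fsck` output on stdin and prints the under-replicated block count.
under_replicated_count() {
    awk -F: '/Under-replicated blocks/ { split($2, parts, " "); print parts[1]; exit }'
}
```

Usage: `sudo -u hdfs hdfs fsck / | under_replicated_count`; the goal is for the printed count to reach 0.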

3. References

Installation and configuration of a CDH cluster (CDH集群的安装配置)