目录
前言
1. 相关安装文件
2. 虚拟机、mysql数据库安装
2.1 安装VMware 15 以及 linux 操作系统
2.2 安装mysql数据库
2.2.1安装环境准备
2.2.2 mysql 数据库安装
2.3 克隆centos虚拟机
3. 配置一主两从mysql集群模式
3.1配置主库
3.2 配置从库
4. 配置半同步复制
4.1 主库配置
4.2 从库配置
5.配置 MHA
5.1 MHA说明
5.2 配置MHA
5.2.1 安装新的Centos服务器,用于安装MHA
5.2.2 安装前配置
5.2.2 安装MHA-node
5.2.3 安装MHA-manager
5.2.4测试主备切换
5.2.5 恢复停掉的主库
前言
本文章是《拉勾高薪训练营》中的《mysql海量数据存储与优化》 的作业
参考博客:另一个同学写的mysql搭建主从模式高可用
参考博客:mha0.56版本安装使用排错
参考博客:VMware创建CentOS7虚拟机
参考博客:linux 使用ssh密钥实现免密码登录
1. 相关安装文件
- 虚拟机环境VMware15
- linux CentOS 7 64位系统镜像iso
- mysql安装包
- mha安装包
- SecureCRT
2. 虚拟机、mysql数据库安装
2.1 安装VMware 15 以及 linux 操作系统
(1)从网上下载 VMware 15安装包,安装虚拟机,这里就不做叙述
(2)打开VMware 15 创建虚拟机,按照提示安装 CentOS 7 操作系统
注意事项:笔者在安装CentOS 时,提示虚拟机打印机设备未启用,然后虚拟机黑屏,无法关闭虚拟机,无法重启。
在虚拟机页面(编辑>首选项>设备>虚拟机打印机>启用虚拟打印机),选上启用虚拟打印机后,就可用正常启用虚拟机了。如果有同学遇到相同的情况,在此提供一个参考
2.2 安装mysql数据库
2.2.1安装环境准备
(1)用SecureCRT连接VMware虚拟机
(2)在home下创建目录mysql,/home/mysql
(2)使用SeureCRT自带的SecureFX上传mysql安装包
2.2.2 mysql 数据库安装
(1) 确认linux是由有原装mysql,如果有的话,就删掉
[root@localhost mha]# rpm -qa | grep mariadb
mariadb-libs-5.5.41-2.el7_0.x86_64
[root@localhost mha]# rpm -e mariadb-libs-5.5.41-2.el7_0.x86_64 --nodeps
(2)进入到mysql安装包的上传目录/home/mysql,解压安装包,并执行安装命令
[root@localhost mysql]# cd /home/mysql/
[root@localhost mysql]# tar -xvf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar
mysql-community-embedded-5.7.28-1.el7.x86_64.rpm
mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm
mysql-community-devel-5.7.28-1.el7.x86_64.rpm
mysql-community-embedded-compat-5.7.28-1.el7.x86_64.rpm
mysql-community-libs-5.7.28-1.el7.x86_64.rpm
mysql-community-test-5.7.28-1.el7.x86_64.rpm
mysql-community-common-5.7.28-1.el7.x86_64.rpm
mysql-community-embedded-devel-5.7.28-1.el7.x86_64.rpm
mysql-community-client-5.7.28-1.el7.x86_64.rpm
mysql-community-server-5.7.28-1.el7.x86_64.rpm
(3)解压后的组件不需要全部安装,安装其中的几个就可用
[root@localhost mysql] rpm -ivh mysql-community-common-5.7.28-1.el7.x86_64.rpm
[root@localhost mysql] rpm -ivh mysql-community-libs-5.7.28-1.el7.x86_64.rpm
[root@localhost mysql] rpm -ivh mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm
[root@localhost mysql] rpm -ivh mysql-community-client-5.7.28-1.el7.x86_64.rpm
[root@localhost mysql] rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm
[root@localhost mysql] rpm -ivh mysql-community-devel-5.7.28-1.el7.x86_64.rpm
(4) 进行mysql初始化,初始化完成后,会给root用户生成一个默认密码,一般在/var/log/mysqld.log日志文件中的最下面
[root@localhost mysql] mysqld --initialize --user=mysql # --user 是可选的参数,意思是创建一个名称是 mysql 的用户
[root@localhost mysql] tail -f /var/log/mysqld.log
A temporary password is generated for root@localhost:decGRE#$
(5)启动mysql,并查看mysql的运行状态
[root@localhost mysql] systemctl start mysqld.service
[root@localhost mysql] systemctl status mysqld.service
(6)使用上面的密码登录mysql,并修改root密码
[root@localhost mysql] mysql -uroot -pdecGRE#$
mysql > set password=password('123456');
(7)开放3306端口或者关闭防火墙,因为我这里是测试环境,所以直接关闭了防火墙,在正式环境需要开放3306端口
[root@localhost mysql] systemctl stop firewalld # firewalld 防火墙,CentOS 自带
[root@localhost mysql] systemctl disable firewalld.service # 彻底一点,从开机启动中禁用掉
2.3 克隆centos虚拟机
因为我是练习环境,直接复制虚拟机
配置好mysql之后,直接拷贝两份虚拟机,这样就不需要再配置两次虚拟机环境了
(1) 右键点击已配置好的虚拟机>管理>克隆
(2) 需要注意,复制的虚拟机中要修改一下mysql的 /var/lib/mysql/auto.cnf,这个里面记录的是mysql的UUID,每个MySQL环境不能一样
(3)修复方式就是将这个文件删了,重启一下mysql就可以了
[root@localhost mysql]# systemctl stop mysqld.service
[root@localhost mysql]# mv /var/lib/mysql/auto.cnf /var/lib/mysql/auto.cnf.bak
[root@localhost mysql]# systemctl start mysqld.service
3. 配置一主两从mysql集群模式
3.1配置主库
(1)选一台虚拟机做为主数据库,修改配置文件/etc/my.cnf,保存退出
# log_bin
server-id=1 # 设置 server-id,每个数据库不能重复,必须
log_bin=mysql-bin # 指定 binlog 的名称,相当于开启 bin log, 必须
sync-binlog=1 # 开始刷新 bin log 到磁盘,每次有更新事务,完成后都要马上刷新到磁盘
binlog-ignore-db=performance_schema # binlog 中忽略的库
binlog-ignore-db=information_schema
binlog-ignore-db=sys
#binlog-do-db=lagou # 可以使用这个参数指定只同步那个库
(2)重启mysql
[root@localhost mysql]# systemctl restart mysqld
(3)登录进mysql,进行授权操作
mysql > grant replication slave on *.* to 'root'@'%' identified by 'root';
mysql > grant all privileges on *.* to 'root'@'%' identified by 'root';
mysql > flush privileges;
(4)查看一下master的状态,注意下面的File和Position参数,之后配置从库时需要用到
mysql> show master status;
+------------------+----------+--------------+-------------------------------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+-------------------------------------------+-------------------+
| mysql-bin.000003 | 154 | | performance_schema,information_schema,sys | |
+------------------+----------+--------------+-------------------------------------------+-------------------+
3.2 配置从库
(1)修改每个从库的配置文件/etc/my.cnf
server-id=3 # 注意一个集群环境中的 id 不能冲突
relay_log=mysql-relay-bin
# read_only=1 # 开启只读
(2)重启数据库,配置从库参数,需要注意两个参数
master_log_file:使用的是主库中bin log文件名,使用的是3.1.4的master状态中的File这一个参数的值
master_log_pos:主库bin log文件当前写的位置,使用的是3.1.4的master状态中的Position这一个参数的值
[root@localhost mysql]# systemctl restart mysqld
[root@localhost mysql]# mysql -uroot -proot
mysql > show slave status; # 查看 salve 状态,如果已经是开启状态,需要先 stop slave;修改配置后,在 start slave;
# 设置主库的 bin log 信息(用到了主库中当前 bin log 的文件名称以及当前写位置)
mysql > change master to master_host='192.168.122.128',master_port=3306,master_user='root',master_password='root',master_log_file='mysql-bin.000003',master_log_pos=154;
mysql > show slave status; # 查看从库状态, 第一次配置,此时连接 master 的状态应该是 No
mysql > start slave; # 开启 salve 模式
(3) 查看从库状态,证明从库连上了master
mysql> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.122.128
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000012
Read_Master_Log_Pos: 433
Relay_Log_File: mysql-relay-bin.000038
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000012
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 433
Relay_Log_Space: 1569
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: a9e183ca-dd0e-11ea-b0d5-000c2936acbb
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
(4) 经过上面的配置,主库中的所有DDL操作都会同步到 从库中
4. 配置半同步复制
说明:上面一主两从集群配置,主要的缺点是同步延迟,从mysql 5.7之后提供了半同步复制和并行复制,这里只配置了半同步复制,如果有需要配置并行复制的小伙伴,请参考另一个同学写的mysql搭建主从模式高可用中的并行复制配置
4.1 主库配置
(1)登录mysql数据库,查询当前数据库是否支持插件,笔者的mysql环境支持插件,如果各位有不支持的情况,请自行百度
[root@localhost mysql]# mysql -uroot -proot
mysql> select @@have_dynamic_loading;
+------------------------+
| @@have_dynamic_loading |
+------------------------+
| YES |
+------------------------+
(2)查看是否已安装插件rpl_semi_sync,在主节点,会把这个插件的名字写做rpl_semi_sync_master。我这里是已经安装了插件
mysql> show plugins;
+----------------------------+----------+--------------------+--------------------+---------+
| Name | Status | Type | Library | License |
+----------------------------+----------+--------------------+--------------------+---------+
| binlog | ACTIVE | STORAGE ENGINE | NULL | GPL |
| rpl_semi_sync_master | ACTIVE | REPLICATION | semisync_master.so | GPL |
+----------------------------+----------+--------------------+--------------------+---------+
(3)如果没有安装插件的话,执行安装插件命令
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so'; # soname 后面是个别名
(4)安装完成后,调整参数
参数说明:
rpl_semi_sync_master_enabled:主库是否开启半同步复制,1为ON,0为OFF
rpl_semi_sync_master_timeout:一个以毫秒为单位的值,用于控制主服务器等待来自从服务器的确认提交并恢复到异步复制的时间,超过这个值就是超时。 默认值是10000(10秒)。超时之后,就从半同步复制,返回到异步复制。
mysql> show variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | OFF |
| rpl_semi_sync_master_timeout | 10000 |
| rpl_semi_sync_master_trace_level | 32 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set (0.00 sec)
mysql> set global rpl_semi_sync_master_enabled=1;
Query OK, 0 rows affected (0.00 sec)
mysql> set global rpl_semi_sync_master_timeout=1000;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | ON |
| rpl_semi_sync_master_timeout | 1000 |
| rpl_semi_sync_master_trace_level | 32 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set (0.00 sec)
4.2 从库配置
(1)和主库一样,需要查看是否支持插件,以及是否安装了rpl_semi_sync插件,如果没有安装插件的话,需要安装rpl_semi_sync_slave
(2)安装rpl_semi_sync_slave
mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
(3)配置从库的半同步参数
mysql> show variables like '%semi%';
+---------------------------------+-------+
| Variable_name | Value |
+---------------------------------+-------+
| rpl_semi_sync_slave_enabled | OFF |
| rpl_semi_sync_slave_trace_level | 32 |
+---------------------------------+-------+
2 rows in set (0.00 sec)
mysql> set global rpl_semi_sync_slave_enabled=1;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like '%semi%';
+---------------------------------+-------+
| Variable_name | Value |
+---------------------------------+-------+
| rpl_semi_sync_slave_enabled | ON |
| rpl_semi_sync_slave_trace_level | 32 |
+---------------------------------+-------+
2 rows in set (0.00 sec)
(4)重启一下从库的slave
mysql> stop slave;
Query OK, 0 rows affected (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
5.配置 MHA
5.1 MHA说明
MHA(Master High Availability)是一套比较成熟的 MySQL 高可用方案,也是一款优秀的故障切换和主从提升的高可用软件。
在MySQL故障切换过程中,MHA能做到在30秒之内自动完成数据库的故障切换操作,并且在进行故障切换的过程中,MHA能在最大程度上保证数据的一致性,以达到真正意义上的高可用。
MHA还支持在线快速将Master切换到其他主机,通常只需0.5-2秒。
5.2 配置MHA
5.2.1 安装新的Centos服务器,用于安装MHA
详细安装步骤参考上面的Centos 虚拟机安装
5.2.2 安装前配置
(1)配置/etc/hosts文件
需要在mysql一主两从,以MHA机器上配置好ip的别名。4台机器都要配置
[root@mysql-master ~]# cat /etc/host
cat: /etc/host: No such file or directory
[root@mysql-master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.122.128 mysql-master
192.168.122.129 mysql-slave-1
192.168.122.130 mysql-slave-2
192.168.122.131 mha-manager
[root@mysql-master ~]#
(2)配置SSH免密钥登录
参考linux 使用ssh密钥实现免密码登录
注意事项:假如你的id_rsa.pub文件的内容以root@localhost.localdomain结尾。你需要将其改成机器的ip别名
例如在mysql-master这台机器上,就要改成root@mysql-master。否则配置的ssh免秘密登录不起作用
[root@mysql-master ~]# cat /root/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDX0pWml2nhDnO6Fkvf7BiEgHJWeHsSnKvw6pmaHezHST4RjvnB2BgW9yy+phPCm7RikORgDC7QywmsvVkHzQjdcIrFBy7+frl5AiZXthLDk+nYS1Vr8N2Wi7jMDNnvchyB/7GqyR9Lf7qKI1NdGV7XRnuFtSb8OP/4oAMA0moiI1xY+5j0gP53rZoZywp8FUWFnPY2I8EyaKQDaJNoVBEIRwDlA8fg6YVVKmhpOWkZbTvJ9yn0zMGgQy0RMMXg6Q/RZDBlqZtETR8kI30Hj6esIC3FTuCp5a+cf0cV8jcoCA4FOmQl0e1n46oTsH2mNMykJR//wGTvyOlspWU2lNnn root@localhost.localdomain
(3) 安装相关依赖包,所有的mysql节点和MHA节点都要安装
# 安装一个epel源
wget -O /etc/yum.repos.d/epel-7.repo http://mirrors.aliyun.com/repo/epel-7.repo
# 用yum安装依赖包 (CentOS 8 安装 perl-DBD-MySQL 即可)
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
(4) 修改两个从库配置文件/etc/my.cnf
relay_log_purge = 0 # 不自动删除relay log
log_bin=mysql-bin # 同样开启 bin log
# 因为主库中配置了 bin log 库过滤,从库也必须配置一样的库过滤
binlog-ignore-db=performance_schema
binlog-ignore-db=information_schema
binlog-ignore-db=sys
5.2.2 安装MHA-node
(1)在所有的数据库节点上安装MHA-node,一主两从节点以及mha-manager节点上都要安装
将文件mha4mysql-node-0.58-0.el7.centos.noarch.rpm上传到各个虚拟机上,执行安装命令
[root@mysql-master ~]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
(2)mha授权,在各个节点上登录mysql进行授权
[root@mysql-master ~]# mysql -uroot -proot
mysql > grant all on *.* to 'mha'@'%' identified by 'mha';
5.2.3 安装MHA-manager
(1)将mha4mysql-manager-0.58-0.el7.centos.noarch.rpm上传到mha-manager节点上
(2)创建mha目录和配置文件
[root@mysql-master ~]# mkdir -p /etc/mha
[root@mysql-master ~]# touch /etc/mha/app1.cnf
(3)编辑app1.conf文件,添加以下内容,注意下面的hostname属性需要填各个mysql节点的ip地址
[server default]
#MHA日志名字
manager_log=/etc/mha/manager.log
#MHA的工作目录
manager_workdir=/etc/mha
#数据库binlog存放路径, 如果在 maser 库中自定义了 bin log 的目录,则需要在这里配置
#master_binlog_dir=/var/logs/
#mha管理用户的用户名, 全面所有数据库中已经给这个用户做了授权
user=root
#mha管理用户的密码
password=root
#监测心跳,每隔2秒监测一次(默认是3秒)
ping_interval=2
#ssh远程连接用户(做完免密的)
ssh_user=root
[server1]
hostname=192.168.122.128
port=3306
[server2]
hostname=192.168.122.129
port=3306
[server3]
hostname=192.168.122.130
port=3306
(4)测试ssh免密钥登录
[root@mha-manager ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf
Sat Aug 15 12:01:25 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug 15 12:01:25 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sat Aug 15 12:01:25 2020 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sat Aug 15 12:01:25 2020 - [info] Starting SSH connection tests..
Sat Aug 15 12:01:26 2020 - [debug]
Sat Aug 15 12:01:25 2020 - [debug] Connecting via SSH from root@192.168.122.128(192.168.122.128:22) to root@192.168.122.129(192.168.122.129:22)..
Sat Aug 15 12:01:26 2020 - [debug] ok.
Sat Aug 15 12:01:26 2020 - [debug] Connecting via SSH from root@192.168.122.128(192.168.122.128:22) to root@192.168.122.130(192.168.122.130:22)..
Sat Aug 15 12:01:26 2020 - [debug] ok.
Sat Aug 15 12:01:27 2020 - [debug]
Sat Aug 15 12:01:26 2020 - [debug] Connecting via SSH from root@192.168.122.129(192.168.122.129:22) to root@192.168.122.128(192.168.122.128:22)..
Sat Aug 15 12:01:26 2020 - [debug] ok.
Sat Aug 15 12:01:26 2020 - [debug] Connecting via SSH from root@192.168.122.129(192.168.122.129:22) to root@192.168.122.130(192.168.122.130:22)..
Sat Aug 15 12:01:26 2020 - [debug] ok.
Sat Aug 15 12:01:28 2020 - [debug]
Sat Aug 15 12:01:26 2020 - [debug] Connecting via SSH from root@192.168.122.130(192.168.122.130:22) to root@192.168.122.128(192.168.122.128:22)..
Sat Aug 15 12:01:26 2020 - [debug] ok.
Sat Aug 15 12:01:26 2020 - [debug] Connecting via SSH from root@192.168.122.130(192.168.122.130:22) to root@192.168.122.129(192.168.122.129:22)..
Sat Aug 15 12:01:27 2020 - [debug] ok.
Sat Aug 15 12:01:28 2020 - [info] All SSH connection tests passed successfully.
(5)测试主从状态
[root@mha-manager ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
Sat Aug 15 11:26:06 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug 15 11:26:06 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sat Aug 15 11:26:06 2020 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sat Aug 15 11:26:06 2020 - [info] MHA::MasterMonitor version 0.58.
Sat Aug 15 11:26:07 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
at /usr/share/perl5/vendor_perl/MHA/DBHelper.pm line 208.
at /usr/share/perl5/vendor_perl/MHA/Server.pm line 166.
Sat Aug 15 11:26:07 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
at /usr/share/perl5/vendor_perl/MHA/DBHelper.pm line 208.
at /usr/share/perl5/vendor_perl/MHA/Server.pm line 166.
Sat Aug 15 11:26:07 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
at /usr/share/perl5/vendor_perl/MHA/DBHelper.pm line 208.
at /usr/share/perl5/vendor_perl/MHA/Server.pm line 166.
Sat Aug 15 11:26:08 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations
Sat Aug 15 11:26:08 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Sat Aug 15 11:26:08 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sat Aug 15 11:26:08 2020 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
(6)出现以上错误,百度后发现,是因为我将linux 自带的mariadb-libs删除的缘故,重新安装mariadb
[root@mha-manager ~]# yum install mariadb-server
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirror.bit.edu.cn
* extras: mirrors.cn99.com
* updates: mirrors.cn99.com
Resolving Dependencies
--> Running transaction check
---> Package mariadb-server.x86_64 1:5.5.65-1.el7 will be installed
--> Processing Dependency: mariadb-libs(x86-64) = 1:5.5.65-1.el7 for package: 1:mariadb-server-5.5.65-1.el7.x86_64
--> Processing Dependency: mariadb(x86-64) = 1:5.5.65-1.el7 for package: 1:mariadb-server-5.5.65-1.el7.x86_64
--> Running transaction check
---> Package mariadb.x86_64 1:5.5.65-1.el7 will be installed
---> Package mariadb-libs.x86_64 1:5.5.65-1.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
=============================================================================================================================================================
Package Arch Version Repository Size
=============================================================================================================================================================
Installing:
mariadb-server x86_64 1:5.5.65-1.el7 base 11 M
Installing for dependencies:
mariadb x86_64 1:5.5.65-1.el7 base 8.7 M
mariadb-libs x86_64 1:5.5.65-1.el7 base 759 k
Transaction Summary
=============================================================================================================================================================
Install 1 Package (+2 Dependent packages)
Total size: 21 M
Total download size: 20 M
Installed size: 111 M
Is this ok [y/d/N]: y
Downloading packages:
(1/2): mariadb-server-5.5.65-1.el7.x86_64.rpm | 11 MB 00:00:04
(2/2): mariadb-5.5.65-1.el7.x86_64.rpm | 8.7 MB 00:00:05
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 3.4 MB/s | 20 MB 00:00:05
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : 1:mariadb-libs-5.5.65-1.el7.x86_64 1/3
Installing : 1:mariadb-5.5.65-1.el7.x86_64 2/3
Installing : 1:mariadb-server-5.5.65-1.el7.x86_64 3/3
Verifying : 1:mariadb-server-5.5.65-1.el7.x86_64 1/3
Verifying : 1:mariadb-libs-5.5.65-1.el7.x86_64 2/3
Verifying : 1:mariadb-5.5.65-1.el7.x86_64 3/3
Installed:
mariadb-server.x86_64 1:5.5.65-1.el7
Dependency Installed:
mariadb.x86_64 1:5.5.65-1.el7 mariadb-libs.x86_64 1:5.5.65-1.el7
Complete!
(7)重新测试,测试成功
[root@mha-manager ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
Sat Aug 15 11:38:40 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug 15 11:38:40 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sat Aug 15 11:38:40 2020 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sat Aug 15 11:38:40 2020 - [info] MHA::MasterMonitor version 0.58.
Sat Aug 15 11:38:43 2020 - [info] GTID failover mode = 0
Sat Aug 15 11:38:43 2020 - [info] Dead Servers:
Sat Aug 15 11:38:43 2020 - [info] Alive Servers:
Sat Aug 15 11:38:43 2020 - [info] 192.168.122.128(192.168.122.128:3306)
Sat Aug 15 11:38:43 2020 - [info] 192.168.122.129(192.168.122.129:3306)
Sat Aug 15 11:38:43 2020 - [info] 192.168.122.130(192.168.122.130:3306)
Sat Aug 15 11:38:43 2020 - [info] Alive Slaves:
Sat Aug 15 11:38:43 2020 - [info] 192.168.122.129(192.168.122.129:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sat Aug 15 11:38:43 2020 - [info] Replicating from 192.168.122.128(192.168.122.128:3306)
Sat Aug 15 11:38:43 2020 - [info] 192.168.122.130(192.168.122.130:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sat Aug 15 11:38:43 2020 - [info] Replicating from 192.168.122.128(192.168.122.128:3306)
Sat Aug 15 11:38:43 2020 - [info] Current Alive Master: 192.168.122.128(192.168.122.128:3306)
Sat Aug 15 11:38:43 2020 - [info] Checking slave configurations..
Sat Aug 15 11:38:43 2020 - [info] read_only=1 is not set on slave 192.168.122.129(192.168.122.129:3306).
Sat Aug 15 11:38:43 2020 - [info] read_only=1 is not set on slave 192.168.122.130(192.168.122.130:3306).
Sat Aug 15 11:38:43 2020 - [info] Checking replication filtering settings..
Sat Aug 15 11:38:43 2020 - [info] binlog_do_db= , binlog_ignore_db= information_schema,performance_schema,sys
Sat Aug 15 11:38:43 2020 - [info] Replication filtering check ok.
Sat Aug 15 11:38:43 2020 - [info] GTID (with auto-pos) is not supported
Sat Aug 15 11:38:43 2020 - [info] Starting SSH connection tests..
Sat Aug 15 11:38:45 2020 - [info] All SSH connection tests passed successfully.
Sat Aug 15 11:38:45 2020 - [info] Checking MHA Node version..
Sat Aug 15 11:38:47 2020 - [info] Version check ok.
Sat Aug 15 11:38:47 2020 - [info] Checking SSH publickey authentication settings on the current master..
Sat Aug 15 11:38:47 2020 - [info] HealthCheck: SSH to 192.168.122.128 is reachable.
Sat Aug 15 11:38:48 2020 - [info] Master MHA Node version is 0.58.
Sat Aug 15 11:38:48 2020 - [info] Checking recovery script configurations on 192.168.122.128(192.168.122.128:3306)..
Sat Aug 15 11:38:48 2020 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000014
Sat Aug 15 11:38:48 2020 - [info] Connecting to root@192.168.122.128(192.168.122.128:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to mysql-bin.000014
Sat Aug 15 11:38:48 2020 - [info] Binlog setting check done.
Sat Aug 15 11:38:48 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Aug 15 11:38:48 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.122.129 --slave_ip=192.168.122.129 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.28-log --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=mysql-relay-bin.000045 --slave_pass=xxx
Sat Aug 15 11:38:48 2020 - [info] Connecting to root@192.168.122.129(192.168.122.129:22)..
Checking slave recovery environment settings..
Relay log found at /var/lib/mysql, up to mysql-relay-bin.000045
Temporary relay log file is /var/lib/mysql/mysql-relay-bin.000045
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sat Aug 15 11:38:49 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.122.130 --slave_ip=192.168.122.130 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.28-log --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=mysql-relay-bin.000038 --slave_pass=xxx
Sat Aug 15 11:38:49 2020 - [info] Connecting to root@192.168.122.130(192.168.122.130:22)..
Checking slave recovery environment settings..
Relay log found at /var/lib/mysql, up to mysql-relay-bin.000038
Temporary relay log file is /var/lib/mysql/mysql-relay-bin.000038
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sat Aug 15 11:38:50 2020 - [info] Slaves settings check done.
Sat Aug 15 11:38:50 2020 - [info]
192.168.122.128(192.168.122.128:3306) (current master)
+--192.168.122.129(192.168.122.129:3306)
+--192.168.122.130(192.168.122.130:3306)
Sat Aug 15 11:38:50 2020 - [info] Checking replication health on 192.168.122.129..
Sat Aug 15 11:38:50 2020 - [info] ok.
Sat Aug 15 11:38:50 2020 - [info] Checking replication health on 192.168.122.130..
Sat Aug 15 11:38:50 2020 - [info] ok.
Sat Aug 15 11:38:50 2020 - [warning] master_ip_failover_script is not defined.
Sat Aug 15 11:38:50 2020 - [warning] shutdown_script is not defined.
Sat Aug 15 11:38:50 2020 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
(8)启动manager,并查看运行状态
[root@mha-manager ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /etc/mha/manager.log 2>&1 &
[1] 16239
[root@mha-manager ~]# cat /etc/mha/manager.log
Sat Aug 15 11:51:13 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug 15 11:51:13 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sat Aug 15 11:51:13 2020 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sat Aug 15 11:51:13 2020 - [info] MHA::MasterMonitor version 0.58.
Sat Aug 15 11:51:14 2020 - [info] GTID failover mode = 0
Sat Aug 15 11:51:14 2020 - [info] Dead Servers:
Sat Aug 15 11:51:14 2020 - [info] Alive Servers:
Sat Aug 15 11:51:14 2020 - [info] 192.168.122.128(192.168.122.128:3306)
Sat Aug 15 11:51:14 2020 - [info] 192.168.122.129(192.168.122.129:3306)
Sat Aug 15 11:51:14 2020 - [info] 192.168.122.130(192.168.122.130:3306)
Sat Aug 15 11:51:14 2020 - [info] Alive Slaves:
Sat Aug 15 11:51:14 2020 - [info] 192.168.122.129(192.168.122.129:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sat Aug 15 11:51:14 2020 - [info] Replicating from 192.168.122.128(192.168.122.128:3306)
Sat Aug 15 11:51:14 2020 - [info] 192.168.122.130(192.168.122.130:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sat Aug 15 11:51:14 2020 - [info] Replicating from 192.168.122.128(192.168.122.128:3306)
Sat Aug 15 11:51:14 2020 - [info] Current Alive Master: 192.168.122.128(192.168.122.128:3306)
Sat Aug 15 11:51:14 2020 - [info] Checking slave configurations..
Sat Aug 15 11:51:14 2020 - [info] read_only=1 is not set on slave 192.168.122.129(192.168.122.129:3306).
Sat Aug 15 11:51:14 2020 - [info] read_only=1 is not set on slave 192.168.122.130(192.168.122.130:3306).
Sat Aug 15 11:51:14 2020 - [info] Checking replication filtering settings..
Sat Aug 15 11:51:14 2020 - [info] binlog_do_db= , binlog_ignore_db= information_schema,performance_schema,sys
Sat Aug 15 11:51:14 2020 - [info] Replication filtering check ok.
Sat Aug 15 11:51:14 2020 - [info] GTID (with auto-pos) is not supported
Sat Aug 15 11:51:14 2020 - [info] Starting SSH connection tests..
Sat Aug 15 11:51:16 2020 - [info] All SSH connection tests passed successfully.
Sat Aug 15 11:51:16 2020 - [info] Checking MHA Node version..
Sat Aug 15 11:51:17 2020 - [info] Version check ok.
Sat Aug 15 11:51:17 2020 - [info] Checking SSH publickey authentication settings on the current master..
Sat Aug 15 11:51:17 2020 - [info] HealthCheck: SSH to 192.168.122.128 is reachable.
Sat Aug 15 11:51:17 2020 - [info] Master MHA Node version is 0.58.
Sat Aug 15 11:51:17 2020 - [info] Checking recovery script configurations on 192.168.122.128(192.168.122.128:3306)..
Sat Aug 15 11:51:17 2020 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000014
Sat Aug 15 11:51:17 2020 - [info] Connecting to root@192.168.122.128(192.168.122.128:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to mysql-bin.000014
Sat Aug 15 11:51:17 2020 - [info] Binlog setting check done.
Sat Aug 15 11:51:17 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Aug 15 11:51:17 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.122.129 --slave_ip=192.168.122.129 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.28-log --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=mysql-relay-bin.000045 --slave_pass=xxx
Sat Aug 15 11:51:17 2020 - [info] Connecting to root@192.168.122.129(192.168.122.129:22)..
Checking slave recovery environment settings..
Relay log found at /var/lib/mysql, up to mysql-relay-bin.000045
Temporary relay log file is /var/lib/mysql/mysql-relay-bin.000045
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sat Aug 15 11:51:18 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.122.130 --slave_ip=192.168.122.130 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.28-log --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=mysql-relay-bin.000038 --slave_pass=xxx
Sat Aug 15 11:51:18 2020 - [info] Connecting to root@192.168.122.130(192.168.122.130:22)..
Checking slave recovery environment settings..
Relay log found at /var/lib/mysql, up to mysql-relay-bin.000038
Temporary relay log file is /var/lib/mysql/mysql-relay-bin.000038
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sat Aug 15 11:51:18 2020 - [info] Slaves settings check done.
Sat Aug 15 11:51:18 2020 - [info]
192.168.122.128(192.168.122.128:3306) (current master)
+--192.168.122.129(192.168.122.129:3306)
+--192.168.122.130(192.168.122.130:3306)
Sat Aug 15 11:51:18 2020 - [warning] master_ip_failover_script is not defined.
Sat Aug 15 11:51:18 2020 - [warning] shutdown_script is not defined.
Sat Aug 15 11:51:18 2020 - [info] Set master ping interval 2 seconds.
Sat Aug 15 11:51:18 2020 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sat Aug 15 11:51:18 2020 - [info] Starting ping health check on 192.168.122.128(192.168.122.128:3306)..
Sat Aug 15 11:51:18 2020 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
[root@mha-manager ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:16239) is running(0:PING_OK), master:192.168.122.128
5.2.4测试主备切换
(1)切换到主库的环境上
(2)执行systemctl stop mysqld
[root@mha-manager ~]# systemctl stop mysqld
(3)等待5-10秒后查看从库状态,show slave status \G;会发现一个从库成为了master节点
5.2.5 恢复停掉的主库
现在恢复只能先已从库的角色恢复,如需要在切换成主库,可以手动切换。
-
原主库中启动数据库 systemctl start mysqld
-
在 MHA Manager 机器中,查看 MHA 日志,找到以下内容:
[root@mha-manager mha]# grep -i 'change master to' /etc/mha/manager.log
Sat Aug 15 17:17:31 2020 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.122.129', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=433, MASTER_USER='root', MASTER_PASSWORD='xxx';
-
登录原主库,执行日志中的 change master
命令,注意密码部分要调整,然后启动 slave,在查看 slave 状态:这样主库就切换到了192.168.122.129,192.168.122.128切换为从库
mysql> CHANGE MASTER TO MASTER_HOST='192.168.122.129', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=433, MASTER_USER='root', MASTER_PASSWORD='root';
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.122.129
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 736
Relay_Log_File: mysql-master-relay-bin.000002
Relay_Log_Pos: 623
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 736
Relay_Log_Space: 837
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: f3de6efe-dd2c-11ea-8f36-000c296a16b3
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
-
重新修复 MHA Manager 配置文件,因为完成一次切换后,MHA 会自动把失效的主库信息从配置文件移除掉并且自动退出,所以需要把上面修复好的数据库信息补充到 MHA 配置文件中,然后重新启动 MHA Manager。
5.2.6 手动热切换主库
如果要在主从都运行正常情况下切换主从,可使用一下命令:
> masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=192.168.122.129 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000