本文MGR部分参考自GreatSQL
一、安装 MySQL
1、卸载 mariadb 及安装依赖
# yum -y remove mariadb*
# yum -y install wget cmake gcc gcc-c++ ncurses ncurses-devel libaio-devel openssl openssl-devel
2、下载软件包
下载地址:https://downloads.mysql.com/archives/community/ 。
# cd /usr/local
# wget https://dev.mysql.com/get/Downloads/MySQL-8.0/mysql-8.0.35-linux-glibc2.17-x86_64.tar.xz
# tar xf mysql-8.0.35-linux-glibc2.17-x86_64.tar.xz
# cd mysql-8.0.35-linux-glibc2.17-x86_64
# ls
bin docs include lib LICENSE man README share support-files
# ln -s /usr/local/mysql-8.0.35-linux-glibc2.17-x86_64 /usr/local/mysql
glibc 版本选择方法,MySQL 软件包的 glibc 版本要与所在服务器版本一致,否则可能会出现兼容性问题
# ldd --version
ldd (GNU libc) 2.17
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
二、配置MySQL Server
1、创建数据目录
# useradd -s /sbin/nologin mysql
# mkdir -p /data/mysql/{data,logs,tmp}
# mkdir -p /var/lib/mysql/
# chown -R mysql.mysql /usr/local/mysql/
# chown -R mysql.mysql /data/mysql
# chown -R mysql:mysql /var/lib/mysql/
# echo "export PATH=/usr/local/mysql/bin:$PATH" >> /etc/profile
# source /etc/profile
# mysql -V
mysql Ver 8.0.35 for Linux on x86_64 (MySQL Community Server - GPL)
2、准备配置文件
# vim /etc/my.cnf
#
# my.cnf example for MySQL
#
[mysqld]
datadir=/data/mysql/data
socket=/var/lib/mysql/mysql.sock
log-error=/data/mysql/logs/error.log
pid-file=/data/mysql/mysqld.pid
default-storage-engine=INNODB
character_set_server=utf8mb4
lower_case_table_names=1
table_open_cache=128
max_connections=2000
max_connect_errors=6000
innodb_file_per_table=1
innodb_buffer_pool_size=1G
max_allowed_packet=64M
innodb_flush_method=O_DIRECT
innodb_lock_wait_timeout=1800
innodb_flush_log_at_trx_commit=0
sync_binlog=0
group_concat_max_len=1024000
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
skip-name-resolve
[client]
socket=/var/lib/mysql/mysql.sock
3、初始化
# mysqld --initialize --user=mysql --basedir=/usr/local/mysql --datadir=/data/mysql/data
注意:不要在生产环境中使用 --initialize-insecure
选项进行初始化安装,因为这么做的话,超级管理员root账号默认是空密码,任何人都可以使用该账号登录数据库,存在安全风险。
获取初始密码
# localhost : 后面的即为临时密码
cat /data/mysql/logs/error.log | grep localhost
4、systemd 管理 MySQL 服务
# cat >/usr/lib/systemd/system/mysqld.service <<EOF
[Unit]
Description=MySQL Server
Documentation=man:mysqld(8)
Documentation=http://dev.mysql.com/doc/refman/en/using-systemd.html
After=network.target
After=syslog.target
[Install]
WantedBy=multi-user.target
[Service]
User=mysql
Group=mysql
Type=notify
TimeoutSec=0
PermissionsStartOnly=true
ExecStart=/usr/local/mysql/bin/mysqld --defaults-file=/etc/my.cnf
LimitNOFILE = 10000
Restart=on-failure
RestartPreventExitStatus=1
Environment=MYSQLD_PARENT_PID=1
PrivateTmp=false
EOF
# systemctl start mysqld
# systemctl enable mysqld
# systemctl stop mysqld
参数 | 说明 |
---|---|
Description=MySQL Server | 描述服务的内容,这里是 MySQL 服务器 |
Documentation | 文档链接,指向 mysqld 的 man 页面 |
Documentation | 另一个文档链接,指向 MySQL 的 systemd 使用说明 |
After=network.target | 表示服务在网络服务之后启动 |
After=syslog.target | 表示服务在 syslog 服务之后启动 |
WantedBy=multi-user.target | 表示该服务是多用户目标的一部分 |
User=mysql | 服务运行的用户是 mysql |
Group=mysql | 服务运行在 mysql 组中 |
Type=notify | 进程启动后通知 systemd,然后等待进程退出 |
TimeoutSec=0 | 如果 mysqld 启动失败,systemd 将不会等待 mysqld 超时 |
PermissionsStartOnly=true | 只考虑运行目录的权限 |
ExecStart | 启动 mysqld 的命令及其参数 |
LimitNOFILE = 10000 | 限制打开的文件描述符数量为 10000 |
Restart=on-failure | 如果 mysqld 进程退出,则重新启动它 |
RestartPreventExitStatus=1 | 如果 mysqld 进程正常退出,不尝试重新启动 |
Environment=MYSQLD_PARENT_PID=1 | 设置环境变量 MYSQLD_PARENT_PID 为 1 |
PrivateTmp=false | 表示 tmp 目录对其他用户是可见的 |
MYSQLD_PARENT_PID=1 | 表示 MySQL 服务器的父进程的进程ID |
5、修改初始密码
# mysql -uroot -p
mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY '你的密码';
# 刷新权限
mysql> flush privileges;
三、MySQL 主从复制
1、主库配置
1、修改my.cnf
# vim /etc/my.cnf
[mysqld]
datadir=/data/mysql/data
socket=/var/lib/mysql/mysql.sock
log-error=/data/mysql/logs/error.log
pid-file=/data/mysql/mysqld.pid
default-storage-engine=INNODB
character_set_server=utf8mb4
lower_case_table_names=1 # 不区分大小写
performance_schema=off
table_open_cache=128
max_connections=1000
max_connect_errors=6000
max_allowed_packet=64M
innodb_file_per_table=1
innodb_buffer_pool_size=512M
innodb_flush_method=O_DIRECT
innodb_lock_wait_timeout=1800
character-set-client-handshake=FALSE
character-set-server=utf8mb4
collation-server=utf8mb4_general_ci
init_connect='SET default_collation_for_utf8mb4=utf8mb4_general_ci'
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
skip-name-resolve
# 主从复制-主机配置
# 主服务器唯一ID
server-id=1
# 启用二进制日志
log-bin=mysql-bin
# 设置不要复制的数据库(可设置多个)
binlog-ignore-db=sys
binlog-ignore-db=mysql
binlog-ignore-db=information_schema
binlog-ignore-db=performance_schema
# 设置需要复制的数据库(可设置多个)
# binlog-do-db=test1
# 为每个session分配的内存,在事务过程中用来存储二进制日志的缓存
binlog_cache_size=1M
# 主从复制的格式(mixed,statement,row,默认格式是statement。建议是设置为row,主从复制时数据更加能够统一)
binlog_format=row
# 配置二进制日志自动删除/过期时间,单位秒,默认值为2592000,即30天;8.0.3版本之前使用expire_logs_days,单位天数,默认值为0,表示不自动删除。
binlog_expire_logs_seconds=2592000
[client]
socket=/var/lib/mysql/mysql.sock
2、重启 MySQL
# systemctl restart mysqld
# systemctl status mysqld
3、创建主从复制用户
#创建主从复制用户
CREATE USER 'clusteruser'@'%' IDENTIFIED WITH mysql_native_password BY '你的密码';
# 授权
GRANT REPLICATION SLAVE ON *.* TO 'clusteruser'@'%';
# 刷新
FLUSH PRIVILEGES;
2、从库配置
1、配置my.cnf
# vim /etc/my.cnf
[mysqld]
datadir=/data/mysql/data
socket=/var/lib/mysql/mysql.sock
log-error=/data/mysql/logs/error.log
pid-file=/data/mysql/mysqld.pid
default-storage-engine=INNODB
character_set_server=utf8mb4
lower_case_table_names=1 # 不区分大小写
performance_schema=off
table_open_cache=128
max_connections=1000
max_connect_errors=6000
max_allowed_packet=64M
innodb_file_per_table=1
innodb_buffer_pool_size=512M
innodb_flush_method=O_DIRECT
innodb_lock_wait_timeout=1800
character-set-client-handshake=FALSE
character-set-server=utf8mb4
collation-server=utf8mb4_general_ci
init_connect='SET default_collation_for_utf8mb4=utf8mb4_general_ci'
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
skip-name-resolve
# 主从复制-主机配置
# 从服务器唯一ID
server-id=2
# 需要主从复制的数据库 ,如多个则重复配置
# replicate-do-db=test1
# 复制过滤:也就是指定哪个数据库不用同步(mysql库一般不同步) ,如多个则重复配置
binlog-ignore-db=sys
binlog-ignore-db=mysql
binlog-ignore-db=information_schema
binlog-ignore-db=performance_schema
# 为每个session分配的内存,在事务过程中用来存储二进制日志的缓存
binlog_cache_size=1M
# 主从复制的格式(mixed,statement,row,默认格式是statement。建议是设置为row,主从复制时数据更加能够统一)
binlog_format=row
# relay_log配置中继日志,默认采用 主机名-relay-bin 的方式保存日志文件
relay_log=replicas-mysql-relay-bin
# log_replica_updates表示slave是否将复制事件写进自己的二进制日志,默认值ON开启;8.0.26版本之前使用log_slave_updates
log_replica_updates=ON
# 防止改变数据(只读操作,除了特殊的线程)
read_only=ON
[client]
socket=/var/lib/mysql/mysql.sock
2、重启 MySQL
# systemctl restart mysqld
# systemctl status mysqld
3、配置主从连接命令
mysql> CHANGE REPLICATION SOURCE TO SOURCE_HOST='192.168.1.10',SOURCE_PORT=3306,SOURCE_USER='clusteruser',SOURCE_PASSWORD='你的密码',SOURCE_LOG_FILE='mysql-bin.000002',SOURCE_LOG_POS=833;
mysql> start slave; # 启动主从复制
参数 | 说明 |
---|---|
MASTER_HOST/SOURCE_HOST | 主数据库的主机ip |
MASTER_PORT/SOURCE_PORT | 主数据库的端口,不设置则默认是3306 |
MASTER_USER/SOURCE_USER | 主数据库被授予同步复制权限的用户名 |
MASTER_PASSWORD/SOURCE_PASSWORD | 对应的用户密码 |
MASTER_LOG_FILE/SOURCE_LOG_FILE | 在主数据库执行命令show master status 查询到的二进制日志文件名称 |
MASTER_LOG_POS/SOURCE_LOG_POS | 在主数据库执行命令show master status 查询到的位置 Position的值 |
4、开启从节点
# 查看从节点状态
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 192.168.1.10
Master_User: clusteruser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 833
Relay_Log_File: replicas-mysql-relay-bin.000002
Relay_Log_Pos: 326
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 833
Relay_Log_Space: 545
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: b302ce7f-b1b5-11ee-86c5-000c29479e2c
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
mysql> stop slave; # 停止主从复制
mysql> reset slave; # 重置主从复制
四、MySQL 高可用
1、现有的高可用方案
1、MySQL InnoDB ReplicaSet
MySQL InnoDB ReplicaSet,即InnoDB副本集,是一组由AdminAPI管理的一组使用基于GTID的异步复制的MySQL实例组成的集群集合。它由单主(服务器实例)和多辅助(服务器实例)组成(在MySQL复制中通常叫做源和副本)。
这个副本集通过AdminAPI进行管理,并且与MySQL Shell和MySQL Router紧密集成,可以使用它们进行自动配置。
与InnoDB Cluster相比,InnoDB ReplicaSet不提供高可用性,但它适用于在不需要高可用的使用场景中横向扩展读取能力,并提供手动故障转移功能。
此外,InnoDB ReplicaSet底层基于MySQL异步复制技术,适用于MySQL服务器实例部署在广域网(WAN)上,彼此通过异步复制通道连接,辅助实例可容忍更大程度的复制延迟,且不需要在事务上达成共识的场景。
2、MGR
MySQL Group Replication是MySQL官方开发的一个开源插件,是一个高可用、高扩展的MySQL集群服务。其特点是基于原生复制和Paxos协议的组复制技术,以插件方式提供一致数据安全保证。
相比异步复制而言,半同步复制提高了MySQL集群的可靠性。Group Replication是MySQL复制今后发展的方向,相比前两者,可靠性更好,在易用性上也有巨大提高。
Group Replication内部集成了组管理服务,实现了很多组内成员的自动化管理功能,使得Group Replication的使用和管理变得非常简单。
此外,Group Replication具有高一致性、高容错性、高扩展性和高灵活性等特点。其中,高一致性是指基于原生复制及Paxos协议的组复制技术;高容错性是指大多数服务正常就可继续工作,自动不同节点检测资源征用冲突,按顺序优先处理,内置自动防脑裂机制;高扩展性是指自动添加移除节点,并更新组信息;高灵活性是指单主模式和多主模式,单主模式自动选主,所有更新操作在主进行;多主模式则所有server同时更新。
3、MySQL NDB Cluster
MySQL NDB Cluster是一种内存存储引擎,主要用于MySQL Cluster分布式集群环境。Cluster是MySQL从5.0版本才开始提供的新功能。
NDB(也叫NDB Cluster或者NDBCLUSTER)存储引擎可以与一系列故障切换和负载平衡选项进行配置,但其最简单的形式是在集群级别的存储引擎。NDB Cluster的NDB存储引擎包含一整套数据,仅依赖于集群内部的其他数据。
在NDB群集中,群集的每个部分都被视为一个节点。节点有三种类型的集群节点,在最小的NDB集群配置中,将至少有三个节点,每种节点中的一种:管理节点、数据节点、一个或多个管理服务器,以及可能的其他专用数据访问程序。
NDB Cluster是一种能够在无共享系统中对内存数据库进行集群的技术。无共享架构使系统能够使用非常便宜的硬件,并且对硬件或软件的特定要求最低。
4、PXC
Percona XtraDB Cluster是一个基于Galera 2.x库的事务型应用通用的多主同步复制插件,主要用于解决强一致性问题,使各个节点之间的数据保持实时同步以及实现多节点同时读写。
Percona XtraDB Cluster完全兼容MySQL和Percona Server,提供同步复制、多主复制、在从服务器上并行应用事件等特性。它可以在MySQL或Percona Server中创建数据库,应用程序不需要更改或改动很小。
此外,Percona XtraDB Cluster集群由节点组成,推荐配置至少3个节点,但也可以运行在2个节点上。每个节点都是普通的mysql/percona服务器,可以将现有的数据库服务器组成集群,反之,也可以将集群拆分成单独的服务器。每个节点都包含完整的数据副本。
5、MHA
MHA(Master High Availability)是一套比较成熟的 MySQL 高可用方案,也是一个优秀的故障切换和主从提升的高可用软件。MHA 由两部分组成:MHA Manager(管理节点)和 MHA Node(数据节点)。
MHA Manager 可以单独部署在一台独立的机器上管理多个 master-slave 集群,也可以部署在一台 slave 节点上。负责检测 master 是否宕机、控制故障转移、检查 MySQL 复制状况等。
MHA 的出现就是为了解决 MySQL 单点的问题,在 MySQL 故障切换过程中,MHA 能做到在30秒之内自动完成数据库的故障切换操作,并且在进行故障切换的过程中,MHA 能在最大程度上保证数据的一致性,以达到真正意义上的高可用。
此外,MHA 还支持在线快速将 Master 切换到其他主机,通常只需0.5-2秒。
2、优缺点
MySQL Group Replication:
- 优点:
- 基于原生复制和Paxos协议的组复制技术,确保数据一致性。
- 高容错性,自动检测机制可以在节点出现宕机时剔除问题节点,其他节点可以正常使用。
- 高扩展性,可以随时在线新增和移除节点,自动同步所有节点上状态。
- 高灵活性,支持单主模式和多主模式,单主模式下只有主库可以读写,其他从库会加上super_read_only状态,只能读取不可写入。
- 缺点:
- 技术较新,不太稳定,暂时性能略差于PXC。
- 对网络稳定性要求很高,至少是同机房做。
MySQL InnoDB ReplicaSet:
- 优点:
- 提供更好的写性能。
- 使用MySQL Shell提供更加简单、方便、快捷的管理方式。
- 缺点:
- 所有管理操作必须要通过MySQL Shell来完成,不能直连到主实例上配置系统变量。
MySQL NDB Cluster:
- 优点:
- 在任意节点进行写操作。
- 在从服务器上并行应用事件,真正意义上的并行复制。
- 良好的读负载扩展,任意节点都可以查询。
- 缺点:
- 加入新节点,开销大。需要复制完整的数据。
- 不能有效的解决写缩放问题,所有的写操作都将发生在所有节点上。
- 有多少个节点就有多少重复的数据。
Percona XtraDB Cluster:
- 优点:
- 提供同步复制和多主复制功能。
- 在本地节点上执行查询,无需远程访问。
- 无需集中管理,可以在任何时间点失去任何节点,但集群仍可正常工作。
- 缺点:
- 加入新节点时开销大,需要复制完整的数据。
Master High Availability (MHA):
- 优点:
- 当主DB不可用时,从多个从服务器中选举出来新的主DB。提供主从切换和故障转移功能,在线故障转移时不易丢失数据。同一个监控节点可以监控多个集群。
- 可以支持基于GTID的复制模式。
- 缺点:
- 需要编写脚本或利用第三方工具来实现VIP的配置。MHA启动后只监控主服务器是否可用,没办法监控从服务器。需要基于SSH免认证登陆配置,存在一定的安全隐患。没有提供从服务器的读负载均衡功能。
- 大多数云服务无法提供 VIP
五、利用MySQL Shell构建MGR集群
1、技术原理
1、什么是MGR
MGR是以Plugin方式嵌入MySQL,部署更灵活方便。
事务从Server层通过钩子(hook)进入MGR API接口层,再分发到各组件层,在组件层完成事务Capture/Apply/Recover,通过复制协议层(Replication Protocol Logics)传输事务,最后经由GCS协调事务在各节点的最终一致性。
MGR节点间由组通信系统(GCS)提供支持,它提供了故障检测机制、组成员角色管理,以及安全且有序的消息传递,这些机制可确保在各节点间一致地复制数据。这项技术的核心是Paxos算法的实现,在MySQL里称之为XCom,由它充当MGR的通信引擎。
对于要提交的事务,组中的多数派节点必须就全局事务序列中给定的事务顺序达成一致。各节点做出决定提交或中止事务的选择,但所有节点都要做出相同的决定。如果发生网络分区,导致节点间无法达成一致决定,则在网络恢复前,MGR无法工作。
MGR支持单主和多主两种模式,在单主模式下,各节点会自动选定主节点,只有该主节点能同时读写,而其他(从)节点只能只读。在多主模式下,所有节点都可以进行读写。
MGR是基于分布式的Paxos算法实现,因此要求有多数派节点存活以保证投票。这就决定了在不影响系统整体可用性前提下,可容忍发生故障的节点数量。假设总节点数是n,可容忍发生故障的节点数是f,则它们的关系是:n = 2*f + 1。简言之,容忍发生故障的节点数,不高于总节点数的一半。
下表展示了不同节点数的对应关系:
总节点数 | 多数派节点数 | 最大容忍故障节点数 |
---|---|---|
1 | 1 | 0 |
2 | 2 | 0 |
3 | 2 | 1 |
4 | 3 | 1 |
5 | 3 | 2 |
6 | 4 | 2 |
7 | 4 | 3 |
8 | 5 | 3 |
9 | 5 | 4 |
2、MGR使用建议
在使用MGR时,有以下几个建议:
- 不同版本不要混用,尤其是不同大版本不要混用,要尽快完成升级。
- 对同一个表的DDL和DML都只在同一个节点,否则可能会造成节点意外退出MGR。
- 不要跑大事务,每个事务尽量控制在10MB以内。
3、启用MGR的一些先决条件
想要启用MGR,需要先满足几个先决条件:
- 每个节点都必须启用binlog,而且使用row格式。
- 每个节点都要启用binlog转储,即
log_slave_updates=1
或log_replica_updates=1
。 - 每个节点的
server_id
及server_uuid
不能相同。 - 在8.0.20之前,要求
binlog_checksum=NONE
,但是从8.0.20后,可以设置binlog_checksum=CRC32
。 - 要求启用 GTID,即设置
gtid_mode=ON
。 - 要求
master_info_repository=TABLE
及relay_log_info_repository=TABLE
,不过从MySQL 8.0.23开始,这两个选项已经默认设置TABLE,因此无需再单独设置。 - 所有节点上的表名大小写参数
lower_case_table_names
设置要求一致。 - 只支持InnoDB表,其余表虽然也能创建,但无法同步数据(Primary节点上写入数据时会报错)。
- 必须要有主键,如果没有主键(或者可以被选中作为聚集索引的辅助索引),则写入数据时会报错。
- 建议启用writeset模式,即设置以下几个参数
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = N
,N>0,可以设置为逻辑CPU数的2倍binlog_transaction_dependency_tracking = WRITESET
slave_preserve_commit_order = 1
slave_checkpoint_period = 2
4、MGR相关约束
下面是关于MGR使用的一些限制:
- 所有表必须是InnoDB引擎。可以创建非InnoDB引擎表,但无法写入数据,在利用Clone构建新节点时也会报错(在GreatSQL中,可以设置选项
enforce_storage_engine = InnoDB
只允许使用InnoDB引擎,而禁用其他引擎)。 - 所有表都必须要有主键。同上,能创建没有主键的表,但无法写入数据,在利用Clone构建新节点时也会报错。
- 尽量不要使用大事务,默认地,事务超过150MB会报错,最大可支持2GB的事务(在GreatSQL未来的版本中,会增加对大事务的支持,提高大事务上限,但依然不建议运行大事务)。
- 如果是从旧版本进行升级,则不能选择 MINIMAL 模式升级,建议选择 AUTO 模式,即
upgrade=AUTO
。 - 由于MGR的事务认证线程不支持
gap lock
,因此建议把所有节点的事务隔离级别都改成READ COMMITTED
。基于相同的原因,MGR集群中也不要使用table lock
及name lock
(即GET_LOCK()
函数 )。 - 在多主(
multi-primary
)模式下不支持串行(SERIALIZABLE
)隔离级别。 - 不支持在不同的MGR节点上,对同一个表分别执行DML和DDL,可能会造成数据丢失或节点报错退出。
- 在多主(
multi-primary
)模式下不支持多层级联外键表。另外,为了避免因为使用外键造成MGR报错,建议设置group_replication_enforce_update_everywhere_checks=ON
。 - 在多主(
multi-primary
)模式下,如果多个节点都执行SELECT ... FOR UPDATE
后提交事务会造成死锁。 - 不支持复制过滤(Replication Filters)设置。
看起来限制有点多,但绝大多数时候并不影响正常的业务使用。
此外,想要启用MGR还有几个要求:
- 每个节点都要启用binlog。
- 每个节点都要转存binlog,即设置
log_slave_updates=1
。 - binlog format务必是row模式,即
binlog_format=ROW
。 - 每个节点的
server_id
及server_uuid
不能相同。 - 在8.0.20之前,要求
binlog_checksum=NONE
,但是从8.0.20后,可以设置binlog_checksum=CRC32
。 - 要求启用 GTID,即设置
gtid_mode=ON
。 - 要求
master_info_repository=TABLE
及relay_log_info_repository=TABLE
,不过从MySQL 8.0.23开始,这两个选项已经默认设置TABLE,因此无需再单独设置。 - 所有节点上的表名大小写参数
lower_case_table_names
设置要求一致。 - 最好在局域网内部署MGR,而不要跨公网,网络延迟太大的话,会导致MGR性能很差或很容易出错。
- 建议启用writeset模式,即设置以下几个参数
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = N
,N>0,可以设置为逻辑CPU数的2倍binlog_transaction_dependency_tracking = WRITESET
slave_preserve_commit_order = 1
slave_checkpoint_period = 2
2、初始化节点
1、安装准备
准备好下面三台服务器:
IP | 端口 | 角色 |
---|---|---|
192.168.1.10 | 3306 | mgr1 |
192.168.1.20 | 3306 | mgr2 |
192.168.1.30 | 3306 | mgr3 |
确保三个节点间的网络是可以互通的,并且没有针对3306和33061端口的防火墙拦截规则。
2、初始化MGR第一个节点
查看本机UUID
# cat /proc/sys/kernel/random/uuid
60783090-afee-4636-8f0c-fd1c534c351a
接下来准备初始化MGR的第一个节点,也称之为 引导节点。
修改 /etc/my.cnf
,增加以下几行和MGR相关的配置参数:
[mysqld]
datadir=/data/mysql/data
socket=/var/lib/mysql/mysql.sock
log-error=/data/mysql/logs/error.log
pid-file=/data/mysql/mysqld.pid
server-id=1
default-storage-engine=INNODB
character_set_server=utf8mb4
lower_case_table_names=1 # 不区分大小写
table_open_cache=128
max_connections=1000
max_connect_errors=6000
max_allowed_packet=64M
innodb_file_per_table=1
innodb_buffer_pool_size=512M
innodb_flush_method=O_DIRECT
innodb_lock_wait_timeout=1800
character-set-client-handshake=FALSE
character-set-server=utf8mb4
collation-server=utf8mb4_general_ci
init_connect='SET default_collation_for_utf8mb4=utf8mb4_general_ci'
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
skip-name-resolve
# MGR
# 加载MGR插件
loose-plugin_load_add = 'mysql_clone.so'
loose-plugin_load_add = 'group_replication.so'
loose-group_replication_group_name = "60783090-afee-4636-8f0c-fd1c534c351a"
loose-group_replication_local_address = "192.168.1.10:33061"
loose-group_replication_group_seeds = '192.168.1.10:33061,192.168.1.20:33061,192.168.1.30:33061'
loose-group_replication_start_on_boot = ON
loose-group_replication_bootstrap_group = OFF
loose-group_replication_exit_state_action = READ_ONLY
loose-group_replication_flow_control_mode = "DISABLED"
loose-group_replication_single_primary_mode = ON
loose-group_replication_communication_max_message_size = 10M
loose-group_replication_transaction_size_limit = 3G
loose-group_replication_arbitrator = 0
loose-group_replication_single_primary_fast_mode = 1
loose-group_replication_request_time_threshold = 20000
report_host = "192.168.1.10"
[client]
socket=/var/lib/mysql/mysql.sock
重启 MySQL
MGR参数及设置建议 | 说明 |
---|---|
loose-group_replication_single_primary_mode=ON | 启用单主模式。在单主模式下,只有一个节点可以处理写操作 |
loose-group_replication_bootstrap_group=OFF | 禁用群组引导模式。在引导模式下,新加入的节点会从其他节点复制数据并成为活动节点 |
loose-group_replication_transaction_size_limit = 10M | 事务大小限制为10MB。超过此大小的事务将被拒绝 |
loose-group_replication_communication_max_message_size = 10M | 群组通信中消息的最大大小为10MB |
loose-group_replication_exit_state_action = READ_ONLY | 当节点退出群组时,将其设置为只读模式 |
loose-group_replication_member_expel_timeout = 5 | 如果一个节点在5秒内没有响应,则将其从群组中驱逐 |
loose-group_replication_flow_control_mode = "DISABLED" | 禁用流控制。流控制用于控制节点的数据传输速率 |
loose-group_replication_majority_after_mode = ON | 在大多数模式之后,需要大多数节点确认事务提交 |
loose-group_replication_arbitrator = 0 | 禁用仲裁者。仲裁者用于解决群组中的冲突 |
loose-group_replication_single_primary_fast_mode = 1 | 在快速模式下,主节点选择是基于写操作的顺序,而不是基于成员的权重或状态 |
loose-group_replication_request_time_threshold = 100 | 如果一个请求在100毫秒内没有得到响应,则认为该请求超时 |
log_error_verbosity = 3 | 日志错误的详细程度设置为3 |
3、初始化MGR第二个节点
# 略
# MGR
# 加载MGR插件
loose-plugin_load_add = 'mysql_clone.so'
loose-plugin_load_add = 'group_replication.so'
loose-group_replication_group_name = "60783090-afee-4636-8f0c-fd1c534c351a"
loose-group_replication_local_address = "192.168.1.20:33061"
loose-group_replication_group_seeds = '192.168.1.10:33061,192.168.1.20:33061,192.168.1.30:33061'
loose-group_replication_start_on_boot = ON
loose-group_replication_bootstrap_group = OFF
loose-group_replication_exit_state_action = READ_ONLY
loose-group_replication_flow_control_mode = "DISABLED"
loose-group_replication_single_primary_mode = ON
loose-group_replication_communication_max_message_size = 10M
loose-group_replication_transaction_size_limit = 3G
loose-group_replication_arbitrator = 0
loose-group_replication_single_primary_fast_mode = 1
loose-group_replication_request_time_threshold = 20000
report_host = "192.168.1.20"
[client]
socket=/var/lib/mysql/mysql.sock
重启 MySQL
4、初始化MGR第三个节点
# 略
# MGR
# 加载MGR插件
loose-plugin_load_add = 'mysql_clone.so'
loose-plugin_load_add = 'group_replication.so'
loose-group_replication_group_name = "60783090-afee-4636-8f0c-fd1c534c351a"
loose-group_replication_local_address = "192.168.1.30:33061"
loose-group_replication_group_seeds = '192.168.1.10:33061,192.168.1.20:33061,192.168.1.30:33061'
loose-group_replication_start_on_boot = ON
loose-group_replication_bootstrap_group = OFF
loose-group_replication_exit_state_action = READ_ONLY
loose-group_replication_flow_control_mode = "DISABLED"
loose-group_replication_single_primary_mode = ON
loose-group_replication_communication_max_message_size = 10M
loose-group_replication_transaction_size_limit = 3G
loose-group_replication_arbitrator = 0
loose-group_replication_single_primary_fast_mode = 1
loose-group_replication_request_time_threshold = 20000
report_host = "192.168.1.30"
[client]
socket=/var/lib/mysql/mysql.sock
重启 MySQL
3、安装MySQL Shell
利用yum安装MySQL Shell,版本选择和mysql相同的8.0.35
yum -y install https://dev.mysql.com/get/mysql80-community-release-el7-11.noarch.rpm
yum -y install mysql-shell-8.0.35
4、加入第一个节点
首先,用管理员账号 root 在本地通过socket方式登入到第一个节点,会提示是否保存密码,可以选择保存
# mysqlsh -S /var/lib/mysql/mysql.sock root@localhost
Please provide the password for 'root@localhost': ******
Save password for 'root@localhost'? [Y]es/[N]o/Ne[v]er (default No): y
MySQL Shell 8.0.35
Copyright (c) 2016, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.
Type '\help' or '\?' for help; '\quit' to exit.
Creating a Classic session to 'root@localhost'
Fetching schema names for auto-completion... Press ^C to stop.
Your MySQL connection id is 10
Server version: 8.0.35 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
执行命令 \status
查看当前节点的状态,确认连接正常可用。
MySQL localhost JS > \status
MySQL Shell version 8.0.35
Connection Id: 11
Current schema:
Current user: root@localhost
SSL: Not in use.
Using delimiter: ;
Server version: 8.0.35 MySQL Community Server - GPL
Protocol version: Classic 10
Client library: 8.0.35
Connection: Localhost via UNIX socket
Unix socket: /var/lib/mysql/mysql.sock
Server characterset: utf8mb4
Schema characterset: utf8mb4
Client characterset: utf8mb4
Conn. characterset: utf8mb4
Result characterset: utf8mb4
Compression: Disabled
Uptime: 17 min 33.0000 sec
Threads: 4 Questions: 16 Slow queries: 0 Opens: 2044 Flush tables: 3 Open tables: 25 Queries per second avg: 0.015
执行 dba.configureInstance()
命令开始检查当前实例是否满足安装MGR集群的条件,如果不满足可以直接配置成为MGR集群的一个节点
MySQL localhost JS > dba.configureInstance()
Configuring local MySQL instance listening at port 3306 for use in an InnoDB cluster...
This instance reports its own address as 192.168.1.10:3306
# 提示当前的用户是管理员,不能直接用于MGR集群,需要新建一个账号
ERROR: User 'root' can only connect from 'localhost'. New account(s) with proper source address specification to allow remote connection from all instances must be created to manage the cluster.
# 使用相同的授权和密码为“root”创建远程可用帐户
1) Create remotely usable account for 'root' with same grants and password
# 用最少的授权为InnoDB集群创建一个新的管理帐户
2) Create a new admin account for InnoDB cluster with minimal required grants
# 忽略并继续
3) Ignore and continue
# 取消
4) Cancel
# 这里我们选择方案2,即创建一个最小权限账号
Please select an option [1]: 2
Please provide an account name (e.g: icroot@%) to have it created with the necessary
privileges or leave empty and press Enter to cancel.
Account Name: clusteruser # 设置账户和密码
Password for new account: ******
Confirm password: ******
applierWorkerThreads will be set to the default value of 4.
# 开启必要配置
NOTE: Some configuration options need to be fixed:
+----------------------------------------+---------------+----------------+--------------------------------------------------+
| Variable | Current Value | Required Value | Note |
+----------------------------------------+---------------+----------------+--------------------------------------------------+
| binlog_transaction_dependency_tracking | COMMIT_ORDER | WRITESET | Update the server variable |
| enforce_gtid_consistency | OFF | ON | Update read-only variable and restart the server |
| gtid_mode | OFF | ON | Update read-only variable and restart the server |
+----------------------------------------+---------------+----------------+--------------------------------------------------+
Some variables need to be changed, but cannot be done dynamically on the server.
Do you want to perform the required configuration changes? [y/n]: y
Do you want to restart the instance after configuring it? [y/n]: y
Creating user clusteruser@%.
Account clusteruser@% was successfully created.
Configuring instance...
WARNING: '@@binlog_transaction_dependency_tracking' is deprecated and will be removed in a future release. (Code 1287).
The instance '192.168.1.10:3306' was configured to be used in an InnoDB cluster.
Restarting MySQL...
NOTE: MySQL server at 192.168.1.10:3306 was restarted.
完成检查并创建完新用户后,退出当前的管理员账户,并用新创建的MGR专用账户登入,准备初始化创建一个新集群
# mysqlsh --uri clusteruser@192.168.1.10:3306
Please provide the password for 'clusteruser@192.168.1.10:3306': ******
Save password for 'clusteruser@192.168.1.10:3306'? [Y]es/[N]o/Ne[v]er (default No): yes
MySQL Shell 8.0.35
Copyright (c) 2016, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.
Type '\help' or '\?' for help; '\quit' to exit.
Creating a Classic session to 'clusteruser@192.168.1.10:3306'
Fetching schema names for auto-completion... Press ^C to stop.
Your MySQL connection id is 21
Server version: 8.0.35 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
创建一个名为MGR1
的MGR集群,并将返回的对象(通常包含集群的元数据和配置信息)存储在变量c
中,MGR1
可自定义
MySQL 192.168.1.10:3306 ssl JS > var c = dba.createCluster('MGR1');
A new InnoDB Cluster will be created on instance '192.168.1.10:3306'.
Validating instance configuration at 192.168.1.10:3306...
This instance reports its own address as 192.168.1.10:3306
Instance configuration is suitable.
NOTE: Group Replication will communicate with other members using '192.168.1.10:3306'. Use the localAddress option to override.
* Checking connectivity and SSL configuration...
Creating InnoDB Cluster 'MGR1' on '192.168.1.10:3306'...
Adding Seed Instance...
Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
At least 3 instances are needed for the cluster to be able to withstand up to
one server failure.
这就完成了MGR集群的初始化并加入第一个节点(引导节点)
5、加入其它节点
接下来,用同样方法先用 root 账号分别登入到另外两个节点,完成节点的检查并创建最小权限级别用户(过程略)
注意:各节点上创建的用户名、密码都要一致
之后回到第一个节点,执行 addInstance()
添加另外两个节点。
MySQL 192.168.1.10:3306 ssl JS > c.addInstance('clusteruser@192.168.1.20:3306');
NOTE: The target instance '192.168.1.20:3306' has not been pre-provisioned (GTID set is empty). The Shell is unable to decide whether incremental state recovery can correctly provision it.
The safest and most convenient way to provision a new instance is through automatic clone provisioning, which will completely overwrite the state of '192.168.1.20:3306' with a physical snapshot from an existing cluster member. To use this method by default, set the 'recoveryMethod' option to 'clone'.
The incremental state recovery may be safely used if you are sure all updates ever executed in the cluster were done with GTIDs enabled, there are no purged transactions and the new instance contains the same GTID set as the cluster or a subset of it. To use this method by default, set the 'recoveryMethod' option to 'incremental'.
# 选择用Clone方式从第一个节点全量复制数据
Please select a recovery method [C]lone/[I]ncremental recovery/[A]bort (default Clone): Clone
Validating instance configuration at 192.168.1.20:3306...
This instance reports its own address as 192.168.1.20:3306
Instance configuration is suitable.
NOTE: Group Replication will communicate with other members using '192.168.1.20:3306'. Use the localAddress option to override.
* Checking connectivity and SSL configuration...
A new instance will be added to the InnoDB Cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.
Adding instance to the cluster...
Monitoring recovery process of the new cluster member. Press ^C to stop monitoring and let it continue in background.
Clone based state recovery is now in progress.
NOTE: A server restart is expected to happen as part of the clone process. If the
server does not support the RESTART command or does not come back after a
while, you may need to manually start it back.
* Waiting for clone to finish...
NOTE: 192.168.1.20:3306 is being cloned from 192.168.1.10:3306
** Stage DROP DATA: Completed
** Clone Transfer
FILE COPY ############################################################ 100% Completed
PAGE COPY ############################################################ 100% Completed
REDO COPY ############################################################ 100% Completed
# 数据Clone完成,准备重启实例。如果该实例无法完成自动重启,则需要手动启动
NOTE: 192.168.1.20:3306 is shutting down...
* Waiting for server restart... ready
* 192.168.1.20:3306 has restarted, waiting for clone to finish...
** Stage RESTART: Completed
* Clone process has finished: 74.91 MB transferred in about 1 second (~74.91 MB/s)
State recovery already finished for '192.168.1.20:3306'
The instance '192.168.1.20:3306' was successfully added to the cluster.
用同样的方法,将 192.168.1.30:3306 实例也加入到集群中。
6、确认集群状态
MySQL 192.168.1.10:3306 ssl JS > c.describe()
{
"clusterName": "MGR1",
"defaultReplicaSet": {
"name": "default",
"topology": [
{
"address": "192.168.1.10:3306",
"label": "192.168.1.10:3306",
"role": "HA"
},
{
"address": "192.168.1.20:3306",
"label": "192.168.1.20:3306",
"role": "HA"
},
{
"address": "192.168.1.30:3306",
"label": "192.168.1.30:3306",
"role": "HA"
}
],
"topologyMode": "Single-Primary"
}
}
看到上面这个集群共有3个节点处于ONLINE状态,其中 192.168.1.10 是 PRIMARY 节点,其余两个都是 SECONDARY 节点,也就是说当前这个集群采用 单主 模式。如果采用多主模式,则所有节点的角色都是 PRIMARY。
六、MGR 集群维护
1、切换主节点
1、获取cluster对象
MySQL 192.168.1.10:3306 ssl JS > var c=dba.getCluster()
2、查看当前各节点列表
MySQL 192.168.1.10:3306 ssl JS > c.status()
{
"clusterName": "MGR1",
"defaultReplicaSet": {
"name": "default",
"primary": "192.168.1.10:3306",
"ssl": "REQUIRED",
"status": "OK",
"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
"topology": {
"192.168.1.10:3306": {
"address": "192.168.1.10:3306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"192.168.1.20:3306": {
"address": "192.168.1.20:3306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"192.168.1.30:3306": {
"address": "192.168.1.30:3306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
}
},
"topologyMode": "Single-Primary"
},
"groupInformationSourceMember": "192.168.1.10:3306"
}
3、执行切换
MySQL 192.168.1.10:3306 ssl JS > c.setPrimaryInstance('192.168.1.20:3306')
Setting instance '192.168.1.20:3306' as the primary instance of cluster 'MGR1'...
Instance '192.168.1.30:3306' remains SECONDARY.
Instance '192.168.1.10:3306' was switched from PRIMARY to SECONDARY.
Instance '192.168.1.20:3306' was switched from SECONDARY to PRIMARY.
The instance '192.168.1.20:3306' was successfully elected as primary.
MySQL 192.168.1.10:3306 ssl JS > c.status()
{
"clusterName": "MGR1",
"defaultReplicaSet": {
"name": "default",
"primary": "192.168.1.20:3306",
"ssl": "REQUIRED",
"status": "OK",
"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
"topology": {
"192.168.1.10:3306": {
"address": "192.168.1.10:3306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"192.168.1.20:3306": {
"address": "192.168.1.20:3306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"192.168.1.30:3306": {
"address": "192.168.1.30:3306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
}
},
"topologyMode": "Single-Primary"
},
"groupInformationSourceMember": "192.168.1.20:3306"
}
2、切换单主/多主模式
1、切换到多主模式
MySQL 192.168.1.10:3306 ssl JS > var c=dba.getCluster()
MySQL 192.168.1.10:3306 ssl JS > c.switchToMultiPrimaryMode()
Switching cluster 'MGR1' to Multi-Primary mode...
Instance '192.168.1.20:3306' remains PRIMARY.
Instance '192.168.1.10:3306' was switched from SECONDARY to PRIMARY.
Instance '192.168.1.30:3306' was switched from SECONDARY to PRIMARY.
The cluster successfully switched to Multi-Primary mode.
MySQL 192.168.1.10:3306 ssl JS > c.status()
{
"clusterName": "MGR1",
"defaultReplicaSet": {
"name": "default",
"ssl": "REQUIRED",
"status": "OK",
"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
"topology": {
"192.168.1.10:3306": {
"address": "192.168.1.10:3306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"192.168.1.20:3306": {
"address": "192.168.1.20:3306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"192.168.1.30:3306": {
"address": "192.168.1.30:3306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
}
},
"topologyMode": "Multi-Primary"
},
"groupInformationSourceMember": "192.168.1.10:3306"
}
2、切换到单主模式
MySQL 192.168.1.10:3306 ssl JS > c.switchToSinglePrimaryMode("192.168.1.10:3306")
Switching cluster 'MGR1' to Single-Primary mode...
Instance '192.168.1.10:3306' remains PRIMARY.
Instance '192.168.1.30:3306' was switched from PRIMARY to SECONDARY.
Instance '192.168.1.20:3306' was switched from PRIMARY to SECONDARY.
WARNING: Existing connections that expected a R/W connection must be disconnected, i.e. instances that became SECONDARY.
The cluster successfully switched to Single-Primary mode.
MySQL 192.168.1.10:3306 ssl JS > c.status()
{
"clusterName": "MGR1",
"defaultReplicaSet": {
"name": "default",
"primary": "192.168.1.10:3306",
"ssl": "REQUIRED",
"status": "OK",
"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
"topology": {
"192.168.1.10:3306": {
"address": "192.168.1.10:3306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"192.168.1.20:3306": {
"address": "192.168.1.20:3306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
},
"192.168.1.30:3306": {
"address": "192.168.1.30:3306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.35"
}
},
"topologyMode": "Single-Primary"
},
"groupInformationSourceMember": "192.168.1.10:3306"
}
注意:在已经是单主模式时,无论是 group_replication_switch_to_single_primary_mode()
还是 switchToSinglePrimaryMode()
函数中指定另一个节点时是不会发生切换的,但也不会报错,只有提示。
3、添加新节点
先执行MySQL Server初始化,并执行 dba.dba.configureInstance()
创建MGR专用账号后。而后,连接到Primary节点,直接调用 addInstance()
函数即可:
# mysqlsh --uri clusteruser@192.168.1.10:3306
MySQL 192.168.1.10:3306 ssl JS > var c=dba.getCluster()
MySQL 192.168.1.10:3306 ssl JS > c.addInstance('clusteruser@192.168.1.40:3306');
WARNING: A GTID set check of the MySQL instance at '192.168.1.40:3306' determined that it contains transactions that do not originate from the cluster, which must be discarded before it can join the cluster.
...
NOTE: 192.168.1.40:3306 is being cloned from 192.168.1.10:3306 #<--自动选择一个donor节点
** Stage DROP DATA: Completed
** Clone Transfer
FILE COPY ############################################################ 100% Completed
PAGE COPY ############################################################ 100% Completed
REDO COPY ############################################################ 100% Completed
NOTE: 192.168.1.40:3306 is shutting down...
* Waiting for server restart... ready
* 192.168.1.40:3306 has restarted, waiting for clone to finish...
** Stage RESTART: Completed
* Clone process has finished: 72.43 MB transferred in about 1 second (~72.43 MB/s)
Incremental state recovery is now in progress.
* Waiting for distributed recovery to finish...
NOTE: '192.168.1.40:3306' is being recovered from '192.168.1.30:3306'
* Distributed recovery has finished
The instance '192.168.1.40:3306' was successfully added to the cluster.
查看节点状态
MySQL 192.168.1.10:3306 ssl JS > c.describe()
{
"clusterName": "MGR1",
"defaultReplicaSet": {
"name": "default",
"topology": [
{
"address": "192.168.1.10:3306",
"label": "192.168.1.10:3306",
"role": "HA"
},
{
"address": "192.168.1.20:3306",
"label": "192.168.1.20:3306",
"role": "HA"
},
{
"address": "192.168.1.30:3306",
"label": "192.168.1.30:3306",
"role": "HA"
},
{
"address": "192.168.1.40:3306", <--新加入的节点
"label": "192.168.1.40:3306",
"role": "HA"
}
],
"topologyMode": "Single-Primary"
}
}
确认新节点添加成功。
4、删除节点
MySQL 192.168.1.10:3306 ssl JS > c.removeInstance('192.168.1.40:3306');
The instance will be removed from the InnoDB cluster. Depending on the instance
being the Seed or not, the Metadata session might become invalid. If so, please
start a new session to the Metadata Storage R/W instance.
Instance '192.168.1.40:3306' is attempting to leave the cluster...
The instance '192.168.1.40:3306' was successfully removed from the cluster.
这就将该节点踢出集群了,并且会重置 group_replication_group_seeds
和 group_replication_local_address
两个选项值。之后该节点如果想再加入集群,需要调用 addInstance()
重新加回。
5、异常退出的节点重新加回
在MySQL Shell里,可以调用 rejoinInstance()
函数将异常的节点重新加回集群:
MySQL 192.168.1.10:3306 ssl JS > c.rejoinInstance('192.168.1.40:3306');
Rejoining instance '192.168.1.40:3306' to cluster 'MGR1'...
The instance '192.168.1.40:3306' was successfully rejoined to the cluster.
6、重启MGR集群
如果是用MySQL Shell重启MGR集群,调用 rebootClusterFromCompleteOutage()
函数即可,它会自动判断各节点的状态,选择其中一个作为Primary节点,然后拉起各节点上的MGR服务,完成MGR集群重启。
MySQL 192.168.1.10:3306 ssl JS > dba.rebootClusterFromCompleteOutage()
七、MGR 集群监控
MGR和传统主从复制类似,在运行过程中主要关注各节点的运行状态,以及Secondary节点的事务是否有延迟。本文介绍如何监控MGR节点状态、事务状态等。
1、节点状态监控
通过查询 performance_schema.replication_group_members
表即可知道MGR各节点的状态:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 173abbd5-b1ce-11ee-bc64-000c29479e2c | 192.168.1.10 | 3306 | ONLINE | PRIMARY | 8.0.35 | MySQL |
| group_replication_applier | 53c96c39-b1ce-11ee-adfe-000c295f2d46 | 192.168.1.30 | 3306 | ONLINE | SECONDARY | 8.0.35 | MySQL |
| group_replication_applier | 69a832ef-b1ce-11ee-b3f3-000c295e28a1 | 192.168.1.20 | 3306 | ONLINE | SECONDARY | 8.0.35 | MySQL |
+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+
输出结果中主要几个列的解读如下:
- MEMBER_ID 列值就是各节点的 server_uuid,用于唯一标识每个节点,在命令行模式下,调用 udf 时传入 MEMBER_ID 以指定各节点。
- MEMBER_ROLE 表示各节点的角色,如果是 PRIMARY 则表示该节点可接受读写事务,如果是 SECONDARY 则表示该节点只能接受只读事务。如果只有一个节点是 PRIMARY,其余都是 SECONDARY,则表示当前处于 单主模式;如果所有节点都是 PRIMARY,则表示当前处于 多主模式。
- MEMBER_STATE 表示各节点的状态,共有几种状态:ONLINE、RECOVERING、OFFLINE、ERROR、UNREACHABLE 等,下面分别介绍几种状态。
- ONLINE,表示节点处于正常状态,可提供服务。
- RECOVERING,表示节点正在进行分布式恢复,等待加入集群,这时候有可能正在从donor节点利用clone复制数据,或者传输binlog中。
- OFFLINE,表示该节点当前处于离线状态。提醒,在正要加入或重加入集群时,可能也会有很短瞬间的状态显示为 OFFLINE。
- ERROR,表示该节点当前处于错误状态,无法成为集群的一员。当节点正在进行分布式恢复或应用事务时,也是有可能处于这个状态的。当节点处于ERROR状态时,是无法参与集群事务裁决的。节点正在加入或重加入集群时,在完成兼容性检查成为正式MGR节点前,可能也会显示为ERROR状态。
- UNREACHABLE,当组通信消息收发超时时,故障检测机制会将本节点标记为怀疑状态,怀疑其可能无法和其他节点连接,例如当某个节点意外断开连接时。当在某个节点上看到其他节点处于 UNREACHABLE 状态时,有可能意味着此时部分节点发生了网络分区,也就是多个节点分裂成两个或多个子集,子集内的节点可以互通,但子集间无法互通。
当节点的状态不是 ONLINE 时,就应当立即发出告警并检查发生了什么。
在节点状态发生变化时,或者有节点加入、退出时,表 performance_schema.replication_group_members
的数据都会更新,各节点间会交换和共享这些状态信息,因此可以在任意节点查看。
2、MGR事务状态监控
另一个需要重点关注的是Secondary节点的事务状态,更确切的说是关注待认证事务及待应用事务队列大小。
可以执行下面的命令查看当前除了 PRIMARY 节点外,其他节点的 trx_tobe_certified
或 relaylog_tobe_applied
值是否较大:
mysql> SELECT MEMBER_ID AS id, COUNT_TRANSACTIONS_IN_QUEUE AS trx_tobe_certified, COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE AS relaylog_tobe_applied, COUNT_TRANSACTIONS_CHECKED AS trx_chkd, COUNT_TRANSACTIONS_REMOTE_APPLIED AS trx_done, COUNT_TRANSACTIONS_LOCAL_PROPOSED AS proposed FROM performance_schema.replication_group_member_stats;
+--------------------------------------+--------------------+-----------------------+----------+----------+----------+
| id | trx_tobe_certified | relaylog_tobe_applied | trx_chkd | trx_done | proposed |
+--------------------------------------+--------------------+-----------------------+----------+----------+----------+
| 173abbd5-b1ce-11ee-bc64-000c29479e2c | 0 | 0 | 76 | 5 | 76 |
| 53c96c39-b1ce-11ee-adfe-000c295f2d46 | 0 | 0 | 6 | 6 | 0 |
| 69a832ef-b1ce-11ee-b3f3-000c295e28a1 | 0 | 0 | 15 | 17 | 0 |
+--------------------------------------+--------------------+-----------------------+----------+----------+----------+
其中,relaylog_tobe_applied
的值表示远程事务写到relay log后,等待回放的事务队列,trx_tobe_certified
表示等待被认证的事务队列大小,这二者任何一个值大于0,都表示当前有一定程度的延迟。
还可以通过关注上述两个数值的变化,看看两个队列是在逐步加大还是缩小,据此判断Primary节点是否"跑得太快"了,或者Secondary节点是否"跑得太慢"。
多提一下,在启用流控(flow control)时,上述两个值超过相应的阈值时(group_replication_flow_control_applier_threshold 和 group_replication_flow_control_certifier_threshold 默认阈值都是 25000),就会触发流控机制。
3、其他监控
另外,也可以查看接收到的事务和已执行完的事务之间的差距来判断:
mysql> SELECT RECEIVED_TRANSACTION_SET FROM performance_schema.replication_connection_status WHERE channel_name = 'group_replication_applier' UNION ALL SELECT variable_value FROM performance_schema.global_variables WHERE variable_name = 'gtid_executed'\G
*************************** 1. row ***************************
RECEIVED_TRANSACTION_SET: 173abbd5-b1ce-11ee-bc64-000c29479e2c:1-4,
5888afa1-b1d2-11ee-b20a-000c29479e2c:1-76,
5888b41f-b1d2-11ee-b20a-000c29479e2c:1-5
*************************** 2. row ***************************
RECEIVED_TRANSACTION_SET: 173abbd5-b1ce-11ee-bc64-000c29479e2c:1-4,
5888afa1-b1d2-11ee-b20a-000c29479e2c:1-76,
5888b41f-b1d2-11ee-b20a-000c29479e2c:1-5
八、MySQL Router + MGR集群
MySQL Router是一个轻量级的中间件,它采用多端口的方案实现读写分离以及读负载均衡,而且同时支持mysql和mysql x协议。
建议把MySQL Router部署在应用服务器上,每个应用服务器都部署一套,这样应用程序可以直接连接本机IP,连接的效率更高,而且后端数据库发生变化时,程序端也无需修改IP配置。
1、部署MySQL Router
# yum -y install mysql-router
初始化
# mysqlrouter --bootstrap clusteruser@192.168.1.10:3306 --user=mysqlrouter
Please enter MySQL password for clusteruser: # 输入clusteruser密码
# Bootstrapping system MySQL Router 8.0.35 (MySQL Community - GPL) instance...
- Creating account(s) (only those that are needed, if any)
- Verifying account (using it to run SQL queries that would be run by Router)
- Storing account in keyring
- Adjusting permissions of generated files
- Creating configuration /etc/mysqlrouter/mysqlrouter.conf
Existing configuration backed up to '/etc/mysqlrouter/mysqlrouter.conf.bak'
# MySQL Router configured for the InnoDB Cluster 'MGR1'
After this MySQL Router has been started with the generated configuration
$ /etc/init.d/mysqlrouter restart
or
$ systemctl start mysqlrouter
or
$ mysqlrouter -c /etc/mysqlrouter/mysqlrouter.conf
InnoDB Cluster 'MGR1' can be reached by connecting to:
# MySQL协议的两个端口
## MySQL Classic protocol
- Read/Write Connections: localhost:6446
- Read/Only Connections: localhost:6447
# MySQL X协议的两个端口
## MySQL X protocol
- Read/Write Connections: localhost:6448
- Read/Only Connections: localhost:6449
参数 | 说明 |
---|---|
--bootstrap | 以 bootstrap 模式启动。在 bootstrap 模式下,mysqlrouter 会创建一个新的集群并成为其初始节点 |
clusteruser@192.168.1.10:3306 | MGR服务专用账号 |
--user=mysqlrouter | 运行mysqlrouter进程的系统用户名 |
--name | 指定新集群名称 |
--directory | 指定一个目录,该目录包含有关新集群的配置信息 |
2、启动mysqlrouter服务
这就初始化完毕了,按照上面的提示,直接启动 mysqlrouter 服务即可:
# systemctl start mysqlrouter
# ps -ef | grep -v grep | grep mysqlrouter
mysqlro+ 61432 1 1 14:49 ? 00:00:00 /usr/bin/mysqlrouter
# netstat -lntp | grep mysqlrouter
tcp 0 0 0.0.0.0:6446 0.0.0.0:* LISTEN 61432/mysqlrouter
tcp 0 0 0.0.0.0:6447 0.0.0.0:* LISTEN 61432/mysqlrouter
tcp 0 0 0.0.0.0:6448 0.0.0.0:* LISTEN 61432/mysqlrouter
tcp 0 0 0.0.0.0:6449 0.0.0.0:* LISTEN 61432/mysqlrouter
tcp 0 0 0.0.0.0:8443 0.0.0.0:* LISTEN 61432/mysqlrouter
可以看到 mysqlrouter 服务正常启动了。
mysqlrouter 初始化时自动生成的配置文件是 /etc/mysqlrouter/mysqlrouter.conf
,主要是关于R/W、RO不同端口的配置,例如:
[routing:bootstrap_rw]
bind_address=0.0.0.0
bind_port=6446
destinations=metadata-cache://MGR1/?role=PRIMARY
routing_strategy=first-available
protocol=classic
可以根据需要自行修改绑定的IP地址和端口,也可以在初始化时指定 --conf-base-port
选项自定义初始端口号。
3、确认读写分离效果
现在,用客户端连接到6446(读写)端口,确认连接的是PRIMARY节点
# mysql -h192.168.1.10 -u clusteruser -p -P6446
Enter password:
mysql> select MEMBER_ID,MEMBER_ROLE from performance_schema.replication_group_members;
+--------------------------------------+-------------+
| MEMBER_ID | MEMBER_ROLE |
+--------------------------------------+-------------+
| 173abbd5-b1ce-11ee-bc64-000c29479e2c | PRIMARY |
| 53c96c39-b1ce-11ee-adfe-000c295f2d46 | SECONDARY |
| 69a832ef-b1ce-11ee-b3f3-000c295e28a1 | SECONDARY |
+--------------------------------------+-------------+
mysql> select @@server_uuid;
+--------------------------------------+
| @@server_uuid |
+--------------------------------------+
| 173abbd5-b1ce-11ee-bc64-000c29479e2c | # PRIMARY节点
+--------------------------------------+
同样地,连接6447(只读)端口,确认连接的是SECONDARY节点:
# mysql -h192.168.1.10 -u clusteruser -p -P6447
Enter password:
mysql> select @@server_uuid;
+--------------------------------------+
| @@server_uuid |
+--------------------------------------+
| 69a832ef-b1ce-11ee-b3f3-000c295e28a1 | # SECONDARY节点
+--------------------------------------+
4、确认只读负载均衡效果
MySQL Router连接读写节点(Primary节点)默认的策略是 first-available,即只连接第一个可用的节点。Router连接只读节点(Secondary节点)默认的策略是 round-robin-with-fallback,会在各个只读节点间轮询。
保持6447端口原有的连接不退出,继续新建到6447端口的连接,查看 server_uuid,这时应该会发现读取到的是其他只读节点的值,因为 mysqlrouter 的读负载均衡机制是在几个只读节点间自动轮询。在默认的 round-robin-with-fallback 策略下,只有当所有只读节点都不可用时,只读请求才会打到PRIMARY节点上。
5、确认故障自动转移功能
接下来模拟PRIMARY节点宕机或切换时,mysqlrouter 也能实现自动故障转移。
登入MGR集群任意节点,切换PRIMARY节点
# mysqlsh --uri clusteruser@192.168.1.10:3306
MySQL 192.168.1.10:3306 ssl JS > var c=dba.getCluster();
MySQL 192.168.1.10:3306 ssl JS > c.setPrimaryInstance('192.168.1.20:3306');
Setting instance '192.168.1.20:3306' as the primary instance of cluster 'MGR1'...
Instance '192.168.1.30:3306' remains SECONDARY.
Instance '192.168.1.10:3306' was switched from PRIMARY to SECONDARY.
Instance '192.168.1.20:3306' was switched from SECONDARY to PRIMARY.
The instance '192.168.1.20:3306' was successfully elected as primary.
回到前面连接6446端口的那个会话,再次查询 server_uuid,此时会发现连接了新节点
# mysql -h192.168.1.10 -u clusteruser -p -P6446
mysql> select @@server_uuid;
+--------------------------------------+
| @@server_uuid |
+--------------------------------------+
| 69a832ef-b1ce-11ee-b3f3-000c295e28a1 |
+--------------------------------------+
这就实现了自动故障转移。
九、MGR 5.7滚动升级至8.0
MGR 5.7集群滚动升级至8.0可以分为以下几步:
- 在现有MGR 5.7集群中,新增MySQL 8.0的Secondary节点。
- 一比一下线一个MySQL 5.7的Secondary节点。
- 如此往复,直到剩下最后一个MySQL 5.7的Primary节点。
- 再次上线一个MySQL 8.0的Secondary节点。
- 停止最后一个MySQL 5.7的Primary节点,这时会切换主节点,并且选择其中一个MySQL 8.0节点作为新的Primary节点,这就完成升级了。
现在,利用MySQL Shell删除一个旧节点:
MySQL 192.168.1.10:3306 ssl JS > c.removeInstance('127.0.0.1:4308');
The instance will be removed from the InnoDB cluster. Depending on the instance
being the Seed or not, the Metadata session might become invalid. If so, please
start a new session to the Metadata Storage R/W instance.
Instance '127.0.0.1:4308' is attempting to leave the cluster...
WARNING: On instance '127.0.0.1:4308' configuration cannot be persisted since MySQL version 5.7.36 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please set the 'group_replication_start_on_boot' variable to 'OFF' in the server configuration file, otherwise it might rejoin the cluster upon restart.
WARNING: Instance '127.0.0.1:4306' cannot persist configuration since MySQL version 5.7.36 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the dba.configureLocalInstance() command locally to persist the changes.
The instance '127.0.0.1:4308' was successfully removed from the cluster.
如此往复,直到只剩最后一个5.7节点,关闭最后一个5.7节点,会自动切换主节点:
MySQL 127.0.0.1:4306 ssl JS > c.removeInstance('127.0.0.1:4306');
The instance will be removed from the InnoDB cluster. Depending on the instance
being the Seed or not, the Metadata session might become invalid. If so, please
start a new session to the Metadata Storage R/W instance.
Instance '127.0.0.1:4306' is attempting to leave the cluster...
WARNING: On instance '127.0.0.1:4306' configuration cannot be persisted since MySQL version 5.7.36 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please set the 'group_replication_start_on_boot' variable to 'OFF' in the server configuration file, otherwise it might rejoin the cluster upon restart.
The instance '127.0.0.1:4306' was successfully removed from the cluster.
之后可以看到5.7节点全部下线了,只剩下8.0节点:
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | c8ec34c4-78fc-11ec-864a-888888888333 | 127.0.0.1 | 3309 | ONLINE | PRIMARY | 8.0.25 |
| group_replication_applier | c8ec34c4-78fc-11ec-864a-888888888444 | 127.0.0.1 | 3310 | ONLINE | SECONDARY | 8.0.25 |
| group_replication_applier | c8ec34c4-78fc-11ec-864a-888888888555 | 127.0.0.1 | 3311 | ONLINE | SECONDARY | 8.0.25 |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
完成滚动升级
评论区