MySQL Group Replication学习记录

MySQL 常见的高可用方案有 MHA，Galera，Orchestrator，以及官方的 Group Replication（MGR），本文是 MGR 多主集群的搭建及维护学习记录。

Contents

1 MGR特点
2 搭建
3 运维问题
4 参考资料

MGR特点

支持单主和多主模式
基于 Paxos 算法，实现数据复制的一致性
插件化设计，支持插件检测，新增节点小于集群当前节点主版本号，拒绝加入集群，大于则加入，但无法作为主节点
没有第三方组件依赖
支持全链路 SSL 通讯
支持 IP 白名单
不依赖网络多播

搭建

可以使用 vagrant 启动 3 台虚拟机。分配 IP 如下：

itop-mgr-1	192.168.10.101
itop-mgr-2	192.168.10.102
itop-mgr-3	192.168.10.103

mgr 节点IP

安装

MySQL 版本选择最新的 8.x，用以下命令安装：

yum install -y https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm
yum install -y mysql-community-server mysql-shell mysql-router

手动配置

可以手动配置一个 mgr 集群，也可以用 mysql-shell 来配置。本文将记录手动配置过程，然后用 mysql-shell 来接管。

my.cnf

my.cnf 配置示例。

[mysqldump]
max_allowed_packet = 64M
[mysqld]
server-id=$ID
innodb_buffer_pool_size = 768M
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
# Binary logging and Replication
log_bin = mysql-bin
binlog_format = ROW
binlog_checksum = NONE # or CRC32
master_verify_checksum = OFF # ON if binlog_checksum = CRC32
slave_sql_verify_checksum = OFF # ON if binlog_checksum = CRC32
binlog_cache_size = 1M
binlog_stmt_cache_size = 3M
max_binlog_size = 512M
sync_binlog = 1
expire_logs_days = 7
log_slave_updates = 1
relay_log = mysql-relay-bin
relay_log_purge = 1
# Group Replication parameter
gtid_mode = ON
enforce_gtid_consistency = ON
master_info_repository = TABLE
relay_log_info_repository = TABLE
 
slave_parallel_workers = 10
slave_preserve_commit_order = ON
slave_parallel_type = LOGICAL_CLOCK
#以便在server收集写集合的同时将其记录到二进制日志。写集合基于每行的主键，并且是行更改后的唯一标识此标识将用于检测冲突。
transaction_write_set_extraction = XXHASH64
#组的名字可以随便起,但不能用主机的GTID! 所有节点的这个组名必须保持一致！
loose-group_replication_group_name = "abcdaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" 
##为了避免每次启动自动引导具有相同名称的第二个组,所以设置为OFF。
loose-group_replication_start_on_boot = OFF
loose-group_replication_local_address = "$MYIP:33061"
loose-group_replication_group_seeds = "$NODE1:33061,$NODE2:33061,$NODE3:33061"
#为了避免每次启动自动引导具有相同名称的第二个组,所以设置为OFF。  
loose-group_replication_bootstrap_group = OFF
##关闭单主模式的参数（本例测试时多主模式，所以关闭该项,开启多主模式的参数
loose-group_replication_single_primary_mode = OFF # = multi-primary
loose-group_replication_enforce_update_everywhere_checks=ON # = multi-primary
report_host=$MYIP
report_port=3306
# 允许加入组复制的客户机来源的ip白名单
loose-group_replication_ip_whitelist="192.168.10.0/24,127.0.0.1/8"

账号设置

为了方便在 vagrant 中用脚本操作，先重新用 insecure 方式初始化 MySQL。

	rm -fr /var/lib/mysql
	mysqld --initialize-insecure --user=mysql
	systemctl start mysqld
	# (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)，重启生成 sock 文件
	systemctl restart mysqld
	systemctl status mysqld
	mysql -uroot -e "show databases;"

然后修改 root 账号的密码。

	cat > /tmp/init.sql <<EOF
ALTER USER 'root'@'localhost' IDENTIFIED BY '$MYSQL_ROOT' PASSWORD EXPIRE NEVER;
ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '$MYSQL_ROOT';
create user root@'%' identified WITH mysql_native_password BY '$MYSQL_ROOT';
grant all privileges on *.* to root@'%' with grant option;
flush privileges;
EOF
	mysql -uroot < /tmp/init.sql

设置复制账号。

cat > /tmp/rep.sql <<EOF
SET SQL_LOG_BIN=0;
CREATE USER IF NOT EXISTS repl@'%' IDENTIFIED WITH 'mysql_native_password' BY 'repl';
GRANT SUPER,REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO repl@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;
CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='repl' FOR CHANNEL 'group_replication_recovery';
EOF

if [ ! -f $LOCK ];then
	mysql -uroot -p$MYSQL_ROOT < /tmp/rep.sql
	touch $LOCK
fi

安装MGR插件

# install mgr
cat > /tmp/mgr.sql <<EOF
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
SHOW PLUGINS;
EOF

mysql -uroot -p$MYSQL_ROOT -e "SHOW PLUGINS" |grep -q "group_replication" || mysql -uroot -p$MYSQL_ROOT < /tmp/mgr.sql

启动MGR

在第一个节点上启动 MGR，另外两个节点加入 MGR。

# start mgr
cat > /tmp/start.sql <<EOF
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;
SELECT * FROM performance_schema.replication_group_members;
EOF

# only run on first node.
[ "$ID"x == "10101"x ] && mysql -uroot -p$MYSQL_ROOT < /tmp/start.sql

# join
cat > /tmp/join.sql <<EOF
SELECT * FROM performance_schema.replication_group_members;
show global variables like '%seed%';
START GROUP_REPLICATION;
SELECT * FROM performance_schema.replication_group_members;
SHOW STATUS LIKE 'group_replication_primary_member';
show global variables like 'group_replication_single%';
EOF

# run on other two node
[ "$ID"x == "10101"x ] || mysql -uroot -p$MYSQL_ROOT < /tmp/join.sql

group_replication_bootstrap_group 选项，为了避免每次启动自动引导具有相同名称的第二个组，所以需要设置为 OFF。因此执行 START GROUP_REPLICATION 之后需要关闭。

使用mysql-shell接管MGR

如果在已经配置好的组复制上创建 InnoDB Cluster，并且希望使用它来创建集群，可将 adoptFromGR 选项传递给 dba.createCluster() 函数。创建的InnoDB Cluster 会匹配复制组是以单主数据库还是多主数据库运行。

# 使用 mysqlsh 管理集群
# MGR集群接管：如果在已经配置好的组复制上创建InnoDB Cluster，并且希望使用它来创建集群，可将adoptFromGR选项传递给dba.createCluster()函数。创建的InnoDB Cluster会匹配复制组是以单主数据库还是多主数据库运行。要采用现有的组复制组，使用MySQL Shell连接到组成员。
# 在最后一个节点上操作，即等所有节点启动后在操作
JSDIR="/vagrant/js"
MYSQLSHLOG=/tmp/mysql.log

if [ "$ID"x == "10103"x ];then
	echo "Run Get Status"
	mysqlsh --js --file=$JSDIR/status.js > $MYSQLSHLOG 2>&1
	grep "RuntimeError" $MYSQLSHLOG && r=0 || r=1
	if [ $r -eq 0 ];then
		grep "but GR is not active" $MYSQLSHLOG && r=0 || r=1
		if [ $r -eq 0 ];then
			echo "Cluster Need Reboot"
			mysqlsh --js --file=$JSDIR/reboot.js
		else
			echo "Init Cluster"
			mysqlsh --js --file=$JSDIR/init.js
		fi
	else
		cat $MYSQLSHLOG
	fi
fi

status.js

shell.connect('root@192.168.10.101', 'root');
var cluster = dba.getCluster('itopMgr');
cluster.status();

init.js

shell.connect('root@192.168.10.101', 'root');
var cluster = dba.createCluster('itopMgr', {adoptFromGR: true});
cluster.status();

reboot.js

shell.connect('root@192.168.10.101', 'root');
dba.rebootClusterFromCompleteOutage('itopMgr')
var cluster = dba.getCluster('itopMgr');
cluster.status();

如果从一开始就使用 mysql-shell 创建并管理集群，可以参考以下命令。

##链接主节点
MySQL  JS >\connect root@192.168.10.101:3306

##创建集群
MySQL  192.168.10.101:3306 ssl  JS  > dba.createCluster('mgrCluster')
MySQL  192.168.10.101:3306 ssl  JS > var cluster = dba.getCluster()

##添加节点
MySQL  192.168.10.101:3306 ssl  JS > cluster.addInstance('root@192.168.10.102:3306')
MySQL  192.168.10.101:3306 ssl  JS > cluster.addInstance('root@192.168.10.103:3306')

##查看集群状态
MySQL  192.168.10.101:3306 ssl  JS > cluster.status()

##解散InnoDB Cluster
MySQL  192.168.10.101:3306 ssl  JS > cluster.dissolve()

##配置新主选举权重
MySQL  192.168.10.101:3306 ssl  JS >  var mycluster = dba.getCluster()
MySQL  192.168.10.101:3306 ssl  JS > mycluster.addInstance('root@192.168.10.102:3306', {memberWeight:25})
MySQL  192.168.10.101:3306 ssl  JS > mycluster.addInstance('root@192.168.10.103:3306', {memberWeight:50})

##指定一个新的主节点
MySQL  192.168.10.101:3306 ssl  JS > cluster.setPrimaryInstance('192.168.10.102:3306')

##Cluster.switchToMultiPrimaryMode()切换到多主模式：
MySQL  192.168.10.101:3306 ssl  JS > cluster.switchToMultiPrimaryMode()


##Cluster.switchToSinglePrimaryMode()切换到单主模式
MySQL  192.168.10.101:3306 ssl  JS > cluster.switchToSinglePrimaryMode('172.16.1.125:3306')

运维问题

节点状态

ONLINE - 节点状态正常。
OFFLINE - 实例在运行，但没有加入任何Cluster。
RECOVERING - 实例已加入Cluster，正在同步数据。
ERROR - 同步数据发生异常。
UNREACHABLE - 与其他节点通讯中断，可能是网络问题，可能是节点crash。
MISSING 节点已加入集群，但未启动group replication

集群状态

OK – 所有节点处于online状态，有冗余节点。
OK_PARTIAL – 有节点不可用，但仍有冗余节点。
OK_NO_TOLERANCE – 有足够的online节点，但没有冗余，例如：两个节点的Cluster，其中一个挂了，集群就不可用了。
NO_QUORUM – 有节点处于online状态，但达不到法定节点数，此状态下Cluster无法写入，只能读取。
UNKNOWN – 不是online或recovering状态，尝试连接其他实例查看状态。
UNAVAILABLE – 组内节点全是offline状态，但实例在运行，可能实例刚重启还没加入Cluster。

重新加入集群

如果节点故障后重启，状态为 MISSING 时，需要手动重新加入集群。

var cluster = dba.getCluster("itopMgr")
cluster.rejoinInstance('root@192.168.10.102:3306')

重启集群

当集群所有节点掉线，比如所有机器都宕机后启动，可以用 mysql-shell 重启集群。

shell.connect('root@192.168.10.101', 'root');
dba.rebootClusterFromCompleteOutage('itopMgr')
var cluster = dba.getCluster('itopMgr');

当使用 vagrant reload 虚拟机的时候，1 号虚机最先关机，这时候 2，3 号如果还在写入，会比 1 号数据新，当 reload 完成之后，如果还默认以1 号为准，用 SQL 语句启动集群，会出现以下问题：

Rejoining instance to the cluster ...


ERROR: A GTID set check of the MySQL instance at '192.168.10.102:3306' determined that it contains transactions that do not originate from the cluster, which must be discarded before it can join the cluster.

192.168.10.102:3306 has the following errant GTIDs that do not exist in the cluster:
abcdaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1002598-1002690

Having extra GTID events is not expected, and it is recommended to investigate this further and ensure that the data can be removed prior to rejoining the instance to the cluster.

Discarding these extra GTID events can either be done manually or by completely overwriting the state of 192.168.10.102:3306 with a physical snapshot from an existing cluster member. To achieve this remove the instance from the cluster and add it back using <Cluster>.addInstance() and setting the 'recoveryMethod' option to 'clone'.

Cluster.rejoinInstance: The instance '192.168.10.102:3306' contains errant transactions that did not originate from the cluster. (RuntimeError)

因此这种情况用 mysql-shell 来重启比较好。

当已经出现这种情况的时候，用下面的命令查看所有 member。

[root@itop-mgr-1 ~]# ./run.sh "select * from performance_schema.replication_group_members;"
RUN on 192.168.10.101
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier |           |             |        NULL | OFFLINE      |             |                |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+
RUN on 192.168.10.102
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 4dc877f0-becf-11ea-bf0b-5254004d77d3 | 192.168.10.102 |        3306 | OFFLINE      |             |                |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
RUN on 192.168.10.103
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier |           |             |        NULL | OFFLINE      |             |                |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+

然后以 MEMBER_ID 和 MEMBER_HOST 不为空的节点为基础，启动 group_replication。

SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

然后启动 3 号节点

START GROUP_REPLICATION;

1 号节点无法直接加入，需要先移除在加入，并且使用 clone 方式恢复数据。

MySQL  192.168.10.102:33060+ ssl  JS > c.addInstance('root@192.168.10.101:3306', {recoveryMethod:'clone'})

WARNING: A GTID set check of the MySQL instance at '192.168.10.101:3306' determined that it contains transactions that do not originate from the cluster, which must be discarded before it can join the cluster.

192.168.10.101:3306 has the following errant GTIDs that do not exist in the cluster:
abcdaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:4218-8557

WARNING: Discarding these extra GTID events can either be done manually or by completely overwriting the state of 192.168.10.101:3306 with a physical snapshot from an existing cluster member. To use this method by default, set the 'recoveryMethod' option to 'clone'.

Having extra GTID events is not expected, and it is recommended to investigate this further and ensure that the data can be removed prior to choosing the clone recovery method.
Clone based recovery selected through the recoveryMethod option

NOTE: Group Replication will communicate with other members using '192.168.10.101:33061'. Use the localAddress option to override.

Validating instance configuration at 192.168.10.101:3306...

This instance reports its own address as 192.168.10.101:3306

Instance configuration is suitable.
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster...

ERROR: Unable to enable clone on the instance '192.168.10.103:3306': Recovery user 'repl' not created by InnoDB Cluster

Monitoring recovery process of the new cluster member. Press ^C to stop monitoring and let it continue in background.
Clone based state recovery is now in progress.

NOTE: A server restart is expected to happen as part of the clone process. If the
server does not support the RESTART command or does not come back after a
while, you may need to manually start it back.

* Waiting for clone to finish...
NOTE: 192.168.10.101:3306 is being cloned from 192.168.10.103:3306
** Stage DROP DATA: Completed
** Clone Transfer  
    FILE COPY  ############################################################  100%  Completed
    PAGE COPY  ############################################################  100%  Completed
    REDO COPY  ############################################################  100%  Completed
** Stage RECOVERY: \
NOTE: 192.168.10.101:3306 is shutting down...

* Waiting for server restart... ready
* 192.168.10.101:3306 has restarted, waiting for clone to finish...
* Clone process has finished: 109.54 MB transferred in 4 sec (27.38 MB/s)

State recovery already finished for '192.168.10.101:3306'

The instance '192.168.10.101:3306' was successfully added to the cluster.

执行完成之后，三个节点上 SELECT @@GTID_EXECUTED 应返回相同的结果：

[root@itop-mgr-1 ~]# ./run.sh "SELECT @@GTID_EXECUTED;"
RUN on 192.168.10.101
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------------------------------------------------------+
| @@GTID_EXECUTED                                                     |
+---------------------------------------------------------------------+
| abcdaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-4686:1001062-1002871:2001062 |
+---------------------------------------------------------------------+
RUN on 192.168.10.102
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------------------------------------------------------+
| @@GTID_EXECUTED                                                     |
+---------------------------------------------------------------------+
| abcdaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-4686:1001062-1002871:2001062 |
+---------------------------------------------------------------------+
RUN on 192.168.10.103
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------------------------------------------------------+
| @@GTID_EXECUTED                                                     |
+---------------------------------------------------------------------+
| abcdaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-4686:1001062-1002871:2001062 |
+---------------------------------------------------------------------+

mysql-shell非交互式

使用 'interactive': false 选项。

dba.configureLocalInstance('root@$MYIP:3306', {'password': 'root', 'interactive': false})

执行 dba.rebootClusterFromCompleteOutage 时会要求用户确认是否加入集群 (y/N) ，脚本中需要强制 yes，可以用以下方法，注意列表里不要写当前操作的节点 IP。

dba.rebootClusterFromCompleteOutage('itopMgr',{rejoinInstances:["192.168.10.102:3306","192.168.10.103:3306"]})

执行 addInstance 时指定 recoveryMethod 选项，避免交互。

cluster.addInstance('root@192.168.10.203:3306', {'localAddress': '192.168.10.203', 'password': 'root','recoveryMethod':'clone'})

常用命令

集群验证 SQL 语句

select * from performance_schema.replication_group_members; #All members should be online.
select instance_name, mysql_server_uuid, addresses from  mysql_innodb_cluster_metadata.instances;  # All instances should return same value for mysql_server_uuid
SELECT @@GTID_EXECUTED;  #All nodes should return same value

常用语句

mysql> SET SQL_LOG_BIN = 0; 
mysql> stop group_replication;
mysql> set global super_read_only=0;
mysql> drop database mysql_innodb_cluster_metadata;
mysql> RESET MASTER;
mysql> RESET SLAVE ALL;
JS > var cluster = dba.getCluster()
JS > var cluster = dba.getCluster("<Cluster_name>")
JS > var cluster = dba.createCluster('name')
JS > cluster.removeInstance('root@<IP_Address>:<Port_No>',{force: true})
JS > cluster.addInstance('root@<IP add>,:<port>')
JS > cluster.addInstance('root@<IP add>,:<port>')
JS > dba.getCluster()
JS > dba.getCluster().status() 
JS > dba.getCluster().checkInstanceState('root@<IP_Address>:<Port_No>')
JS > dba.getCluster().rejoinInstance('root@<IP_Address>:<Port_No>')
JS > dba.getCluster().describe()

查看集群状态

mysql-js> dba.getCluster().status()

Note: If you get following message, then it's time to reboot cluster from complete outage

Dba.getCluster: This function is not available through a session to a standalone instance (metadata exists, but GR is not active) (RuntimeError)

reboot from complete outage 即以下命令。

dba.rebootClusterFromCompleteOutage('<cluster_name>')

从仲裁丢失中恢复集群

集群处于 NO_QUORUM 状态。Re-establish quorum using the method cluster.forceQuorumUsingPartitionOf()

JS > cluster.forceQuorumUsingPartitionOf("root@<IP address>:<port>")

调试方法

启动 mysql-shell 时指定 log level。

mysqlsh --log-level=DEBUG3

unmanaged replication group

MySQL JS > dba.getCluster()

Dba.getCluster: This function is not available through a session to an instance belonging to an unmanaged replication group (RuntimeError)

InnoDB cluster status is different for mysqlsh and performance_schema.replicaion_group_members. In performance_schema it is ONLINE but mysqlsh depicts MISSING.（ Cluster.status() is out of sync with the Group Replication view replication_group_members）。

这种情况可能是集群尚未被 mysql-shell 接管，可以执行上文提到的接管集群的命令。或者

Restore auto.cnf form backup, restart mysqld. If require remove instance and add it back again.

About auto.cnf and server_uuid:auto.cnf file is similar to my.cnf. It contains server_uuid. Server_uuid is generated automatically. When starting MySQL it read and use server_uuid from auto.cnf. The value of the server_uuid used in reapplication. server_uuid is true UUID in an addition to user supplied server_id system variable.
About auto.cnf and server_uuid

参考资料

1. https://jeremyxu2010.github.io/2019/05/mysql-innodb-cluster实战/
2. https://kubedb.com/docs/v0.13.0-rc.0/guides/mysql/clustering/overview/
3. http://blog.itpub.net/26736162/viewspace-2675139/
4. https://juejin.im/post/5df4d1dc6fb9a0164577d0a8
5. http://shrenikp.blogspot.com/2018/10/mysql-innodb-cluster-troubleshooting.html
6. https://www.cnblogs.com/xinysu/p/6674832.html
7. https://dev.mysql.com/doc/refman/5.7/en/mysql-innodb-cluster-working-with-cluster.html
8. https://dba.stackexchange.com/questions/224117/mysqlsh-rebootclusterfromcompleteoutage-force-prompt
9. https://dev.mysql.com/doc/dev/mysqlsh-api-javascript/8.0/classmysqlsh_1_1dba_1_1_dba.html

知行近思

知行合一，切问近思

发表回复取消回复