Redis Cluster Configuration Notes

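Installing a single Redis instance is not covered here; if needed, see the earlier article on this site, "Redis及PHP Redis扩展安装笔记" (Redis and PHP Redis Extension Installation Notes). These notes assume Redis is already installed, configured on the default port, and working normally.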

Part 1: Building the Cluster

I. Preliminaries
0. Environment

OS:    CentOS release 6.9
Redis: redis_version:4.0.2
Server IP: 10.235.25.241

1. Starting Redis on the default port

/usr/local/bin/redis-server /etc/redis.conf
/usr/local/bin/redis-cli -h 10.235.25.241 -p 6379

10.235.25.241:6379> set salmonl niliu
OK
10.235.25.241:6379> get salmonl
"niliu"

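II. Building the Cluster

0. Configuring the nodes
A working cluster needs at least 3 master nodes. This walkthrough follows the official recommendation of 6 nodes (3 masters, 3 replicas), listening on ports 7000, 7001, 7002, 7003, 7004, and 7005.
0.1 Create the cluster configuration directory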

mkdir /etc/redis_cluster

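0.2 Create the configuration for the node on port 7000

cp /etc/redis.conf /etc/redis_cluster/redis_7000.conf

# Edit the configuration
vim /etc/redis_cluster/redis_7000.conf

# Port number
port 7000
# Run in the background
daemonize yes
# Enable cluster mode
cluster-enabled yes
# Cluster node state file (written and maintained by Redis itself)
cluster-config-file nodes_7000.conf
# Cluster node timeout, in milliseconds
cluster-node-timeout 5000
# Location of the process pid file
pidfile "/var/run/redis_7000.pid"
# Enable AOF
appendonly yes
# AOF file name
appendfilename "appendonly_7000.aof"
# RDB file name
dbfilename dump_7000.rdb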

0.3 Start the node on port 7000

/usr/local/bin/redis-server /etc/redis_cluster/redis_7000.conf

ps -ef | grep redis
root       924     1  0 Dec11 ?        00:01:14 /usr/bin/redis-server *:7000 [cluster]
root      5660     1  0 Dec09 ?        00:02:31 redis-server *:6379

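0.4 Configure and start the other ports
The previous step started successfully, so the configuration is known to be good and can simply be copied. Port 7001 is shown as an example; the other ports are handled in exactly the same way.

cp /etc/redis_cluster/redis_7000.conf /etc/redis_cluster/redis_7001.conf
# Bulk replace in vim
vim /etc/redis_cluster/redis_7001.conf
:%s/7000/7001/g

# Start the nodes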
/usr/local/bin/redis-server /etc/redis_cluster/redis_7000.conf
/usr/local/bin/redis-server /etc/redis_cluster/redis_7001.conf
/usr/local/bin/redis-server /etc/redis_cluster/redis_7002.conf
/usr/local/bin/redis-server /etc/redis_cluster/redis_7003.conf
/usr/local/bin/redis-server /etc/redis_cluster/redis_7004.conf
/usr/local/bin/redis-server /etc/redis_cluster/redis_7005.conf

ps -ef | grep redis
root       924     1  0 Dec11 ?        00:01:14 /usr/local/bin/redis-server *:7000 [cluster]
root       932     1  0 Dec11 ?        00:01:14 /usr/local/bin/redis-server *:7001 [cluster]
root       942     1  0 Dec11 ?        00:01:13 /usr/local/bin/redis-server *:7002 [cluster]
root       950     1  0 Dec11 ?        00:01:12 /usr/local/bin/redis-server *:7003 [cluster]
root       960     1  0 Dec11 ?        00:01:12 /usr/local/bin/redis-server *:7004 [cluster]
root       975     1  0 Dec11 ?        00:01:12 /usr/local/bin/redis-server *:7005 [cluster]
root      5660     1  0 Dec09 ?        00:02:31 redis-server *:6379
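
The copy-and-replace step above can also be scripted instead of repeated by hand in vim. The following is a minimal sketch, assuming the 7000 template already exists; the function name gen_node_conf and the CONF_DIR variable are illustrative and not part of Redis. Like the vim command, it rewrites every occurrence of 7000, so it assumes that string appears only in the port-specific settings.

```shell
#!/bin/sh
# CONF_DIR is an assumed override; it defaults to the directory used in these notes.
CONF_DIR="${CONF_DIR:-/etc/redis_cluster}"

# gen_node_conf PORT: derive redis_PORT.conf from the 7000 template,
# rewriting every "7000" to PORT (same effect as :%s/7000/PORT/g in vim).
gen_node_conf() {
    sed "s/7000/$1/g" "$CONF_DIR/redis_7000.conf" > "$CONF_DIR/redis_$1.conf"
}

# Only generate the files when the template is actually present.
if [ -f "$CONF_DIR/redis_7000.conf" ]; then
    for port in 7001 7002 7003 7004 7005; do
        gen_node_conf "$port"
    done
fi
```

The generated files can then be started one by one with redis-server, exactly as in the listing above.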

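1. Creating the cluster
Redis ships a command-line tool for creating clusters, redis-trib. It is the redis-trib.rb file in the src directory of the Redis source tree. It is a Ruby program, so Ruby must be installed to run it.
redis-trib.rb creates a cluster, checks it, and reshards it by sending special commands to the instances.

1.1 Check whether Ruby is installed
If ruby -v does not print a version, Ruby is not installed.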

ruby -v

1.2 Install Ruby
Install with yum; skip this step if Ruby is already installed.

yum install ruby

# Verify
ruby -v
ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-linux]

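1.3 Install redis.rb
redis.rb is the Ruby client for Redis. Once Ruby is installed, it can be installed with gem (the Ruby package manager, similar to apt-get or yum).
Note: if the installation fails, see Part 3, Troubleshooting.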

gem install redis

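2. Starting the cluster
2.1 The create command
Use the create command of the redis-trib tool. The option --replicas 1 creates one replica for each master in the cluster. The remaining arguments are the addresses of the instances that will form the cluster.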

redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005

2.2 Starting the cluster
Copy redis-trib.rb into the directory that holds the Redis binaries, renaming it to redis-cluster. The rest of these notes use redis-cluster directly.

cp /usr/local/redis/redis-4.0.2/src/redis-trib.rb /usr/local/bin/redis-cluster
/usr/local/bin/redis-cluster create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005

>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
127.0.0.1:7000
127.0.0.1:7001
127.0.0.1:7002
Adding replica 127.0.0.1:7003 to 127.0.0.1:7000
Adding replica 127.0.0.1:7004 to 127.0.0.1:7001
Adding replica 127.0.0.1:7005 to 127.0.0.1:7002
M: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
S: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   replicates 8f371f57ab8d76a1fab3f6140bd0561a801a5198
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   replicates 40dd2978f5377d2766e231f978f9241670727f20
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots: (0 slots) slave
   replicates 8f371f57ab8d76a1fab3f6140bd0561a801a5198
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
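
The master count and slot ranges in the output above can be reproduced with a little arithmetic: 6 nodes with --replicas 1 gives 6 / (1 + 1) = 3 masters, and the 16384 slots are split as evenly as possible among them. The sketch below is only an illustration; the function name slot_ranges is made up, and the rounding rule is an assumption chosen to match the ranges printed above, not a statement of how redis-trib.rb is implemented.

```shell
#!/bin/sh
# slot_ranges NODES REPLICAS: print one "first-last (count slots)" line per master.
# Assumption: the last slot of master i is round(i * 16384 / masters - 1).
slot_ranges() {
    total=16384
    masters=$(( $1 / ($2 + 1) ))
    first=0
    i=1
    while [ "$i" -le "$masters" ]; do
        # round(i * total / masters - 1), written with integer arithmetic only
        last=$(( (2 * i * total - masters) / (2 * masters) ))
        echo "$first-$last ($(( last - first + 1 )) slots)"
        first=$(( last + 1 ))
        i=$(( i + 1 ))
    done
}

# 6 nodes, 1 replica per master, as in the create command above.
slot_ranges 6 1
```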

3. Connecting to the cluster
The redis-cli client has basic cluster support; enable it by starting redis-cli with -c.

/usr/local/bin/redis-cli -h 10.235.25.241 -p 7000 -c
10.235.25.241:7000> set salmonl wcb
OK
10.235.25.241:7000> get salmonl
"wcb"
10.235.25.241:7000> set foo 100
-> Redirected to slot [12182] located at 127.0.0.1:7002
OK
127.0.0.1:7002> get foo
"100"

The test confirms that the cluster was created successfully.
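
The redirect of foo to slot 12182 in the session above follows from how Redis Cluster maps keys to slots: the slot is CRC16(key) mod 16384, using the XMODEM variant of CRC16. The sketch below recomputes this in plain shell. The function name key_slot is illustrative, and the sketch assumes single-byte ASCII keys and ignores hash tags ({...}).

```shell
#!/bin/sh
# key_slot KEY: print CRC16-XMODEM(KEY) mod 16384.
# Assumptions: ASCII-only key, no {hash tag} handling.
key_slot() {
    crc=0
    str=$1
    while [ -n "$str" ]; do
        rest=${str#?}                    # everything after the first character
        ch=${str%"$rest"}                # the first character itself
        byte=$(printf '%d' "'$ch")       # its numeric character code
        crc=$(( crc ^ (byte << 8) ))
        bit=0
        while [ "$bit" -lt 8 ]; do
            if [ $(( crc & 0x8000 )) -ne 0 ]; then
                crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
            else
                crc=$(( (crc << 1) & 0xFFFF ))
            fi
            bit=$(( bit + 1 ))
        done
        str=$rest
    done
    echo $(( crc % 16384 ))
}

key_slot foo
```

On a live cluster the same value can be obtained with the CLUSTER KEYSLOT command.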

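Part 2: Cluster Operations

I. Failover

0. Inspecting the nodes
The check command of redis-trib.rb shows the cluster state. Its argument is the address of any node in the cluster, which serves to identify the cluster.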

./redis-trib.rb check 127.0.0.1:7000

The check shows that the node on port 7000 is a master.

/usr/local/bin/redis-cluster check 10.235.25.241:7000
>>> Performing Cluster Check (using node 10.235.25.241:7000)
M: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 10.235.25.241:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots: (0 slots) slave
   replicates 8f371f57ab8d76a1fab3f6140bd0561a801a5198
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

1. Failure of a single master
Kill the node on port 7000.

ps -ef | grep redis
root       924     1  0 Dec11 ?        00:01:33 /usr/bin/redis-server *:7000 [cluster]
root       932     1  0 Dec11 ?        00:01:33 /usr/bin/redis-server *:7001 [cluster]
root       942     1  0 Dec11 ?        00:01:32 /usr/bin/redis-server *:7002 [cluster]
root       950     1  0 Dec11 ?        00:01:30 /usr/bin/redis-server *:7003 [cluster]
root       960     1  0 Dec11 ?        00:01:30 /usr/bin/redis-server *:7004 [cluster]
root       975     1  0 Dec11 ?        00:01:30 /usr/bin/redis-server *:7005 [cluster]
root      5660     1  0 Dec09 ?        00:02:41 redis-server *:6379

kill 924

# check node 7000: it is indeed down
/usr/local/bin/redis-cluster check 10.235.25.241:7000
[ERR] Sorry, can't connect to node 10.235.25.241:7000

# check node 7001 to view the cluster state: 7003, formerly the replica of 7000, has become a master
/usr/local/bin/redis-cluster check 10.235.25.241:7001
>>> Performing Cluster Check (using node 10.235.25.241:7001)
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 10.235.25.241:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
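
Instead of reading the PID off the ps listing, the pidfile configured earlier (pidfile "/var/run/redis_7000.pid") could be used to find the process to kill. This is only a sketch: the function name node_pid and the PID_DIR variable are made up for illustration, and it assumes each node writes redis_PORT.pid as configured in Part 1.

```shell
#!/bin/sh
# PID_DIR is an assumed override; /var/run matches the pidfile setting in redis_7000.conf.
PID_DIR="${PID_DIR:-/var/run}"

# node_pid PORT: print the PID recorded in redis_PORT.pid; fail if the file is unreadable.
node_pid() {
    pidfile="$PID_DIR/redis_$1.pid"
    [ -r "$pidfile" ] || return 1
    cat "$pidfile"
}

# Example (left commented out): stop the 7000 node to trigger a failover.
# kill "$(node_pid 7000)"
```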

2. Recovery of the failed master

# Restart node 7000
/usr/local/bin/redis-server /etc/redis_cluster/redis_7000.conf

# check: 7000 has become a slave of 7003
/usr/local/bin/redis-cluster check 10.235.25.241:7000
>>> Performing Cluster Check (using node 10.235.25.241:7000)
S: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 10.235.25.241:7000
   slots: (0 slots) slave
   replicates ad849b173e7a27892727f9cf3182f5c899d95296
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

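3. Adding a master node
3.1 Create a configuration for port 7006 in the same way as above, and start it.

3.2 Add the new node to the cluster with the add-node command of redis-cluster.
Format: redis-trib add-node ip:port1 ip:port2
Note: ip:port1 is the new node; ip:port2 can be any node already in the cluster and serves to identify the cluster.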

/usr/local/bin/redis-cluster add-node 127.0.0.1:7006 127.0.0.1:7000

>>> Adding node 127.0.0.1:7006 to cluster 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
S: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 127.0.0.1:7000
   slots: (0 slots) slave
   replicates ad849b173e7a27892727f9cf3182f5c899d95296
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:7006 to make it join the cluster.
[OK] New node added correctly.

3.3 Inspect the new node in the cluster

/usr/local/bin/redis-cluster check 127.0.0.1:7006
>>> Performing Cluster Check (using node 127.0.0.1:7006)
M: 2ca8f1414270690c3e278b27c36b4fe7d9224afe 127.0.0.1:7006
   slots: (0 slots) master
   0 additional replica(s)
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 127.0.0.1:7000
   slots: (0 slots) slave
   replicates ad849b173e7a27892727f9cf3182f5c899d95296
M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered

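The 7006 node has no slots: slots: (0 slots) master. This is expected for a newly added master; it receives slots only after resharding.

3.4 Resharding
Use the reshard command. Its argument can be any node in the cluster.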

/usr/local/bin/redis-cluster reshard 127.0.0.1:7000

After running the command above, there are 4 interactive steps:

# Step 1: How many slots do you want to move (from 1 to 16384)?
Enter the number of slots to move (4096 in the example below).

# Step 2: What is the receiving node ID?
Enter the ID of the receiving node: 2ca8f1414270690c3e278b27c36b4fe7d9224afe
Note: this must be the ID of a master node; otherwise the tool reports *** The specified node is not known or not a master, please retry.

# Step 3: Please enter all the source node IDs.

Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1:all

# Step 4: Do you want to proceed with the proposed reshard plan (yes/no)?
Type yes and press Enter.

Complete example

How many slots do you want to move (from 1 to 16384)? 4096
What is the receiving node ID? 2ca8f1414270690c3e278b27c36b4fe7d9224afe
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:all

Ready to move 4096 slots.
  Source nodes:
    M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
    M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
    M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
  Destination node:
    M: 2ca8f1414270690c3e278b27c36b4fe7d9224afe 127.0.0.1:7006
   slots: (0 slots) master
   0 additional replica(s)
  Resharding plan:
    Moving slot 5461 from 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
    Moving slot 5462 from 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
    Moving slot 5463 from 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5

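3.5 Inspecting the slots
If all goes well, the new node ends up with 4096 slots.
[In this run the reshard was interrupted partway because the redis.rb version was too new (see Part 3), so the numbers below do not match exactly.]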

/usr/local/bin/redis-cluster check 127.0.0.1:7006
>>> Performing Cluster Check (using node 127.0.0.1:7006)
M: 2ca8f1414270690c3e278b27c36b4fe7d9224afe 127.0.0.1:7006
   slots:0-1625,5461-8045,10923-13432 (6721 slots) master
   0 additional replica(s)
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:13433-16383 (2951 slots) master
   1 additional replica(s)
S: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 127.0.0.1:7000
   slots: (0 slots) slave
   replicates ad849b173e7a27892727f9cf3182f5c899d95296
M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:1626-5460 (3835 slots) master
   1 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:8046-10922 (2877 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Here the new node shows slots:0-1625,5461-8045,10923-13432 (6721 slots) master
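
The slot count in parentheses can be checked against the range list by hand: each range first-last contributes last - first + 1 slots. A small sketch of that arithmetic follows; the function name count_slots is illustrative only.

```shell
#!/bin/sh
# count_slots RANGES: count the slots in a comma-separated list such as "0-1625,5461-8045".
# A bare number (e.g. "12182") counts as a single slot.
count_slots() {
    total=0
    old_ifs=$IFS
    IFS=,
    for r in $1; do
        first=${r%-*}
        last=${r#*-}
        total=$(( total + last - first + 1 ))
    done
    IFS=$old_ifs
    echo "$total"
}

# Ranges reported for node 7006 above: 1626 + 2585 + 2510 slots.
count_slots 0-1625,5461-8045,10923-13432
```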

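4. Adding a slave node
4.1 Create a configuration for port 7007 in the same way as above, and start it.

4.2 Add slave node 7007
Use the check command to find the node ID of 7006 and pass it to --master-id.
The node is again added with add-node; note the extra options --slave and --master-id.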

/usr/local/bin/redis-cluster add-node --slave --master-id 2ca8f1414270690c3e278b27c36b4fe7d9224afe 127.0.0.1:7007 127.0.0.1:7000

>>> Send CLUSTER MEET to node 127.0.0.1:7007 to make it join the cluster.
Waiting for the cluster to join.
>>> Configure node as replica of 127.0.0.1:7006.
[OK] New node added correctly.

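Note: if --master-id is not specified, the new node is attached to a randomly chosen master among those with the fewest slaves. [You can verify this yourself.]

4.3 Inspect the new node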

/usr/local/bin/redis-cluster check 127.0.0.1:7007
>>> Performing Cluster Check (using node 127.0.0.1:7007)
S: 5c12f1ec94d9da03365d8fafb34efa4443c53b9b 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 2ca8f1414270690c3e278b27c36b4fe7d9224afe
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:13433-16383 (2951 slots) master
   1 additional replica(s)
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
M: 2ca8f1414270690c3e278b27c36b4fe7d9224afe 127.0.0.1:7006
   slots:0-1625,5461-8045,10923-13432 (6721 slots) master
   1 additional replica(s)
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:8046-10922 (2877 slots) master
   1 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
S: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 127.0.0.1:7000
   slots: (0 slots) slave
   replicates ad849b173e7a27892727f9cf3182f5c899d95296
M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:1626-5460 (3835 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

5. Removing a master node
5.1 The del-node command

redis-trib del-node 127.0.0.1:7000 <node-id>

5.2 Reshard node 7001
If the node is not empty, reshard first to move all slots on 7001 to other masters.

/usr/local/bin/redis-cluster reshard 127.0.0.1:7001
>>> Performing Cluster Check (using node 127.0.0.1:7001)
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:9751-10922 (1172 slots) master
   1 additional replica(s)
M: 2ca8f1414270690c3e278b27c36b4fe7d9224afe 127.0.0.1:7006
   slots:7609-8045,9660-9750,10923-13432 (3038 slots) master
   1 additional replica(s)
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:950-3598,5461-7295,8046-9525,13433-16383 (8915 slots) master
   1 additional replica(s)
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
S: 5c12f1ec94d9da03365d8fafb34efa4443c53b9b 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 2ca8f1414270690c3e278b27c36b4fe7d9224afe
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:0-949,3599-5460,7296-7608,9526-9659 (3259 slots) master
   1 additional replica(s)
S: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 127.0.0.1:7000
   slots: (0 slots) slave
   replicates ad849b173e7a27892727f9cf3182f5c899d95296
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1172
What is the receiving node ID? 2ca8f1414270690c3e278b27c36b4fe7d9224afe
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
Source node #2:done

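Note: this moves the 1172 slots on 7001 to 7006. The receiving node ID is therefore the ID of 7006, and in step 3 of the reshard (see 3.4 under adding a master node) the source is not all but the ID of 7001, followed by done.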
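
The number to enter at the first prompt (1172 here) is the slot count that check reports for the node being emptied. The sketch below pulls that number out of check output for a given address. The function name node_slot_count is made up for illustration, and the parsing assumes the output layout shown in these notes (a node line starting with M: or S:, followed by its slots line).

```shell
#!/bin/sh
# node_slot_count ADDR: read check output on stdin and print the slot count
# from the "slots:" line that follows the node line for ADDR.
node_slot_count() {
    awk -v addr="$1" '
        $1 ~ /^[MS]:$/ { hit = ($3 == addr) }      # node header: role, ID, address
        hit && /slots\)/ {                         # the slots line of that node
            for (i = 1; i <= NF; i++)
                if ($i ~ /^\([0-9]+$/) print substr($i, 2)
            hit = 0
        }'
}

# Lines copied from the reshard output above.
node_slot_count 127.0.0.1:7001 <<'EOF'
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:9751-10922 (1172 slots) master
   1 additional replica(s)
M: 2ca8f1414270690c3e278b27c36b4fe7d9224afe 127.0.0.1:7006
   slots:7609-8045,9660-9750,10923-13432 (3038 slots) master
   1 additional replica(s)
EOF
```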

5.3 Remove the node with del-node

/usr/local/bin/redis-cluster del-node 127.0.0.1:7001 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5

>>> Removing node 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 from cluster 127.0.0.1:7001
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.

Note: running del-node directly on a node that is not empty fails with an error:

/usr/local/bin/redis-cluster del-node 127.0.0.1:7001 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
>>> Removing node 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 from cluster 127.0.0.1:7001
[ERR] Node 127.0.0.1:7001 is not empty! Reshard data away and try again.

5.4 Verify

/usr/local/bin/redis-cluster check 127.0.0.1:7002

6. Removing a slave node
6.1 Remove node 7007
A slave holds no slots, so it can be removed directly.
The node-address argument of del-node can be any node in the cluster.

/usr/local/bin/redis-cluster del-node 127.0.0.1:7002 5c12f1ec94d9da03365d8fafb34efa4443c53b9b
>>> Removing node 5c12f1ec94d9da03365d8fafb34efa4443c53b9b from cluster 127.0.0.1:7002
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.

6.2 Verify

/usr/local/bin/redis-cluster check 127.0.0.1:7007
[ERR] Sorry, can't connect to node 127.0.0.1:7007

Part 3: Troubleshooting

I. Error installing the redis rubygem

gem install redis
ERROR:  Error installing redis:
	redis requires Ruby version >= 2.3.0.

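The Ruby version is too old and must be upgraded, here by installing Ruby from source.

II. Installing Ruby from source
If installing the redis rubygem fails because the Ruby version is too old, Ruby can be installed from source.
0. Download
Download a stable release from the official Ruby site.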

wget https://cache.ruby-lang.org/pub/ruby/2.6/ruby-2.6.5.tar.gz

1. Remove the installed version

yum erase ruby ruby-libs ruby-mode ruby-rdoc ruby-irb ruby-ri ruby-docs

2. Build and install

tar -zxvf ruby-2.6.5.tar.gz
cd ruby-2.6.5
./configure
make && make install

3. Verify

ruby
/usr/bin/env: ruby: No such file or directory

Default install paths:
/usr/local/bin/ruby
/usr/local/bin/gem

Use the full command path to check the version:

/usr/local/bin/ruby -v
ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-linux]

4. Add a symlink
When redis.rb is used, ruby is expected at /usr/bin/ruby by default, so a symlink is needed. A shell alias does not work.

ln -s /usr/local/bin/ruby /usr/bin/ruby

ruby -v
ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-linux]

5. Install redis.rb

/usr/local/bin/gem install redis
Fetching redis-4.1.3.gem
Successfully installed redis-4.1.3
Parsing documentation for redis-4.1.3
Installing ri documentation for redis-4.1.3
Done installing documentation for redis after 0 seconds
1 gem installed

III. Error during resharding
[ERR] Calling MIGRATE: ERR Syntax error, try CLIENT
0. Reproduction

/usr/local/bin/redis-cluster reshard 127.0.0.1:7000

...
...
Moving slot 12180 from 127.0.0.1:7002 to 127.0.0.1:7006:
Moving slot 12181 from 127.0.0.1:7002 to 127.0.0.1:7006:
Moving slot 12182 from 127.0.0.1:7002 to 127.0.0.1:7006:
[ERR] Calling MIGRATE: ERR Syntax error, try CLIENT (LIST | KILL | GETNAME | SETNAME | PAUSE | REPLY)

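1. Cause
According to a referenced article, this error occurs because the installed redis.rb version is too new; an older version needs to be installed.

redis.rb v4.0.1 downgrade to v3.3.3

2. Check the redis.rb version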

/usr/local/bin/gem list | grep redis
redis (4.1.3)

3. Uninstall the current version

/usr/local/bin/gem uninstall redis --version 4.1.3
Successfully uninstalled redis-4.1.3

4. Install the older version

/usr/local/bin/gem install redis -v 3.3.3
Fetching redis-3.3.3.gem
Successfully installed redis-3.3.3
Parsing documentation for redis-3.3.3
Installing ri documentation for redis-3.3.3
Done installing documentation for redis after 0 seconds
1 gem installed
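
Before resharding, it may be worth checking which major version of the redis gem is installed, since the notes above tie the MIGRATE error to the 4.x line. The sketch below extracts the major version from gem list output. The function name redis_gem_major is illustrative, and the sketch assumes the "redis (x.y.z)" line format shown above.

```shell
#!/bin/sh
# redis_gem_major: read `gem list` output on stdin and print the major version
# of the redis gem (prints nothing if the gem is not listed).
redis_gem_major() {
    sed -n 's/^redis (\([0-9][0-9]*\)\..*/\1/p'
}

# Example with the line shown earlier; per the notes above, 4 means a downgrade is needed.
echo 'redis (4.1.3)' | redis_gem_major
```

In practice the input would come from /usr/local/bin/gem list rather than echo.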

IV. Repairing an interrupted reshard
0. reshard reports *** Please fix your cluster problems before resharding

/usr/local/bin/redis-cluster reshard 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
S: 8f371f57ab8d76a1fab3f6140bd0561a801a5198 127.0.0.1:7000
   slots: (0 slots) slave
   replicates ad849b173e7a27892727f9cf3182f5c899d95296
M: 40dd2978f5377d2766e231f978f9241670727f20 127.0.0.1:7002
   slots:12182-16383 (4202 slots) master
   1 additional replica(s)
S: c5611bee8939281a6de120205df140258078a5b3 127.0.0.1:7004
   slots: (0 slots) slave
   replicates 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5
M: 5dd31c68bd21dcc5fa445679b418c9fdc3160fc5 127.0.0.1:7001
   slots:6827-10922 (4096 slots) master
   1 additional replica(s)
M: 2ca8f1414270690c3e278b27c36b4fe7d9224afe 127.0.0.1:7006
   slots:5461-6826,10923-12181 (2625 slots) master
   0 additional replica(s)
S: c88f6240d2430e4af9d9ab803047afeb331595f2 127.0.0.1:7005
   slots: (0 slots) slave
   replicates 40dd2978f5377d2766e231f978f9241670727f20
M: ad849b173e7a27892727f9cf3182f5c899d95296 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 127.0.0.1:7002 has slots in migrating state (12182).
[WARNING] Node 127.0.0.1:7006 has slots in importing state (12182).
[WARNING] The following slots are open: 12182
>>> Check slots coverage...
[OK] All 16384 slots covered.
*** Please fix your cluster problems before resharding

1. Fix

/usr/local/bin/redis-cluster fix 127.0.0.1:7000

2. Reshard again

/usr/local/bin/redis-cluster reshard 127.0.0.1:7000
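
To confirm that fix actually closed the slot before resharding again, the check output can be scanned for the open-slots warning. A minimal sketch, assuming the warning format shown in the output above; the function name open_slots is made up for illustration.

```shell
#!/bin/sh
# open_slots: read check output on stdin and print the slot list from the
# "[WARNING] The following slots are open: ..." line; prints nothing if there is none.
open_slots() {
    sed -n 's/^\[WARNING\] The following slots are open: //p'
}

# Warning lines copied from the failed reshard above.
open_slots <<'EOF'
[WARNING] Node 127.0.0.1:7002 has slots in migrating state (12182).
[WARNING] Node 127.0.0.1:7006 has slots in importing state (12182).
[WARNING] The following slots are open: 12182
EOF
```

Empty output after running fix suggests the cluster is ready for another reshard.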

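References

Redis集群的原理和搭建 (Principles and Setup of Redis Cluster)
深入剖析Redis系列(三) – Redis集群模式搭建与原理详解 (In-Depth Redis Series, Part 3: Redis Cluster Mode Setup and Principles Explained)
Redis Cluster在线迁移 (Redis Cluster Online Migration)
redis.rb 4.0.0 compatibility issue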
