Cluster Sharding

[Figure: Redis Cluster topology]

Every node can hold data, but the data lives on the master nodes; each replica is simply a copy of its master, kept for replication and disaster recovery.

For high availability, every master is paired with a replica. If a master goes down, its replica can be promoted in its place and keep serving requests, so the cluster as a whole does not fail.

M1, M2, M3 — master 1, master 2, master 3

S1, S2, S3 — replica 1, replica 2, replica 3

A cluster consists of multiple nodes; the official recommendation is at least 3 masters and 3 replicas. Masters are assigned hash slots to store data: each key is hashed to find a slot index, and there are 16384 slots in total.

Redis Cluster nodes broadcast state information to each other using the Gossip protocol.
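For intuition about why gossip converges quickly, here is a toy push-gossip simulation. This is purely an illustration, not Redis's actual protocol: the function `gossip_rounds`, the round structure, and the seed are all invented for the sketch.

```python
import random

def gossip_rounds(n_nodes: int = 6, seed: int = 42) -> int:
    """Toy push-gossip: each node starts knowing only itself; every round,
    each node pushes everything it knows to one random peer. Returns the
    number of rounds until every node knows about every other node."""
    random.seed(seed)
    knowledge = {i: {i} for i in range(n_nodes)}
    rounds = 0
    while any(len(known) < n_nodes for known in knowledge.values()):
        rounds += 1
        for i in range(n_nodes):
            peer = random.choice([j for j in range(n_nodes) if j != i])
            knowledge[peer] |= knowledge[i]  # merge state, like PING/PONG payloads
    return rounds

print(gossip_rounds())  # a full cluster view is reached after only a few rounds
```

Because every exchange merges whole state sets, information spreads roughly exponentially, which is why gossip scales to large clusters with modest traffic per node.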

Advantages

1. Scalability (data is spread across multiple nodes by slot)
2. High availability (if some nodes become unreachable, the cluster detects it via Gossip, holds a vote, and promotes a replica to master, keeping the cluster available)
3. Automatic failover
4. Decentralized (no central coordinator)

Disadvantages

1. Replication is asynchronous, so strong real-time consistency cannot be guaranteed (only eventual consistency).
2. Building and operating a cluster has a cost.

How data is partitioned

Redis Cluster uses distributed storage.

Benefits:

  1. High availability (if one node fails, the data on the other nodes is still usable)
  2. Easier maintenance (if one node fails, only that node needs attention)
  3. Balanced I/O (different requests map to different nodes, improving overall system performance)
  4. Better performance (each node handles only its own share of the data, speeding up lookups)

Partition algorithms

  1. Range partitioning: range queries within the same range do not cross nodes, which speeds them up. Used by MySQL and Oracle, for example.

  2. Modulo partitioning: `hash(key) % N`. Simple to implement, but scaling out or in forces a large amount of data to migrate.
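To see why modulo partitioning migrates so much data, the sketch below (illustrative Python, using CRC32 as a stand-in hash; the key names are made up) counts how many keys change nodes when a 4-node cluster grows to 5:

```python
from zlib import crc32

def node_of(key: str, n_nodes: int) -> int:
    """hash(key) % N placement, with CRC32 as the hash function."""
    return crc32(key.encode()) % n_nodes

keys = [f"user:{i}" for i in range(10_000)]
before = {k: node_of(k, 4) for k in keys}  # 4 nodes
after = {k: node_of(k, 5) for k in keys}   # scale out to 5 nodes
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys must migrate")  # roughly 1 - 1/5 = 80%
```

Almost every key ends up on a different node, because changing `N` reshuffles nearly all remainders; this is exactly the weakness the next two schemes address.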

  3. Consistent hashing

    [Figure: consistent hashing ring]

Adding or removing a node only affects its neighbors on the hash ring; the other nodes are untouched. However, with only a few nodes, a membership change still remaps a large share of the keys on the ring, so this scheme is not a good fit for distributed setups with a small number of nodes.
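A minimal consistent-hash ring with virtual nodes might look like the following. This is an illustrative sketch, not code from Redis; the `HashRing` class, the vnode count, and the node names are all invented.

```python
import bisect
from zlib import crc32

class HashRing:
    """Consistent hashing: each node is placed on the ring at many virtual
    points; a key belongs to the first node clockwise from its hash."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (crc32(f"{node}#{v}".encode()), node)
            for node in nodes for v in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        h = crc32(key.encode())
        idx = bisect.bisect(self._points, h) % len(self._ring)
        return self._ring[idx][1]

keys = [f"k{i}" for i in range(5_000)]
old = HashRing(["node-a", "node-b", "node-c"])
new = HashRing(["node-a", "node-b", "node-c", "node-d"])  # add one node
moved = sum(old.node_for(k) != new.node_for(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys moved")  # only about 1/4, not ~80%
```

Only the keys that fall into the new node's arcs move; everything else stays put. The virtual nodes smooth out the load, which is why few physical nodes still need many ring points.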

  4. Virtual slot partitioning

    Slots are distributed evenly across the nodes, which limits the blast radius of scaling out or in. The trade-off is that the node-to-slot mapping must be stored.

    [Figure: virtual slot partitioning]

Cluster setup

docker-compose.yml

```yaml
version: "3.0"
services:
  redis-1:
    image: redis:6.2
    container_name: redis-1
    ports:
      - "6379:6379"
      - "16379:16379" # cluster bus port: the data port + 10000 by default; must be open on every node
    volumes:
      - $PWD/redis-1/redis.conf:/redis.conf
      - $PWD/redis-1/data:/data
    command: redis-server /redis.conf
  redis-2:
    image: redis:6.2
    container_name: redis-2
    ports:
      - "6380:6380"
      - "16380:16380"
    volumes:
      - $PWD/redis-2/redis.conf:/redis.conf
      - $PWD/redis-2/data:/data
    command: redis-server /redis.conf
  redis-3:
    image: redis:6.2
    container_name: redis-3
    ports:
      - "6381:6381"
      - "16381:16381"
    volumes:
      - $PWD/redis-3/redis.conf:/redis.conf
      - $PWD/redis-3/data:/data
    command: redis-server /redis.conf
  redis-4:
    image: redis:6.2
    container_name: redis-4
    ports:
      - "6382:6382"
      - "16382:16382"
    volumes:
      - $PWD/redis-4/redis.conf:/redis.conf
      - $PWD/redis-4/data:/data
    command: redis-server /redis.conf
  redis-5:
    image: redis:6.2
    container_name: redis-5
    ports:
      - "6383:6383"
      - "16383:16383"
    volumes:
      - $PWD/redis-5/redis.conf:/redis.conf
      - $PWD/redis-5/data:/data
    command: redis-server /redis.conf
  redis-6:
    image: redis:6.2
    container_name: redis-6
    ports:
      - "6384:6384"
      - "16384:16384"
    volumes:
      - $PWD/redis-6/redis.conf:/redis.conf
      - $PWD/redis-6/data:/data
    command: redis-server /redis.conf
```

redis.conf

```
bind 0.0.0.0
port <port for this node>
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 15000
cluster-announce-ip <your host IP>
```

Create the Redis Cluster

```shell
redis-cli --cluster create --cluster-replicas 1 192.168.0.100:6379 192.168.0.100:6380 192.168.0.100:6381 192.168.0.100:6382 192.168.0.100:6383 192.168.0.100:6384
```

The output looks like the following:

[Figure: cluster creation output (1)]

[Figure: cluster creation output (2)]

Master 6379: slots 0-5460

Master 6380: slots 5461-10922

Master 6381: slots 10923-16383

Single-node vs. cluster benchmarks

Benchmark results are meaningless without the hardware context: comparing, say, an 8-core/16 GB server against a 1-core/1 GB one tells you nothing. Always run the different test cases on identically configured machines.

| # | Option | Description | Default |
| --- | --- | --- | --- |
| 1 | `-h` | Server hostname | 127.0.0.1 |
| 2 | `-p` | Server port | 6379 |
| 3 | `-s` | Server socket | |
| 4 | `-c` | Number of parallel connections | 50 |
| 5 | `-n` | Total number of requests | 10000 |
| 6 | `-d` | Data size of SET/GET values, in bytes | 2 |
| 7 | `-k` | 1 = keep alive, 0 = reconnect | 1 |
| 8 | `-r` | Use random keys for SET/GET/INCR, random values for SADD | |
| 9 | `-P` | Pipeline requests | 1 |
| 10 | `-q` | Quiet mode; show only the queries/sec value | |
| 11 | `--csv` | Output in CSV format | |
| 12 | `-l` (lowercase L) | Loop: run the tests forever | |
| 13 | `-t` | Run only a comma-separated list of test commands | |
| 14 | `-I` (uppercase i) | Idle mode: open N idle connections and wait | |

Single-node test

Start a single node.

```shell
docker exec -it my-redis /bin/bash

cd /usr/local/bin  # enter the bin directory

redis-benchmark -h 127.0.0.1 -p 6379 -t set,get -r 1000000 -n 100000 -c 1000
```

```
====== SET ======
100000 requests completed in 1.45 seconds
1000 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": yes
multi-thread: no

Summary:
throughput summary: 69108.50 requests per second
latency summary (msec):
avg min p50 p95 p99 max
10.145 3.712 9.999 13.903 18.255 30.447


====== GET ======
100000 requests completed in 0.98 seconds
1000 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": yes
multi-thread: no


Summary:
throughput summary: 101626.02 requests per second
latency summary (msec):
avg min p50 p95 p99 max
4.947 2.400 4.903 6.031 6.503 8.399
```

Cluster test

```shell
# enter the redis-1 container
docker exec -it redis-1 /bin/bash

# create the cluster
redis-cli --cluster create --cluster-replicas 1 192.168.0.101:6379 192.168.0.101:6380 192.168.0.101:6381 192.168.0.101:6382 192.168.0.101:6383 192.168.0.101:6384

# run the benchmark
redis-benchmark --cluster -h 192.168.0.101 -p 6379 -t set,get -r 1000000 -n 1000000 -c 1000
```

```
Cluster has 3 master nodes:

Master 0: a8e635a0b08c44af412bda2aa9a84a66e58e830d 192.168.0.101:6380
Master 1: 2106cc10e6b285d55e511194c9456bd066664564 192.168.0.101:6381
Master 2: bddf5e622812b6f10be30a21833eb4be91a63cda 192.168.0.101:6379

====== SET ======
1000000 requests completed in 53.49 seconds
1000 parallel clients
3 bytes payload
keep alive: 1
cluster mode: yes (3 masters)
node [0] configuration:
save: 3600 1 300 100 60 10000
appendonly: no
node [1] configuration:
save: 3600 1 300 100 60 10000
appendonly: no
node [2] configuration:
save: 3600 1 300 100 60 10000
appendonly: no
multi-thread: yes
threads: 3

Summary:
throughput summary: 18696.48 requests per second
latency summary (msec):
avg min p50 p95 p99 max
51.964 4.728 50.143 70.079 80.703 1341.439

====== GET ======
1000000 requests completed in 53.64 seconds
1000 parallel clients
3 bytes payload
keep alive: 1
cluster mode: yes (3 masters)
node [0] configuration:
save: 3600 1 300 100 60 10000
appendonly: no
node [1] configuration:
save: 3600 1 300 100 60 10000
appendonly: no
node [2] configuration:
save: 3600 1 300 100 60 10000
appendonly: no
multi-thread: yes
threads: 3

Summary:
throughput summary: 18643.50 requests per second
latency summary (msec):
avg min p50 p95 p99 max
51.083 8.928 49.215 71.615 83.583 1342.463
```

Single-node result

Summary: throughput summary: 101626.02 requests per second

Cluster result

Summary: throughput summary: 18643.50 requests per second

The single node outperforms the cluster here. Think about why that might be (hint: the benchmark client must hash keys across the masters, all six nodes share one machine's resources, and the two runs did not even use identical appendonly settings).

Redis Cluster internals

Redis did not adopt consistent hashing (the hash ring). On a ring, adding or removing a node affects only its neighbors, but with few nodes the data distribution is unbalanced and each membership change still moves a large share of the data; only with very many nodes (thousands or tens of thousands) does the impact shrink.

Instead, Redis uses hash slots:

  1. The hash function is not an ordinary hash; it is CRC16, a checksum algorithm.
  2. Slot space allocation (similar to partitioning a Windows disk into C, D, and E drives, each of a configurable size). Slot assignment and migration in Redis are manual operations; they never happen automatically. High availability in a Redis cluster therefore relies on master-replica replication and automatic failover between them. There are 16384 slots in total; each node in the cluster is responsible for only a portion of them, and the slot map is stored on every node.

Slot formula

```
slot = CRC16(key) mod 16384
```
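The formula can be reproduced in a few lines. Redis Cluster uses the CRC16-CCITT (XMODEM) variant and, when a key contains a hash tag like `{user1000}`, hashes only the tag so that related keys land in the same slot. The sketch below is illustrative; the helper names are my own.

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): poly 0x1021, init 0 — the variant Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Slot of a key, honoring the {hash tag} rule."""
    k = key.encode()
    start = k.find(b"{")
    if start != -1:
        end = k.find(b"}", start + 1)
        if end > start + 1:  # only a non-empty tag counts
            k = k[start + 1:end]
    return crc16(k) % 16384

# Keys sharing a hash tag map to the same slot, which is what makes
# multi-key operations possible in cluster mode:
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))  # True
```

You can cross-check any key against a live cluster with `CLUSTER KEYSLOT <key>`.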

Why only 16384 slots?

CRC16 produces a 16-bit value, i.e. 2^16 = 65536 possible values (0-65535). So why take the result mod 16384 rather than mod 65536?

  1. With 65536 slots, the slot bitmap carried in every heartbeat header would be 8 KB instead of 2 KB. Heartbeats are sent every second, so the larger payload would waste a lot of bandwidth.
  2. A Redis cluster is not meant to exceed about 1000 nodes; hardly any deployment needs more. With at most ~1000 nodes, 65536 slots are unnecessary.
  3. Better compression (compared with 65536 slots, the smaller 16384-slot bitmap naturally compresses better).
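The bandwidth argument in point 1 is simple arithmetic, since the heartbeat's slot bitmap carries one bit per slot:

```python
BITS_PER_BYTE = 8

for slots in (16384, 65536):
    print(f"{slots} slots -> {slots // BITS_PER_BYTE} byte bitmap")
# 16384 slots yield a 2048-byte (2 KB) bitmap; 65536 slots would need 8192 bytes (8 KB)
```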

Adding a master node

Adding a node means taking some slots from the existing nodes and assigning them to it.

  1. Create a redis-7 directory.

  2. Copy redis.conf into it and change the port to 6385.

  3. Add a redis-7 service to docker-compose.yml.

  4. Run docker-compose up.

    [Figure: redis-7 started]

  5. Enter redis-1:

```shell
docker exec -it redis-1 /bin/bash
```
  6. Check the cluster info:

```shell
redis-cli cluster info
```

```
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:346
cluster_stats_messages_pong_sent:373
cluster_stats_messages_sent:719
cluster_stats_messages_ping_received:373
cluster_stats_messages_pong_received:346
cluster_stats_messages_received:719
```
  7. Check the cluster nodes:

```shell
redis-cli cluster nodes
```

```
09d006ff1c193057e5b4969a3854dfa85929f382 192.168.0.101:6382@16382 slave 2106cc10e6b285d55e511194c9456bd066664564 0 1661952829000 3 connected
bf2a9306c2ebc01405d84d1ec8d6d85ffdcdef83 192.168.0.101:6384@16384 slave a8e635a0b08c44af412bda2aa9a84a66e58e830d 0 1661952829000 2 connected
d6033f945a5d8391fea832ef1ed98ec018062d35 192.168.0.101:6383@16383 slave bddf5e622812b6f10be30a21833eb4be91a63cda 0 1661952829000 1 connected
a8e635a0b08c44af412bda2aa9a84a66e58e830d 192.168.0.101:6380@16380 master - 0 1661952829926 2 connected 5461-10922
2106cc10e6b285d55e511194c9456bd066664564 192.168.0.101:6381@16381 master - 0 1661952830934 3 connected 10923-16383
bddf5e622812b6f10be30a21833eb4be91a63cda 192.168.0.101:6379@16379 myself,master - 0 1661952827000 1 connected 0-5460
```
  8. Add the master node. The syntax is:

```shell
redis-cli --cluster add-node new_host:new_port existing_host:existing_port --cluster-master-id <id of the existing node>

redis-cli --cluster add-node 192.168.0.101:6385 192.168.0.101:6381 --cluster-master-id 2106cc10e6b285d55e511194c9456bd066664564
```

    [Figure: redis-7 added successfully]

Check the nodes again

The node has been added, but it has no shard yet, i.e. no slots. We need to take some slots from the other nodes and assign them to it.

```shell
redis-cli --cluster reshard host:port --cluster-from <source node_id> --cluster-to <target node_id> --cluster-slots <number of slots to move> --cluster-yes

redis-cli --cluster reshard 172.18.0.7:6379 --cluster-from a4b3e461d95d09eb1e991d8ac910b9456db64af6 --cluster-to ac7673c66c50b547beee9df3d0781d60573fb701 --cluster-slots 2000 --cluster-yes
```

[Figure: reshard succeeded]

Check the node info again: were the slots assigned?

```shell
redis-cli cluster nodes
```

[Figure: node list after resharding]

Node 6385 now holds slots 0-1999, i.e. the 2000 slots we moved.

Adding a replica to form a master-replica pair

The syntax is:

```shell
redis-cli --cluster add-node new_host:new_port master_host:master_port --cluster-slave --cluster-master-id <id of the target master>

redis-cli --cluster add-node 192.168.0.101:6386 192.168.0.101:6385 --cluster-slave --cluster-master-id ac7673c66c50b547beee9df3d0781d60573fb701
```

[Figure: replica added successfully]

A replica can be deleted directly. For a master, first move its slots to other nodes and only then delete it, so no data is lost.

Deleting a replica

  1. Remove the node from docker-compose.yml.
  2. Remove its directory.

```shell
redis-cli -a 123456 --cluster del-node <host:port of the node to delete> <node ID>
```

Deleting a master

Deleting a master directly risks losing data.

  1. Remove the node from docker-compose.yml.
  2. Reshard first, moving the node's slots to other nodes; otherwise data will be lost.

```shell
redis-cli --cluster reshard host:port --cluster-from <source node_id> --cluster-to <target node_id> --cluster-slots <number of slots to move> --cluster-yes
```

  3. Run the del-node command:

```shell
redis-cli --cluster del-node <host:port of the node to delete> <node ID>
```

Automatic failover

Take one master (redis-3) and its replica (redis-4). Stop redis-3 and watch whether redis-4 becomes the new master.

```shell
docker stop redis-3
```

Check the redis-4 log:

```
docker logs redis-4
-------------------------------------------------
1:C 01 Sep 2022 01:19:25.330 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 01 Sep 2022 01:19:25.330 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 01 Sep 2022 01:19:25.330 # Configuration loaded
1:M 01 Sep 2022 01:19:25.331 * monotonic clock: POSIX clock_gettime
1:M 01 Sep 2022 01:19:25.342 * Node configuration loaded, I'm 09d006ff1c193057e5b4969a3854dfa85929f382
1:M 01 Sep 2022 01:19:25.342 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1:M 01 Sep 2022 01:19:25.342 * Running mode=cluster, port=6382.
1:M 01 Sep 2022 01:19:25.342 # Server initialized
1:M 01 Sep 2022 01:19:25.352 * Loading RDB produced by version 6.2.7
1:M 01 Sep 2022 01:19:25.352 * RDB age 10 seconds
1:M 01 Sep 2022 01:19:25.352 * RDB memory usage when created 31.16 Mb
1:M 01 Sep 2022 01:19:26.177 # Done loading RDB, keys loaded: 284061, keys expired: 0.
1:M 01 Sep 2022 01:19:26.177 * DB loaded from disk: 0.833 seconds
1:M 01 Sep 2022 01:19:26.177 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:M 01 Sep 2022 01:19:26.177 * Ready to accept connections
1:S 01 Sep 2022 01:19:26.178 * Discarding previously cached master state.
1:S 01 Sep 2022 01:19:26.178 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:S 01 Sep 2022 01:19:26.178 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 01:19:26.178 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 01:19:26.178 # Cluster state changed: ok
1:S 01 Sep 2022 01:19:26.179 * Non blocking connect for SYNC fired the event.
1:S 01 Sep 2022 01:19:26.184 * Master replied to PING, replication can continue...
1:S 01 Sep 2022 01:19:26.187 * Trying a partial resynchronization (request 3fd4e6de6eef88188b18361afbe8ed6a36b28437:8555).
1:S 01 Sep 2022 01:19:26.191 * Full resync from master: aa669c5247747f4721ab0f45e1ce8d76101f0748:0
1:S 01 Sep 2022 01:19:26.191 * Discarding previously cached master state.
1:S 01 Sep 2022 01:19:29.393 * MASTER <-> REPLICA sync: receiving 7953892 bytes from master to disk
1:S 01 Sep 2022 01:19:31.334 * MASTER <-> REPLICA sync: Flushing old data
1:S 01 Sep 2022 01:19:31.386 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 01 Sep 2022 01:19:31.395 * Loading RDB produced by version 6.2.7
1:S 01 Sep 2022 01:19:31.395 * RDB age 5 seconds
1:S 01 Sep 2022 01:19:31.395 * RDB memory usage when created 32.26 Mb
1:S 01 Sep 2022 01:19:31.716 # Done loading RDB, keys loaded: 284061, keys expired: 0.
1:S 01 Sep 2022 01:19:31.716 * MASTER <-> REPLICA sync: Finished with success
1:S 01 Sep 2022 03:05:31.415 * 1 changes in 3600 seconds. Saving...
1:S 01 Sep 2022 03:05:31.417 * Background saving started by pid 23
23:C 01 Sep 2022 03:05:33.797 * DB saved on disk
23:C 01 Sep 2022 03:05:33.797 * RDB: 0 MB of memory used by copy-on-write
1:S 01 Sep 2022 03:05:33.827 * Background saving terminated with success
1:S 01 Sep 2022 03:09:28.056 # Connection with master lost.
1:S 01 Sep 2022 03:09:28.056 * Caching the disconnected master state.
1:S 01 Sep 2022 03:09:28.056 * Reconnecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:28.056 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:28.057 * Non blocking connect for SYNC fired the event.
1:S 01 Sep 2022 03:09:28.058 # Error reply to PING from master: '-Reading from master: Operation now in progress'
1:S 01 Sep 2022 03:09:29.004 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:29.005 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:29.005 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:30.015 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:30.015 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:30.016 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:31.023 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:31.023 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:31.024 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:32.032 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:32.032 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:32.032 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:33.043 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:33.043 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:33.044 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:34.052 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:34.052 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:34.053 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:35.063 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:35.063 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:35.064 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:36.073 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:36.073 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:36.074 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:37.082 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:37.082 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:37.083 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:38.092 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:38.092 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:38.093 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:39.106 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:39.106 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:39.108 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:40.119 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:40.119 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:40.120 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:41.135 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:41.135 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:41.136 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:42.152 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:42.152 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:42.153 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:43.164 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:43.165 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:43.166 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:44.176 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:44.176 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:44.177 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:45.203 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:45.203 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:45.204 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:46.212 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:46.212 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:46.212 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:47.024 * FAIL message received from bf2a9306c2ebc01405d84d1ec8d6d85ffdcdef83 about 2106cc10e6b285d55e511194c9456bd066664564
1:S 01 Sep 2022 03:09:47.024 # Cluster state changed: fail
1:S 01 Sep 2022 03:09:47.125 # Start of election delayed for 961 milliseconds (rank #0, offset 9292).
1:S 01 Sep 2022 03:09:47.226 * Connecting to MASTER 192.168.0.101:6381
1:S 01 Sep 2022 03:09:47.226 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:09:47.226 # Error condition on socket for SYNC: Connection refused
1:S 01 Sep 2022 03:09:48.133 # Starting a failover election for epoch 8.
1:S 01 Sep 2022 03:09:48.147 # Failover election won: I'm the new master.
1:S 01 Sep 2022 03:09:48.147 # configEpoch set to 8 after successful failover
1:M 01 Sep 2022 03:09:48.147 * Discarding previously cached master state.
1:M 01 Sep 2022 03:09:48.147 # Setting secondary replication ID to aa669c5247747f4721ab0f45e1ce8d76101f0748, valid up to offset: 9293. New replication ID is 5eca7aa6bca9534e75e85bff2aca7211113eb1fb
1:M 01 Sep 2022 03:09:48.148 # Cluster state changed: ok
```

The final `Cluster state changed: ok` line shows that failover finished and the cluster is healthy again.

Check the cluster state again: 6381 is marked as fail.

[Figure: 6381 shown as failed]

Now restart 6381 (redis-3) and look at its log:

```
docker logs redis-3
------------------------------------------------------------------------------
1:C 01 Sep 2022 01:19:25.066 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 01 Sep 2022 01:19:25.066 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 01 Sep 2022 01:19:25.066 # Configuration loaded
1:M 01 Sep 2022 01:19:25.067 * monotonic clock: POSIX clock_gettime
1:M 01 Sep 2022 01:19:25.071 * Node configuration loaded, I'm 2106cc10e6b285d55e511194c9456bd066664564
1:M 01 Sep 2022 01:19:25.071 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1:M 01 Sep 2022 01:19:25.071 * Running mode=cluster, port=6381.
1:M 01 Sep 2022 01:19:25.071 # Server initialized
1:M 01 Sep 2022 01:19:25.075 * Loading RDB produced by version 6.2.7
1:M 01 Sep 2022 01:19:25.075 * RDB age 10 seconds
1:M 01 Sep 2022 01:19:25.075 * RDB memory usage when created 31.14 Mb
1:M 01 Sep 2022 01:19:25.566 # Done loading RDB, keys loaded: 284061, keys expired: 0.
1:M 01 Sep 2022 01:19:25.566 * DB loaded from disk: 0.495 seconds
1:M 01 Sep 2022 01:19:25.567 * Ready to accept connections
1:M 01 Sep 2022 01:19:26.189 * Replica 192.168.16.1:6382 asks for synchronization
1:M 01 Sep 2022 01:19:26.189 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '3fd4e6de6eef88188b18361afbe8ed6a36b28437', my replication IDs are '16089dc0096b9bb777ed1fa6fe1040e19f934812' and '0000000000000000000000000000000000000000')
1:M 01 Sep 2022 01:19:26.189 * Replication backlog created, my new replication IDs are 'aa669c5247747f4721ab0f45e1ce8d76101f0748' and '0000000000000000000000000000000000000000'
1:M 01 Sep 2022 01:19:26.189 * Starting BGSAVE for SYNC with target: disk
1:M 01 Sep 2022 01:19:26.190 * Background saving started by pid 23
1:M 01 Sep 2022 01:19:27.580 # Cluster state changed: ok
23:C 01 Sep 2022 01:19:29.300 * DB saved on disk
23:C 01 Sep 2022 01:19:29.301 * RDB: 0 MB of memory used by copy-on-write
1:M 01 Sep 2022 01:19:29.389 * Background saving terminated with success
1:M 01 Sep 2022 01:19:30.089 * Synchronization with replica 192.168.16.1:6382 succeeded
1:M 01 Sep 2022 03:05:31.415 * 1 changes in 3600 seconds. Saving...
1:M 01 Sep 2022 03:05:31.416 * Background saving started by pid 24
24:C 01 Sep 2022 03:05:33.804 * DB saved on disk
24:C 01 Sep 2022 03:05:33.804 * RDB: 0 MB of memory used by copy-on-write
1:M 01 Sep 2022 03:05:33.827 * Background saving terminated with success
1:signal-handler (1662001765) Received SIGTERM scheduling shutdown...
1:M 01 Sep 2022 03:09:25.990 # User requested shutdown...
1:M 01 Sep 2022 03:09:25.990 * Saving the final RDB snapshot before exiting.
1:M 01 Sep 2022 03:09:28.046 * DB saved on disk
1:M 01 Sep 2022 03:09:28.046 # Redis is now ready to exit, bye bye...
1:C 01 Sep 2022 03:18:18.736 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 01 Sep 2022 03:18:18.736 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 01 Sep 2022 03:18:18.736 # Configuration loaded
1:M 01 Sep 2022 03:18:18.736 * monotonic clock: POSIX clock_gettime
1:M 01 Sep 2022 03:18:18.740 * Node configuration loaded, I'm 2106cc10e6b285d55e511194c9456bd066664564
1:M 01 Sep 2022 03:18:18.740 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1:M 01 Sep 2022 03:18:18.740 * Running mode=cluster, port=6381.
1:M 01 Sep 2022 03:18:18.740 # Server initialized
1:M 01 Sep 2022 03:18:18.745 * Loading RDB produced by version 6.2.7
1:M 01 Sep 2022 03:18:18.745 * RDB age 533 seconds
1:M 01 Sep 2022 03:18:18.745 * RDB memory usage when created 32.28 Mb
1:M 01 Sep 2022 03:18:19.181 # Done loading RDB, keys loaded: 284065, keys expired: 0.
1:M 01 Sep 2022 03:18:19.181 * DB loaded from disk: 0.440 seconds
1:M 01 Sep 2022 03:18:19.181 * Ready to accept connections
1:M 01 Sep 2022 03:18:19.183 # Configuration change detected. Reconfiguring myself as a replica of 09d006ff1c193057e5b4969a3854dfa85929f382
1:S 01 Sep 2022 03:18:19.183 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:S 01 Sep 2022 03:18:19.183 * Connecting to MASTER 192.168.0.101:6382
1:S 01 Sep 2022 03:18:19.184 * MASTER <-> REPLICA sync started
1:S 01 Sep 2022 03:18:19.185 # Cluster state changed: ok
1:S 01 Sep 2022 03:18:19.190 * Non blocking connect for SYNC fired the event.
1:S 01 Sep 2022 03:18:19.203 * Master replied to PING, replication can continue...
1:S 01 Sep 2022 03:18:19.205 * Trying a partial resynchronization (request 23b6898179e974baea9642acefdef8aecbc4f201:1).
1:S 01 Sep 2022 03:18:19.207 * Full resync from master: 5eca7aa6bca9534e75e85bff2aca7211113eb1fb:9292
1:S 01 Sep 2022 03:18:19.207 * Discarding previously cached master state.
1:S 01 Sep 2022 03:18:21.477 * MASTER <-> REPLICA sync: receiving 7953931 bytes from master to disk
1:S 01 Sep 2022 03:18:22.442 * MASTER <-> REPLICA sync: Flushing old data
1:S 01 Sep 2022 03:18:22.493 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 01 Sep 2022 03:18:22.503 * Loading RDB produced by version 6.2.7
1:S 01 Sep 2022 03:18:22.503 * RDB age 3 seconds
1:S 01 Sep 2022 03:18:22.503 * RDB memory usage when created 32.30 Mb
1:S 01 Sep 2022 03:18:22.754 # Done loading RDB, keys loaded: 284065, keys expired: 0.
1:S 01 Sep 2022 03:18:22.754 * MASTER <-> REPLICA sync: Finished with success
```

Check the cluster state once more: 6381 has become a replica.

[Figure: 6381 is now a replica]

You can also try querying some data yourself.

Manual failover

The master/replica switch we just walked through takes many steps and is error-prone. There is a simpler way.

Run this on the replica:

```shell
redis-cli -c -h <host> -p <port> cluster failover
```

The replica is promoted to master and the old master is demoted to replica. Very convenient; check the corresponding logs to watch it happen.

