Setting Up a Kafka Cluster with Docker

Preface

Tutorial Overview

This article walks through quickly setting up a Kafka cluster on a single machine using Docker and Docker Compose.

Notes

  • The Kafka cluster in this guide depends on ZooKeeper, so the ZooKeeper cluster must be set up first.
  • Worth noting: starting with version 2.8.0, Kafka ships its own Raft-based consensus implementation (KRaft), which means a Kafka cluster can run without ZooKeeper.

Image Addresses

  • ZooKeeper (official image): https://hub.docker.com/_/zookeeper
  • Kafka (Bitnami image): https://hub.docker.com/r/bitnami/kafka

ZooKeeper Cluster Setup

Cluster Plan

Node Name     Node ID   Listen Port   Mapped Port   Version
zookeeper01   1         2181          2181          3.8.4
zookeeper02   2         2181          2182          3.8.4
zookeeper03   3         2181          2183          3.8.4

Preparation

  • On the host, create the data persistence directories for ZooKeeper node one:

sudo mkdir -p /usr/local/zookeeper/zookeeper01/data
sudo mkdir -p /usr/local/zookeeper/zookeeper01/datalog

  • On the host, create the data persistence directories for ZooKeeper node two:

sudo mkdir -p /usr/local/zookeeper/zookeeper02/data
sudo mkdir -p /usr/local/zookeeper/zookeeper02/datalog

  • On the host, create the data persistence directories for ZooKeeper node three:

sudo mkdir -p /usr/local/zookeeper/zookeeper03/data
sudo mkdir -p /usr/local/zookeeper/zookeeper03/datalog

Volume Notes

  • The official ZooKeeper image declares data volumes at /data and /datalog, which store the snapshots of ZooKeeper's in-memory database and the transaction logs of database updates, respectively.
  • Pay attention to which disk holds the transaction logs: a dedicated transaction-log device is key to sustained good performance, and placing the logs on a busy disk will hurt performance.
  • To supply your own ZooKeeper configuration, mount a zoo.cfg file from the host to /conf/zoo.cfg inside the container (see the sketch below).
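
For example, a minimal sketch of such a mount in Docker Compose (the host path below is a hypothetical location, not one of the directories created above):

# docker-compose.yml (excerpt): mount a custom zoo.cfg read-only into the container
services:
  zookeeper01:
    volumes:
      - /usr/local/zookeeper/zookeeper01/conf/zoo.cfg:/conf/zoo.cfg:ro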

Creating the Containers

  • Create a docker-compose.yml file with the following content:
version: '3.5'

services:
  zookeeper01:
    image: zookeeper:3.8.4
    container_name: zookeeper01
    restart: always
    hostname: zookeeper01
    ports:
      - 2181:2181
    environment:
      TZ: Asia/Shanghai
      ZOO_MY_ID: 1
      ZOO_PORT: 2181
      ZOO_4LW_COMMANDS_WHITELIST: ruok
      ZOO_SERVERS: server.1=zookeeper01:2888:3888;2181 server.2=zookeeper02:2888:3888;2181 server.3=zookeeper03:2888:3888;2181
    healthcheck:
      test: ["CMD", "sh", "-c", "echo ruok | nc localhost 2181 | grep imok"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s
    volumes:
      - /usr/local/zookeeper/zookeeper01/data:/data
      - /usr/local/zookeeper/zookeeper01/datalog:/datalog
    networks:
      - distributed-network

  zookeeper02:
    image: zookeeper:3.8.4
    container_name: zookeeper02
    restart: always
    hostname: zookeeper02
    ports:
      - 2182:2181
    environment:
      TZ: Asia/Shanghai
      ZOO_MY_ID: 2
      ZOO_PORT: 2181
      ZOO_4LW_COMMANDS_WHITELIST: ruok
      ZOO_SERVERS: server.1=zookeeper01:2888:3888;2181 server.2=zookeeper02:2888:3888;2181 server.3=zookeeper03:2888:3888;2181
    healthcheck:
      test: ["CMD", "sh", "-c", "echo ruok | nc localhost 2181 | grep imok"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s
    volumes:
      - /usr/local/zookeeper/zookeeper02/data:/data
      - /usr/local/zookeeper/zookeeper02/datalog:/datalog
    networks:
      - distributed-network

  zookeeper03:
    image: zookeeper:3.8.4
    container_name: zookeeper03
    restart: always
    hostname: zookeeper03
    ports:
      - 2183:2181
    environment:
      TZ: Asia/Shanghai
      ZOO_MY_ID: 3
      ZOO_PORT: 2181
      ZOO_4LW_COMMANDS_WHITELIST: ruok
      ZOO_SERVERS: server.1=zookeeper01:2888:3888;2181 server.2=zookeeper02:2888:3888;2181 server.3=zookeeper03:2888:3888;2181
    healthcheck:
      test: ["CMD", "sh", "-c", "echo ruok | nc localhost 2181 | grep imok"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s
    volumes:
      - /usr/local/zookeeper/zookeeper03/data:/data
      - /usr/local/zookeeper/zookeeper03/datalog:/datalog
    networks:
      - distributed-network

networks:
  distributed-network:
    driver: bridge
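
Before starting anything, it is worth validating the file; docker-compose config parses the YAML and prints the resolved configuration, or an error if the file is malformed:

sudo docker-compose config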

Configuration Notes

  • ZOO_MY_ID: specifies the ZooKeeper node's ID, which must be unique within the cluster and between 1 and 255. ZOO_MY_ID: 1 corresponds to server.1 in server.1=zookeeper01:2888:3888;2181.
  • ZOO_4LW_COMMANDS_WHITELIST: ruok: enables ZooKeeper's ruok "four letter word" command (4LW, Four Letter Words), used for simple health checks. Some ZooKeeper configurations disable these commands by default; the ones commonly used for monitoring are mntr, conf, ruok, and stat (a probe example follows this list).
  • In server.1=zookeeper01:2888:3888;2181, zookeeper01 is the name of the ZooKeeper service defined in Docker Compose. Other services (such as Kafka) use it as the hostname to reach ZooKeeper, and Docker Compose automatically resolves it to the ZooKeeper container's IP address.
  • Create and start the ZooKeeper containers:

sudo docker-compose up -d
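
With the containers up, you can run the whitelisted 4LW probe by hand; this is the same check the healthcheck entries above perform, and a healthy node answers imok:

# Ask node one "are you ok?"; a healthy server replies "imok"
sudo docker exec zookeeper01 sh -c 'echo ruok | nc localhost 2181'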

Testing the Containers

Checking Container Status

  • Check the status of all ZooKeeper containers:

sudo docker ps -a

82d40df530ec   zookeeper:3.8.4   "/docker-entrypoint.…"   About a minute ago   Up About a minute (healthy)   2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp, :::2181->2181/tcp, 8080/tcp   zookeeper01
4ae126d4ea36   zookeeper:3.8.4   "/docker-entrypoint.…"   About a minute ago   Up About a minute (healthy)   2888/tcp, 3888/tcp, 8080/tcp, 0.0.0.0:2182->2181/tcp, :::2182->2181/tcp   zookeeper02
c70d199dd8ce   zookeeper:3.8.4   "/docker-entrypoint.…"   About a minute ago   Up About a minute (healthy)   2888/tcp, 3888/tcp, 8080/tcp, 0.0.0.0:2183->2181/tcp, :::2183->2181/tcp   zookeeper03

  • If a ZooKeeper container fails to start, inspect its startup log to troubleshoot:

sudo docker logs -f --tail 100 zookeeper01

Checking Cluster Status

  • Check the cluster status of ZooKeeper node one (a loop covering all three checks follows this list):

sudo docker exec -it zookeeper01 zkServer.sh status

ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower

  • Check the cluster status of ZooKeeper node two:

sudo docker exec -it zookeeper02 zkServer.sh status

ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower

  • Check the cluster status of ZooKeeper node three:

sudo docker exec -it zookeeper03 zkServer.sh status

ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
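
As a shortcut, the three status checks above can be collapsed into one shell loop on the host:

# Print the role (leader/follower) of every ZooKeeper node
for node in zookeeper01 zookeeper02 zookeeper03; do
  echo "== $node =="
  sudo docker exec "$node" zkServer.sh status
done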

Data Synchronization Test

  • Connect a client to ZooKeeper node one and create a znode:

# Connect a client to node one
sudo docker exec -it zookeeper01 zkCli.sh -server localhost:2181

# Create a znode
[zk: localhost:2181 (CONNECTED) 0] create /test ""
Created /test

# List the znodes under the root
[zk: localhost:2181 (CONNECTED) 1] ls /
[test, zookeeper]

  • Connect a client to any other node (node two, for example) and you will find that /test exists there as well: an operation performed on any node of the cluster is synchronized to all the others.

# Connect a client to node two
sudo docker exec -it zookeeper02 zkCli.sh -server localhost:2181

# List the znodes under the root
[zk: localhost:2181 (CONNECTED) 0] ls /
[test, zookeeper]

Leader Re-election Test

  • Stop the Leader node (node three here):

sudo docker stop zookeeper03

  • After the session timeout elapses, check the status of the remaining nodes (node one and node two): node two has been elected the new Leader.

sudo docker exec -it zookeeper01 zkServer.sh status

ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower

sudo docker exec -it zookeeper02 zkServer.sh status

ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
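
To finish the test, bring node three back up. Since a leader already exists, it should rejoin the ensemble as a follower rather than reclaim leadership (expected result noted as a comment; verify it in your own output):

# Restart the stopped node
sudo docker start zookeeper03

# Once it has resynced with the ensemble, its status should report "Mode: follower"
sudo docker exec -it zookeeper03 zkServer.sh status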

Kafka Cluster Setup

Cluster Plan

Node Name   Node ID   Listen Port   Mapped Port   Version
kafka01     1         9093          9093          3.9.0
kafka02     2         9094          9094          3.9.0
kafka03     3         9095          9095          3.9.0

Preparation

  • On the host, create the persistence directory for Kafka node one:

sudo mkdir -p /usr/local/kafka/kafka01

  • On the host, create the persistence directory for Kafka node two:

sudo mkdir -p /usr/local/kafka/kafka02

  • On the host, create the persistence directory for Kafka node three:

sudo mkdir -p /usr/local/kafka/kafka03

  • Grant permissions on the Kafka persistence directories:

sudo chmod -R 777 /usr/local/kafka

Important Note

The Kafka image maintained by Bitnami is a non-root container: mounted files and directories must be accessible to UID 1001, or the Kafka container will fail to start. (A narrower alternative to chmod 777 is sketched below.)
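
If you would rather not make the directories world-writable, a tighter sketch is to hand ownership to UID 1001 directly (assuming nothing else on the host needs to write to these paths):

# Give the Bitnami non-root user (UID 1001) ownership instead of 777 permissions
sudo chown -R 1001:1001 /usr/local/kafka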

Creating the Containers

  • Add the Kafka cluster configuration to the docker-compose.yml file used above to create the ZooKeeper cluster. The complete file now looks like this:
version: '3.5'

services:
  zookeeper01:
    image: zookeeper:3.8.4
    container_name: zookeeper01
    restart: always
    hostname: zookeeper01
    ports:
      - 2181:2181
    environment:
      TZ: Asia/Shanghai
      ZOO_MY_ID: 1
      ZOO_PORT: 2181
      ZOO_4LW_COMMANDS_WHITELIST: ruok
      ZOO_SERVERS: server.1=zookeeper01:2888:3888;2181 server.2=zookeeper02:2888:3888;2181 server.3=zookeeper03:2888:3888;2181
    healthcheck:
      test: ["CMD", "sh", "-c", "echo ruok | nc localhost 2181 | grep imok"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s
    volumes:
      - /usr/local/zookeeper/zookeeper01/data:/data
      - /usr/local/zookeeper/zookeeper01/datalog:/datalog
    networks:
      - distributed-network

  zookeeper02:
    image: zookeeper:3.8.4
    container_name: zookeeper02
    restart: always
    hostname: zookeeper02
    ports:
      - 2182:2181
    environment:
      TZ: Asia/Shanghai
      ZOO_MY_ID: 2
      ZOO_PORT: 2181
      ZOO_4LW_COMMANDS_WHITELIST: ruok
      ZOO_SERVERS: server.1=zookeeper01:2888:3888;2181 server.2=zookeeper02:2888:3888;2181 server.3=zookeeper03:2888:3888;2181
    healthcheck:
      test: ["CMD", "sh", "-c", "echo ruok | nc localhost 2181 | grep imok"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s
    volumes:
      - /usr/local/zookeeper/zookeeper02/data:/data
      - /usr/local/zookeeper/zookeeper02/datalog:/datalog
    networks:
      - distributed-network

  zookeeper03:
    image: zookeeper:3.8.4
    container_name: zookeeper03
    restart: always
    hostname: zookeeper03
    ports:
      - 2183:2181
    environment:
      TZ: Asia/Shanghai
      ZOO_MY_ID: 3
      ZOO_PORT: 2181
      ZOO_4LW_COMMANDS_WHITELIST: ruok
      ZOO_SERVERS: server.1=zookeeper01:2888:3888;2181 server.2=zookeeper02:2888:3888;2181 server.3=zookeeper03:2888:3888;2181
    healthcheck:
      test: ["CMD", "sh", "-c", "echo ruok | nc localhost 2181 | grep imok"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s
    volumes:
      - /usr/local/zookeeper/zookeeper03/data:/data
      - /usr/local/zookeeper/zookeeper03/datalog:/datalog
    networks:
      - distributed-network

  kafka01:
    image: bitnami/kafka:3.9.0
    container_name: kafka01
    restart: always
    hostname: kafka01
    ports:
      - 9093:9093
    environment:
      TZ: Asia/Shanghai
      KAFKA_CFG_NODE_ID: 1
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9093
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://192.168.56.112:9093
      KAFKA_CFG_ZOOKEEPER_CONNECT: zookeeper01:2181,zookeeper02:2181,zookeeper03:2181/kafka
      ALLOW_PLAINTEXT_LISTENER: "yes"
    healthcheck:
      test: ["CMD", "kafka-topics.sh", "--bootstrap-server", "kafka01:9093,kafka02:9094,kafka03:9095", "--list"]
      interval: 30s
      timeout: 15s
      retries: 5
      start_period: 30s
    volumes:
      - /usr/local/kafka/kafka01:/bitnami/kafka
    depends_on:
      zookeeper01:
        condition: service_healthy
      zookeeper02:
        condition: service_healthy
      zookeeper03:
        condition: service_healthy
    networks:
      - distributed-network

  kafka02:
    image: bitnami/kafka:3.9.0
    container_name: kafka02
    restart: always
    hostname: kafka02
    ports:
      - 9094:9094
    environment:
      TZ: Asia/Shanghai
      KAFKA_CFG_NODE_ID: 2
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9094
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://192.168.56.112:9094
      KAFKA_CFG_ZOOKEEPER_CONNECT: zookeeper01:2181,zookeeper02:2181,zookeeper03:2181/kafka
      ALLOW_PLAINTEXT_LISTENER: "yes"
    healthcheck:
      test: ["CMD", "kafka-topics.sh", "--bootstrap-server", "kafka01:9093,kafka02:9094,kafka03:9095", "--list"]
      interval: 30s
      timeout: 15s
      retries: 5
      start_period: 30s
    volumes:
      - /usr/local/kafka/kafka02:/bitnami/kafka
    depends_on:
      zookeeper01:
        condition: service_healthy
      zookeeper02:
        condition: service_healthy
      zookeeper03:
        condition: service_healthy
    networks:
      - distributed-network

  kafka03:
    image: bitnami/kafka:3.9.0
    container_name: kafka03
    restart: always
    hostname: kafka03
    ports:
      - 9095:9095
    environment:
      TZ: Asia/Shanghai
      KAFKA_CFG_NODE_ID: 3
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9095
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://192.168.56.112:9095
      KAFKA_CFG_ZOOKEEPER_CONNECT: zookeeper01:2181,zookeeper02:2181,zookeeper03:2181/kafka
      ALLOW_PLAINTEXT_LISTENER: "yes"
    healthcheck:
      test: ["CMD", "kafka-topics.sh", "--bootstrap-server", "kafka01:9093,kafka02:9094,kafka03:9095", "--list"]
      interval: 30s
      timeout: 15s
      retries: 5
      start_period: 30s
    volumes:
      - /usr/local/kafka/kafka03:/bitnami/kafka
    depends_on:
      zookeeper01:
        condition: service_healthy
      zookeeper02:
        condition: service_healthy
      zookeeper03:
        condition: service_healthy
    networks:
      - distributed-network

networks:
  distributed-network:
    driver: bridge

Key Parameter Notes

  • KAFKA_CFG_NODE_ID: 1: specifies the Kafka node's ID, which must be unique within the cluster.
  • KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://192.168.56.112:9095: here 192.168.56.112 is the host machine's IP address (or public IP address). If this is misconfigured, external Kafka clients will not be able to connect to the Kafka brokers running inside the Docker containers.
  • This guide uses the Kafka image maintained by Bitnami: any environment variable beginning with KAFKA_CFG_ is mapped to the corresponding Apache Kafka configuration property. For example, KAFKA_CFG_BACKGROUND_THREADS sets background.threads, and KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE sets auto.create.topics.enable (see the excerpt after this list).
  • Create and start the ZooKeeper and Kafka containers:

sudo docker-compose up -d
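
As an illustration of that mapping, the excerpt below (the values are chosen purely for the example) disables automatic topic creation and raises the broker's background thread count:

# docker-compose.yml (excerpt): each KAFKA_CFG_* variable maps to a server.properties key
environment:
  KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE: "false"   # auto.create.topics.enable=false
  KAFKA_CFG_BACKGROUND_THREADS: 12               # background.threads=12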

Important Notes

  • When shutting the cluster down, always wait until every Kafka broker process has fully stopped before stopping the ZooKeeper cluster. ZooKeeper holds the Kafka cluster's metadata; if ZooKeeper is stopped first, the Kafka brokers lose the information they need to shut down cleanly, and you will end up killing the Kafka processes by hand.
  • If the Kafka containers need JMX enabled (for monitoring), the healthcheck blocks for the Kafka containers must be removed from the docker-compose.yml above, because the health check would never pass.
  • This is because once JMX is enabled, commands such as kafka-topics.sh --bootstrap-server xxx --list can no longer be run inside the Kafka container: they fail with a JMX port conflict (a workaround is sketched below).
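
The conflict arises because Kafka's CLI scripts inherit the JMX_PORT environment variable and try to bind the same port the broker process already holds. A common workaround, offered here as a suggestion to verify rather than part of the original setup, is to clear the variable for a single invocation:

# Unset JMX_PORT for this one command so the CLI tool does not bind the broker's JMX port
sudo docker exec -it kafka01 sh -c 'JMX_PORT= kafka-topics.sh --bootstrap-server kafka01:9093 --list'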

Testing the Containers

Checking Container Status

  • Check the status of all Kafka containers:

sudo docker ps -a

94d9168a3273   bitnami/kafka:3.9.0   "/opt/bitnami/script…"   23 minutes ago   Up 23 minutes (healthy)   0.0.0.0:9096->9092/tcp, :::9096->9092/tcp   kafka01
1b370f052d87   bitnami/kafka:3.9.0   "/opt/bitnami/script…"   23 minutes ago   Up 23 minutes (healthy)   0.0.0.0:9097->9092/tcp, :::9097->9092/tcp   kafka02
2f448e10f746   bitnami/kafka:3.9.0   "/opt/bitnami/script…"   23 minutes ago   Up 23 minutes (healthy)   0.0.0.0:9098->9092/tcp, :::9098->9092/tcp   kafka03
571f83ee486c   zookeeper:3.8.4       "/docker-entrypoint.…"   23 minutes ago   Up 23 minutes (healthy)   2888/tcp, 3888/tcp, 8080/tcp, 0.0.0.0:2182->2181/tcp, :::2182->2181/tcp   zookeeper02
a175680d04b2   zookeeper:3.8.4       "/docker-entrypoint.…"   23 minutes ago   Up 23 minutes (healthy)   2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp, :::2181->2181/tcp, 8080/tcp   zookeeper01
6cd6fa4e293a   zookeeper:3.8.4       "/docker-entrypoint.…"   23 minutes ago   Up 23 minutes (healthy)   2888/tcp, 3888/tcp, 8080/tcp, 0.0.0.0:2183->2181/tcp, :::2183->2181/tcp   zookeeper03

  • If a Kafka container fails to start, inspect its startup log to troubleshoot:

# View the log of Kafka node one
sudo docker logs -f --tail 100 kafka01

Important Notes

  • (1) The Docker Compose file above defines a health check for the Kafka containers, so after running docker ps -a you may need to wait a while before the Kafka containers' health status turns healthy.
  • (2) If a Kafka container fails to start, first troubleshoot from the container's startup log, then remove the containers and wipe all ZooKeeper and Kafka data files mounted on the host, and finally recreate and restart the ZooKeeper and Kafka containers.

Checking the ZooKeeper Connection

  • Check whether each node of the Kafka cluster can connect to the ZooKeeper cluster:

# Connect a client to ZooKeeper node one
sudo docker exec -it zookeeper01 zkCli.sh -server localhost:2181

# List the Kafka brokers registered in ZooKeeper
[zk: localhost:2181 (CONNECTED) 0] ls /kafka/brokers/ids

# Disconnect the ZooKeeper client
[zk: localhost:2181 (CONNECTED) 1] quit

  • If every Kafka node can reach the ZooKeeper cluster, listing the Kafka broker registry in ZooKeeper shows all of the cluster's Broker IDs, as shown below (a follow-up query on an individual broker appears after this list):

[1, 2, 3]
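
Each ID under /kafka/brokers/ids is an ephemeral znode holding that broker's registration metadata (host, port, advertised endpoints), so you can also inspect an individual broker from the same zkCli.sh session:

# Inside zkCli.sh: show the metadata registered by broker 1
[zk: localhost:2181 (CONNECTED) 0] get /kafka/brokers/ids/1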

Testing Message Production and Consumption

  • Create a topic:

# Enter the container of Kafka node one
sudo docker exec -it kafka01 /bin/bash

# Connect to Kafka node two and create a topic
kafka-topics.sh --bootstrap-server kafka02:9094 --create --topic test --partitions 3 --replication-factor 3

  • Start a consumer:

# Enter the container of Kafka node two
sudo docker exec -it kafka02 /bin/bash

# Connect to Kafka node three and start a console consumer
kafka-console-consumer.sh --bootstrap-server kafka03:9095 --topic test --from-beginning

  • Start a producer:

# Enter the container of Kafka node three
sudo docker exec -it kafka03 /bin/bash

# Connect to Kafka node one and start a console producer
kafka-console-producer.sh --bootstrap-server kafka01:9093 --topic test

# Send a message by hand
>hello world

  • If the console consumer displays the received message, both message production and consumption work correctly across the Kafka cluster (a verification of the topic's replication follows this list).
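
To confirm the replication settings took effect, describe the topic from any broker container; each of the three partitions should list three replicas spread across brokers 1, 2, and 3:

# Show partition leaders, replicas, and ISR for the test topic
kafka-topics.sh --bootstrap-server kafka01:9093 --describe --topic test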

Testing from Outside the Containers

  • To test from outside the Kafka containers (for example, on the host itself) and verify that an external Kafka client (such as a Java client) can reach the Kafka cluster inside the containers, run the following commands on the host or another machine, where 192.168.56.112 is the host's IP address. (If a connection fails, see the reachability probe after this list.)
  • Create a topic: kafka-topics.sh --bootstrap-server 192.168.56.112:9093 --create --topic test --partitions 3 --replication-factor 3
  • Consume messages: kafka-console-consumer.sh --bootstrap-server 192.168.56.112:9094 --topic test
  • Produce messages: kafka-console-producer.sh --bootstrap-server 192.168.56.112:9095 --topic test
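
If an external client cannot connect, rule out basic network reachability before revisiting KAFKA_CFG_ADVERTISED_LISTENERS; a quick probe from the client machine (assuming nc is installed there):

# Check that each published broker port is reachable from outside
for port in 9093 9094 9095; do nc -zv 192.168.56.112 "$port"; done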
