System structure:
- 10.10.1.86: Kubernetes master node
- 10.10.1.87: Kubernetes worker node 1; keepalived MASTER node
- 10.10.1.88: Kubernetes worker node 2; keepalived BACKUP node
- 10.10.1.90: VIP, load balanced to .87 & .88; implemented by keepalived
This Kubernetes cluster is a development environment used to test collecting netflow logs.
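For reference, the node roles can be confirmed from the master; a minimal check (the node names are whatever the cluster assigned, but the INTERNAL-IP column should match the list above):

# kubectl get nodes -o wide
# expect 10.10.1.86 as the control-plane node and 10.10.1.87 / 10.10.1.88 as workers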
What I want to achieve:
- All routers/switches send their netflow logs to .90 first.
- keepalived then load balances the traffic (lb_kind: NAT) to .87 & .88, the two Kubernetes workers.
- A NodePort service captures that traffic into the Kubernetes cluster and performs the rest of the data parsing.
- Like this:
                                          | {OS Network} | {Kubernetes Network}
                              K8s Worker -> filebeat -> logstash (deployments)
                             /
<data> -> [VIP] load balance
                             \
                              K8s Worker -> filebeat -> logstash (deployments)
- filebeat.yml (I have already verified that the traffic is fine once it passes filebeat, so I use the `file` output here to narrow down the root cause):
# cat filebeat.yml
filebeat.inputs:
- type: tcp
  max_message_size: 10MiB
  host: "0.0.0.0:5100"
- type: udp
  max_message_size: 10KiB
  host: "0.0.0.0:5150"

#output.logstash:
#  hosts: ["10.10.1.87:30044", "10.10.1.88:30044"]
output.file:
  path: "/tmp/"
  filename: tmp-filebeat.out
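To double-check the filebeat path in isolation, test traffic can be sent straight at a worker (bypassing the VIP) while watching the file output; a minimal sketch, assuming filebeat is running on the worker:

# from any machine on the OS network, target a worker directly
echo "direct-test" | nc 10.10.1.87 5100
echo "direct-test" | nc -u 10.10.1.87 5150
# on the worker, the events should show up in the file output
tail -f /tmp/tmp-filebeat.out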
Kubernetes
- The master and workers are 3 VMs in my private environment; not on any GCP or AWS provider.
- Version:
# kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:25:06Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
- Service:
# cat logstash.service.yaml
apiVersion: v1
kind: Service
metadata:
  name: logstash-service
spec:
  type: NodePort
  selector:
    app: logstash
  ports:
    - port: 9514
      name: tcp-port
      targetPort: 9514
      nodePort: 30044
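Since the Service exposes TCP 9514 as NodePort 30044, reachability can be verified from the workers' OS network; a sketch (the port numbers come from the manifest above):

# the NodePort should accept TCP connections on both workers
nc -vz 10.10.1.87 30044
nc -vz 10.10.1.88 30044
# and the Service should have logstash pod endpoints behind it
kubectl get svc logstash-service
kubectl get endpoints logstash-service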
- Once the data gets into Kubernetes, everything works fine.
- It is the VIP load balancing that is not forwarding.
Keepalived configuration
! Configuration File for keepalived
global_defs {
    router_id proxy1    # `proxy2` at the other node
}

vrrp_instance VI_1 {
    state MASTER        # `BACKUP` at the other node
    interface ens160
    virtual_router_id 41
    priority 100        # `50` at the other node
    advert_int 1
    virtual_ipaddress {
        10.10.1.90/23
    }
}

virtual_server 10.10.1.90 5100 {
    delay_loop 30
    lb_algo rr
    lb_kind NAT
    protocol TCP
    persistence_timeout 0

    real_server 10.10.1.87 5100 {
        weight 1
    }
    real_server 10.10.1.88 5100 {
        weight 1
    }
}
virtual_server 10.10.1.90 5150 {
    delay_loop 30
    lb_algo rr
    lb_kind NAT
    protocol UDP
    persistence_timeout 0

    real_server 10.10.1.87 5150 {
        weight 1
    }
    real_server 10.10.1.88 5150 {
        weight 1
    }
}
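After (re)starting keepalived, the VRRP state and live IPVS scheduling can be checked like this; a sketch, using the interface and VIP from the config above:

# the MASTER should hold the VIP on ens160
ip -4 addr show dev ens160 | grep 10.10.1.90
# live connection entries show which real server each incoming packet was scheduled to
ipvsadm -lnc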
Before the Kubernetes cluster setup
- keepalived was installed on both .87 and .88, and the rr (round-robin) load balancing worked fine (both TCP and UDP).
- The keepalived service was stopped (systemctl stop keepalived) while setting up the Kubernetes cluster, just in case.
Problems after the Kubernetes cluster was set up
- Only the MASTER node .87 gets traffic forwarded; the VIP cannot forward to the BACKUP node .88.
- The data forwarded from the MASTER is successfully captured by the Kubernetes NodePort and the deployments.
Testing the problem with nc:
- Only the node currently holding the VIP (the MASTER) forwards traffic; whenever rr schedules the connection to the BACKUP, it just times out.
- nc -l 5100 was also tested on both servers; only the MASTER node received anything.
# echo "test" | nc 10.10.1.90 5100
# echo "test" | nc 10.10.1.90 5100
Ncat: Connection timed out.
# echo "test" | nc 10.10.1.90 5100
# echo "test" | nc 10.10.1.90 5100
Ncat: Connection timed out.
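To see whether the BACKUP ever receives the NAT-ed packets at all, a capture can be run on both real servers while repeating the nc test; a sketch (ens160 is the interface from the keepalived config):

# run on .87 and .88 while repeating: echo "test" | nc 10.10.1.90 5100
tcpdump -ni ens160 'tcp port 5100'
# if the BACKUP sees the SYN but the connection still times out, the reply
# path (or a rule installed by kube-proxy/Calico) would be the next suspect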
Some information
- Package versions:
# rpm -qa |grep keepalived
keepalived-1.3.5-19.el7.x86_64
- Kubernetes CNI: Calico
# kubectl get pod -n kube-system
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-b656ddcfc-wnkcj   1/1     Running   2          78d
calico-node-vnf4d                         1/1     Running   8          78d
calico-node-xgzd5                         1/1     Running   1          78d
calico-node-zt25t                         1/1     Running   8          78d
coredns-558bd4d5db-n6hnn                  1/1     Running   2          78d
coredns-558bd4d5db-zz2rb                  1/1     Running   2          78d
etcd-a86.axv.bz                           1/1     Running   2          78d
kube-apiserver-a86.axv.bz                 1/1     Running   2          78d
kube-controller-manager-a86.axv.bz        1/1     Running   2          78d
kube-proxy-ddwsr                          1/1     Running   2          78d
kube-proxy-hs4dx                          1/1     Running   3          78d
kube-proxy-qg2nq                          1/1     Running   1          78d
kube-scheduler-a86.axv.bz                 1/1     Running   2          78d
- ipvsadm (the results are identical on .87 and .88):
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.10.1.90:5100 rr
  -> 10.10.1.87:5100              Masq    1      0          0
  -> 10.10.1.88:5100              Masq    1      0          0
UDP  10.10.1.90:5150 rr
  -> 10.10.1.87:5150              Masq    1      0          0
  -> 10.10.1.88:5150              Masq    1      0          0
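Per-service packet counters can show whether the BACKUP's IPVS table is ever hit at all; a sketch using the standard --stats listing:

# cumulative packet/byte counters per virtual service and real server
ipvsadm -ln --stats
# on the BACKUP, zero InPkts for 10.10.1.90:5100 would mean packets never
# reach its IPVS table in the first place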
- SELinux is always Permissive.
- Stopping firewalld does not help either.
- sysctl differences (before vs. after the Kubernetes setup):
# before:
net.ipv4.conf.all.accept_redirects = 1
net.ipv4.conf.all.forwarding = 0
net.ipv4.conf.all.route_localnet = 0
net.ipv4.conf.default.forwarding = 0
net.ipv4.conf.lo.forwarding = 0
net.ipv4.ip_forward = 0
# after
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.all.route_localnet = 1
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.lo.forwarding = 1
net.ipv4.ip_forward = 1
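Since kube-proxy/Calico changed these sysctls, comparing the IPVS-specific kernel parameters on both nodes might also be worthwhile; a sketch (these keys exist once the ip_vs module is loaded):

# IPVS toggles that kube-proxy may set, e.g. conntrack integration
sysctl net.ipv4.vs.conntrack
sysctl -a 2>/dev/null | grep '^net.ipv4.vs.'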
I am not sure what further checks to do at this point. Please advise, thanks!