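This question was flagged as unsuitable for StackOverflow because it is about networking. Please let me know if there is a better place for it than here.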
TL;DR
In short, from inside a container of a service defined in the same stack as the nginx service:
nslookup nginx # gives the virtual IP of the nginx service
nslookup tasks.nginx # gives the correct IP of an nginx container (10.0.17.20)
ping 10.0.17.20 # this works
ping nginx # doesn't work
curl http://10.0.17.20 # this works
curl http://nginx # doesn't work
However, curl http://tasks.nginx does resolve and connect. For context, the nodes are LXC containers. This behavior does not occur when using Digital Ocean VMs.
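The checks above can be bundled into one script so the same comparison is easy to repeat on each node. This is a minimal sketch, not a definitive diagnostic: the function names check and compare_paths are hypothetical, and it assumes nslookup, ping, and curl are installed in the container.

```shell
#!/bin/sh
# Hypothetical helper: run a command quietly and report whether it succeeded.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        printf 'ok    %s\n' "$label"
    else
        printf 'FAIL  %s\n' "$label"
    fi
}

# Hypothetical wrapper: repeat the TL;DR checks for one service name and one
# known task (container) IP. Note that curl -f also fails on HTTP errors >= 400.
compare_paths() {
    svc=$1
    task_ip=$2
    check "resolve $svc (virtual IP)"  nslookup "$svc"
    check "resolve tasks.$svc"         nslookup "tasks.$svc"
    check "ping $task_ip"              ping -c 2 -W 2 "$task_ip"
    check "ping $svc"                  ping -c 2 -W 2 "$svc"
    check "curl http://$task_ip"       curl -fsS --max-time 5 "http://$task_ip"
    check "curl http://$svc"           curl -fsS --max-time 5 "http://$svc"
    check "curl http://tasks.$svc"     curl -fsS --max-time 5 "http://tasks.$svc"
}
```

From inside a container on the overlay network it would be called as compare_paths nginx 10.0.17.20, using the task IP reported by nslookup tasks.nginx.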
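Problem
In Docker Swarm, I am finding that DNS resolution by service name appears to fail inside containers. For example, with the following stack configuration: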
version: "3.9"
networks:
  elk7:
    name: elk7
    driver: overlay
    attachable: true
    ipam:
      driver: default
      config:
        - subnet: "10.0.17.0/24"
services:
  setup:
    ...
    networks:
      - elk7
  es01: # placed on manager node 1
    ...
    networks:
      - elk7
  # ... es02/es03/etc
  nginx: # placed on manager node 2
    ...
    networks:
      - elk7
Manager node 1
docker network inspect elk7 shows that a container for the es01 service exists on this node (I assume I should only be looking at this node's containers?):
"9c1a019a5c83c466615819b5401bbb0e58c31f078a96f13ed4af3905c837d565": {
    "Name": "elk7_es01.1.meux9ctcmnfwdejiqmxftyeq8",
    "EndpointID": "e0b102827e93eb3d0439513778c308eeb1201cd1e8e252f1361692c2f9981cc5",
    "MacAddress": "02:42:0a:00:11:11",
    "IPv4Address": "10.0.17.17/24",
    "IPv6Address": ""
},
The IPAM section also seems useful to include here:
"IPAM": {
    "Driver": "default",
    "Options": null,
    "Config": [
        {
            "Subnet": "10.0.17.0/24",
            "Gateway": "10.0.17.1"
        }
    ]
},
Manager node 2
After logging in to the nginx container (docker container exec -it <container id> bash), I cannot reach the es01 service by its service name, but I can reach it by IP address:
root@2d6f42945a18:/# ping es01
PING es01 (10.0.17.16) 56(84) bytes of data.
From 2d6f42945a18 (10.0.17.20) icmp_seq=1 Destination Host Unreachable
From 2d6f42945a18 (10.0.17.20) icmp_seq=2 Destination Host Unreachable
From 2d6f42945a18 (10.0.17.20) icmp_seq=3 Destination Host Unreachable
--- es01 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4090ms
versus
root@2d6f42945a18:/# ping 10.0.17.17
PING 10.0.17.17 (10.0.17.17) 56(84) bytes of data.
64 bytes from 10.0.17.17: icmp_seq=1 ttl=64 time=0.220 ms
64 bytes from 10.0.17.17: icmp_seq=2 ttl=64 time=0.128 ms
--- 10.0.17.17 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1016ms
rtt min/avg/max/mdev = 0.128/0.174/0.220/0.046 ms
When I run nslookup or dig, I do get an answer, so the service name es01 does appear to resolve:
root@2d6f42945a18:/# nslookup es01
Server: 127.0.0.11
Address: 127.0.0.11#53
Non-authoritative answer:
Name: es01
Address: 10.0.17.16
============================================================
root@2d6f42945a18:/# dig es01
; <<>> DiG 9.18.24-1-Debian <<>> es01
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17322
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;es01. IN A
;; ANSWER SECTION:
es01. 600 IN A 10.0.17.16
;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
;; WHEN: Tue Mar 26 07:10:16 UTC 2024
;; MSG SIZE rcvd: 42
The 10.0.17.16 IP address is (as I have just learned) the virtual IP address associated with the es01 service. docker service inspect elk7_es01 shows:
...
"Endpoint": {
    "Spec": {
        "Mode": "vip"
    },
    "VirtualIPs": [
        {
            "NetworkID": "rdtgyz97aahhsrlwz8u2mm8fy",
            "Addr": "10.0.17.16/24"
        }
    ]
}
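I am not sure why I cannot reach a service task (i.e. a container) by its service name when I can resolve the service's virtual IP from inside another service task (container).
What could be the problem? My Swarm nodes are all Ubuntu 20.04 LXC containers provisioned through Proxmox 7.0.11. The host IP addresses are all in the 10.8.66.0/24 range (not sure whether that matters).
I did see a similar question on Stack Overflow (https://stackoverflow.com/questions/68489435/docker-swarm-failing-to-resolve-dns-by-service-name-with-python-celery-workers-c), but the answer there did not help in my case.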
Another question (https://stackoverflow.com/questions/50590866) mentions that you can look up the DNS records for a service's tasks explicitly with nslookup tasks.es01, and that does return the correct container IP address:
root@2d6f42945a18:/# nslookup tasks.es01
Server: 127.0.0.11
Address: 127.0.0.11#53
Non-authoritative answer:
Name: tasks.es01
Address: 10.0.17.17
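To put the two lookups side by side, the answer addresses can be pulled out of the nslookup output and printed together. This is a rough sketch under stated assumptions: answer_addrs and vip_vs_tasks are hypothetical names, and the parsing assumes the BIND-style nslookup output shown in this post (a Name: line followed by an Address: line); other nslookup builds format their output differently.

```shell
#!/bin/sh
# Hypothetical helper: read nslookup output on stdin and print only the
# addresses from the answer section, skipping the "Server"/"Address" header
# lines that describe the resolver itself (127.0.0.11#53 here).
answer_addrs() {
    awk '/^Name:/ { seen = 1; next }
         seen && /^Address/ { print $NF }'
}

# Hypothetical wrapper: show what the plain service name resolves to (the
# virtual IP in "vip" endpoint mode) next to what tasks.<service> resolves to
# (the individual container IPs).
vip_vs_tasks() {
    svc=$1
    printf 'service name %s resolves to:\n' "$svc"
    nslookup "$svc" | answer_addrs
    printf 'tasks.%s resolves to:\n' "$svc"
    nslookup "tasks.$svc" | answer_addrs
}
```

Running vip_vs_tasks es01 inside the nginx container would be expected to list 10.0.17.16 under the service name and 10.0.17.17 under tasks.es01, matching the outputs above. The first address is the one that is unreachable.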