Question 1

I have a minimal cloud-config that runs on DigitalOcean without any problems. I added some SSH hardening, which requires restarting sshd.socket to take effect:

units:
  - name: sshd.socket
    command: restart

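Adding just this unit (with no actual sshd configuration changes) makes the connection fail when I provision a Hetzner machine with the same cloud-config: ssh: connect to host xx.xx.xx.xx port 22: Connection refused. On DigitalOcean it still connects fine.

When I remove the unit, connecting to the Hetzner machine works; when I add it back, it fails every time.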

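Variable substitution

As far as I can tell, the only difference between the two platforms is that on DigitalOcean the variables $public_ipv4 and $private_ipv4 are replaced with actual IP addresses, whereas on bare-metal installs such as Hetzner they are not.

From the CoreOS documentation:

Note: The $private_ipv4 and $public_ipv4 substitution variables referenced in other documents are only supported on Amazon EC2, Google Compute Engine, OpenStack, Rackspace, DigitalOcean, and Vagrant.

So I replaced the variables with static IP addresses. I use the public IP address because it is the only interface available besides loopback.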

However, when I provision without replacing these variables with the public IP address, it also connects fine.

Checking the logs shows some errors related to name resolution:

systemd[1]: Starting etcd2...
etcd2[874]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://:2379,http://:4001
etcd2[874]: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd2
etcd2[874]: recognized and used environment variable ETCD_DISCOVERY=https://discovery.etcd.io/616b3957c5c78e7738207011f9c51841
etcd2[874]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=http://:2380
etcd2[874]: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001
etcd2[874]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=http://:2380
etcd2[874]: recognized and used environment variable ETCD_NAME=39b2a003672546f8a0b648dbc66e8f6f
etcd2[874]: etcd Version: 2.2.0
etcd2[874]: Git SHA: e4561dd
etcd2[874]: Go Version: go1.4.2
etcd2[874]: Go OS/Arch: linux/amd64
etcd2[874]: setting maximum number of CPUs to 1, total number of available CPUs is 12
etcd2[874]: listening for peers on http://:2380
etcd2[874]: listening for client requests on http://0.0.0.0:2379
etcd2[874]: listening for client requests on http://0.0.0.0:4001
etcd2[874]: resolving :2380 to :2380
etcd2[874]: resolving :2380 to :2380
etcd2[874]: error #0: dial tcp: lookup discovery.etcd.io: Temporary failure in name resolution
etcd2[874]: cluster status check: error connecting to https://discovery.etcd.io, retrying in 2s
etcd2[874]: error #0: dial tcp: lookup discovery.etcd.io: Temporary failure in name resolution
etcd2[874]: cluster status check: error connecting to https://discovery.etcd.io, retrying in 4s
etcd2[874]: found self 61dbc8c9c2aca1e8 in the cluster
etcd2[874]: found 1 needed peer(s)

But they do not appear to be fatal: systemctl status etcd2.service shows that the service is active:

core@localhost ~ $ systemctl status etcd2.service
● etcd2.service - etcd2
   Loaded: loaded (/usr/lib64/systemd/system/etcd2.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/etcd2.service.d
           └─20-cloudinit.conf
   Active: active (running) since Tue 2016-03-22 14:10:33 UTC; 7min ago
 Main PID: 874 (etcd2)
   Memory: 20.3M
      CPU: 1.771s
   CGroup: /system.slice/etcd2.service
           └─874 /usr/bin/etcd2

etcd2[874]: added local member 61dbc8c9c2aca1e8 [http://:2380] to cluster 216c373aaf11ccfa
systemd[1]: Started etcd2.
etcd2[874]: 61dbc8c9c2aca1e8 is starting a new election at term 1
etcd2[874]: 61dbc8c9c2aca1e8 became candidate at term 2
etcd2[874]: 61dbc8c9c2aca1e8 received vote from 61dbc8c9c2aca1e8 at term 2
etcd2[874]: 61dbc8c9c2aca1e8 became leader at term 2
etcd2[874]: raft.node: 61dbc8c9c2aca1e8 elected leader 61dbc8c9c2aca1e8 at term 2
etcd2[874]: published {Name:39b2a003672546f8a0b648dbc66e8f6f ClientURLs:[http://:2379 http://:4001]} to cluster 216c373aaf11ccfa
etcd2[874]: setting up the initial cluster version to 2.2
etcd2[874]: set the initial cluster version to 2.2

Containers that connect to other services (such as Logstash) fail: the scheme http does not accept registry part: :9200 (or bad hostname?)
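
Values such as http://:2379 in the log and :9200 in this error look like URLs whose host part is empty. A minimal sketch for spotting them, assuming the settings are available as text (for example copied from the journal); the function name find_empty_host_urls is hypothetical and not part of any CoreOS tooling:

```shell
#!/bin/sh
# Print every line on stdin that contains a URL with an empty host,
# e.g. "http://:2380" (the scheme is followed directly by ":" and a port).
find_empty_host_urls() {
  grep -E 'https?://:[0-9]+'
}

# Example with two values taken from the etcd2 log above:
# only the ETCD_ADVERTISE_CLIENT_URLS line is printed.
printf '%s\n' \
  'ETCD_ADVERTISE_CLIENT_URLS=http://:2379,http://:4001' \
  'ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001' \
  | find_empty_host_urls
```

A match only shows that a host is missing in that setting; it does not by itself explain the refused SSH connection.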

Cloud-config

This is a stripped-down cloud-config, but it still demonstrates the problem (verified).

#cloud-config

ssh_authorized_keys:
  - "ssh-rsa A valid SSH key here"
write_files:
coreos:
  etcd2:
    # NOTE: replace $discovery_url with a url generated at https://discovery.etcd.io/new?size=X
    discovery: $discovery_url
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    advertise-client-urls: http://my.public.ip.address:2379,http://my.public.ip.address:4001
    initial-advertise-peer-urls: http://my.public.ip.address:2380
    listen-peer-urls: http://my.public.ip.address:2380          # Remove this flag or use localhost and the connection issue goes away
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start
    - name: sshd.socket
      command: restart   # Remove this unit and all issues go away (but no SSH hardening in that case)

One thing I noticed is that when I remove the listen-peer-urls flag, the connection problem also goes away, although Logstash still fails to start for the same reason.

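This document says that the default values for these flags are URLs with localhost, but the names of the substitution variables used on platforms such as DigitalOcean suggest that this should be an address reachable by peer machines.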

When I use localhost for these flags I can connect, but the other problems remain.

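Question 1

For a bare-metal machine with only a public and a loopback interface (no private network), what should the correct cloud-config be?

Question 2

What is the relationship between sshd and etcd here that causes this failure?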

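Answer 1

For a bare-metal machine with only a public and a loopback interface (no private network), what should the correct cloud-config be?

Insert the machine's public IP in place of these variables.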
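
A minimal sketch of that substitution, assuming the cloud-config is kept as a template that still contains the $public_ipv4 and $private_ipv4 placeholders. The function name render_cloud_config is hypothetical, and 203.0.113.10 is a placeholder address from the documentation range, not a real machine:

```shell
#!/bin/sh
# Replace the $public_ipv4 / $private_ipv4 placeholders in a cloud-config
# template (read from stdin) with one static address.
# Assumption: the machine has a single public interface, so the same
# address is used for both placeholders. Other text is left untouched.
render_cloud_config() {
  ip="$1"
  sed -e 's/\$public_ipv4/'"$ip"'/g' \
      -e 's/\$private_ipv4/'"$ip"'/g'
}

# Example: prints "advertise-client-urls: http://203.0.113.10:2379"
printf '%s\n' 'advertise-client-urls: http://$public_ipv4:2379' \
  | render_cloud_config 203.0.113.10
```

Doing the replacement before provisioning avoids relying on platform support for the variables, which the CoreOS note quoted in the question limits to a fixed list of providers.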

What is the relationship between sshd and etcd here that causes this failure?

Can you share the sshd logs? Why did it not start?
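
The logs asked for here can usually be read from the systemd journal on the affected machine, for example from a console or rescue session while SSH is refused. A minimal sketch, assuming sshd is socket-activated through sshd.socket as in the question; the function name show_sshd_socket_state is hypothetical:

```shell
#!/bin/sh
# Show the current state of the SSH socket unit and its journal entries
# from the current boot. Both commands are standard systemd tools.
show_sshd_socket_state() {
  systemctl status sshd.socket --no-pager
  journalctl -b -u sshd.socket --no-pager
}

# Run on the affected machine:
#   show_sshd_socket_state
```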
