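I recently configured VLAN interfaces on a fairly fresh Ubuntu 18.04 server that has two physical interfaces, eno1 and eno2. I configured them with the netplan utility.
My goal is simple: to be able to SSH into the server from its gateway (a pair of Cisco ASR 1006 routers running HSRP). In other words, I am just trying to SSH from one host to another within the same LAN. I have tried everything I can think of to troubleshoot this. Here are some observations: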
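- SSH works from the Ubuntu server's VLAN interface to the gateway.
- SSH works from the Cisco ASR to the IP address of the server's physical interface.
- SSH does NOT work from the Cisco ASR to the IP address of the VLAN interface on the server.
- A packet capture on the Cisco ASR side shows the SSH request being sent to the server, but the server never replies, so TCP retransmits several times before the ASR gives up.
- A packet capture on the Ubuntu server shows the same thing: packets from the host attempting SSH (the ASR 1006) arrive on TCP port 22, but the server does not respond, so TCP retransmissions follow.
- I have reinstalled openssh-server, rebooted, restarted the service, explicitly told sshd to listen on 10.255.255.12, and so on, all to no avail.
- An ordinary host (Windows 8.1, using PuTTY) can SSH into the server's VLAN interface.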
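The captures described above suggest a timeout (SYN arrives, nothing comes back) rather than an active refusal. For anyone who wants to reproduce that distinction from another Linux host on the VLAN, here is a minimal sketch. The helper name `probe_tcp` is my own invention for illustration, and it only reports what the probing host sees; it says nothing about why the server stays silent.

```python
import socket


def probe_tcp(host: str, port: int = 22, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused', or 'timeout'.

    'refused' means the peer answered with a RST (nothing listening or an
    active reject); 'timeout' means no answer came back at all.
    """
    try:
        # create_connection performs the full TCP three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except OSError as exc:
        # For example "network is unreachable" or "no route to host".
        return f"error: {exc}"
```

Calling `probe_tcp("10.255.255.12")` from a host in the same /29 would be expected to return `"open"` if it behaves like the Windows/PuTTY host does; a `"timeout"` would match what the ASR reports.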
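Problem: the Cisco ASR cannot SSH to the server's VLAN interfaces.
VLANs 3001 and 999 carry a private IP (10.255.255.12/29) and a CGN-range IP (100.78.32.240/24), respectively:
Contents of /etc/netplan/50-cloud-init.yaml: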
# This file is generated from information provided by
# the datasource. Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    ethernets:
        eno1:
            addresses:
            - <my public IP>/28
            gateway4: <my gateway>
            nameservers:
                addresses:
                - 8.8.8.8
                - 8.8.4.4
        eno2:
            dhcp4: true
        enp10s0f0:
            dhcp4: true
        enp10s0f1:
            dhcp4: true
    version: 2
    vlans:
        vlan.3001:
            id: 3001
            link: eno2
            addresses: [10.255.255.12/29]
        vlan.999:
            id: 999
            link: eno2
            addresses: [100.78.32.240/24]
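As a quick sanity check of the two VLAN addresses in the netplan file above, this sketch classifies each one as RFC 1918 private space or RFC 6598 shared (CGN) space using Python's standard `ipaddress` module. The `classify` helper and the dictionary of addresses are written just for this illustration; the addresses themselves are copied from the config.

```python
import ipaddress

# RFC 1918 private ranges and the RFC 6598 shared (carrier-grade NAT) range.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
CGN = ipaddress.ip_network("100.64.0.0/10")


def classify(addr_with_prefix: str) -> str:
    """Return which well-known range an interface address falls into."""
    ip = ipaddress.ip_interface(addr_with_prefix).ip
    if any(ip in net for net in RFC1918):
        return "rfc1918-private"
    if ip in CGN:
        return "cgn-shared"
    return "other"


# Addresses taken from the vlans: section of the netplan file.
vlan_addresses = {
    "vlan.3001": "10.255.255.12/29",
    "vlan.999": "100.78.32.240/24",
}
result = {name: classify(addr) for name, addr in vlan_addresses.items()}
print(result)
```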
Contents of /etc/ssh/sshd_config:
# $OpenBSD: sshd_config,v 1.101 2017/03/14 07:19:07 djm Exp $
# This is the sshd server system-wide configuration file. See
# sshd_config(5) for more information.
# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin
# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented. Uncommented options override the
# default value.
#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
ListenAddress <my public ip>
ListenAddress 10.255.255.12
#ListenAddress ::
#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key
# Ciphers and keying
#RekeyLimit default none
# Logging
#SyslogFacility AUTH
#LogLevel INFO
# Authentication:
#LoginGraceTime 2m
#PermitRootLogin prohibit-password
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10
#PubkeyAuthentication yes
# Expect .ssh/authorized_keys2 to be disregarded by default in future.
#AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2
#AuthorizedPrincipalsFile none
#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody
# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes
# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no
# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no
# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no
# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no
# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication. Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes
#AllowAgentForwarding yes
#AllowTcpForwarding yes
#GatewayPorts no
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
#TCPKeepAlive yes
#UseLogin no
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
#ClientAliveCountMax 3
#UseDNS no
#PidFile /var/run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none
# no default banner path
#Banner none
# Allow client to pass locale environment variables
AcceptEnv LANG LC_*
# override default of no subsystems
Subsystem sftp /usr/lib/openssh/sftp-server
# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
PasswordAuthentication yes
The sshd service is clearly listening on the private IP address, yet it still does not reply to SSH requests that the ASR sends to the VLAN interface.
~$ sudo netstat -tulpn | grep :22
tcp 0 0 <my public IP>:22 0.0.0.0:* LISTEN 3956/sshd
tcp 0 0 10.255.255.12:22 0.0.0.0:* LISTEN 3956/sshd
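To cross-check the netstat output against the addresses being tested, here is a small sketch that parses lines in that format and reports which target addresses have a listening socket on port 22. The public IP is redacted in the output above, so `203.0.113.10` (a documentation-range address) stands in for it; that value and the helper names are assumptions for illustration only.

```python
# Sample in the same column layout as `netstat -tulpn`; 203.0.113.10 is a
# placeholder for the redacted public IP.
NETSTAT_OUTPUT = """\
tcp 0 0 203.0.113.10:22 0.0.0.0:* LISTEN 3956/sshd
tcp 0 0 10.255.255.12:22 0.0.0.0:* LISTEN 3956/sshd
"""


def listening_addresses(netstat_text: str, port: int = 22) -> set:
    """Collect local IPv4 addresses that have a LISTEN socket on `port`."""
    addrs = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 6 or fields[5] != "LISTEN":
            continue
        host, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            addrs.add(host)
    return addrs


def is_covered(addr: str, listeners: set) -> bool:
    """True if `addr` is bound explicitly or via the 0.0.0.0 wildcard."""
    return addr in listeners or "0.0.0.0" in listeners


listeners = listening_addresses(NETSTAT_OUTPUT)
coverage = {a: is_covered(a, listeners) for a in ("10.255.255.12", "100.78.32.240")}
print(coverage)
```

With the two `ListenAddress` lines shown in the sshd_config, 10.255.255.12 is covered while 100.78.32.240 is not, so only the 10.255.255.12 test exercises a socket that sshd is actually bound to.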
Note that this is from the server itself (SSHing to itself):
$ ssh 10.255.255.12
<my username>@10.255.255.12's password:
Note that this is from the Cisco ASR (the -l flag just specifies the username to use):
ASR1006#ssh -l user <my public IP>
Password:
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-72-generic x86_64)
user@ubuntu-server:~$ exit
logout
[Connection to <my public IP> closed by foreign host]
ASR1006#ssh -l user 100.78.32.240
% Connection timed out; remote host not responding
ASR1006#ssh -l user 10.255.255.12
% Connection timed out; remote host not responding
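As you can see, the connection is established when SSHing to the public IP address associated with the server's physical interface, eno1.
The last part of my troubleshooting involved connecting an ordinary host, Windows 8.1 with PuTTY, and placing it in the 10.255.255.8/29 subnet. The Windows host can SSH into the server's vlan.3001 interface.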
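For reference, the subnet arithmetic behind that test can be checked with Python's standard `ipaddress` module. This sketch confirms that the server's vlan.3001 address sits in 10.255.255.8/29 and lists the usable host addresses a test machine could take; the variable names are mine, the addresses come from the config above.

```python
import ipaddress

# Server address on vlan.3001 and the subnet the Windows test host was put in.
server = ipaddress.ip_interface("10.255.255.12/29")
test_subnet = ipaddress.ip_network("10.255.255.8/29")

# True when the server's interface network is exactly the test subnet.
same_subnet = server.network == test_subnet

# hosts() excludes the network and broadcast addresses.
usable_hosts = [str(h) for h in test_subnet.hosts()]
print(same_subnet, usable_hosts)
```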
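This is a rather unusual problem, and I am not optimistic that anyone will be able to help. I could open a TAC case with Cisco, since the problem may be on the Cisco side, but I wanted to start here.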