从某些网络成功与 Ubuntu 交换密钥后无法完成 SSH 连接

从某些网络成功与 Ubuntu 交换密钥后无法完成 SSH 连接

从一个国家飞往另一个国家后,我现在无法 ssh 到我的几台 Digital Ocean Ubuntu 服务器。但是,我仍然可以通过控制台登录,并从一个机器 ssh 到另一个机器(它们都在同一个物理数据中心)。

当使用 -vvvv 运行 ssh 并使用它运行 time 命令时,最后的调试消息是:

debug2: channel 0: open confirm rwindow 0 rmax 32768
Write failed: Broken pipe

1分37秒后超时。

以下是 ssh 密钥认证成功时的调试日志:

debug1: Authentication succeeded (publickey).
Authenticated to 128.199.170.168 ([128.199.170.168]:22).
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug1: Requesting [email protected]
debug1: Entering interactive session.
debug2: callback start
debug2: fd 3 setting TCP_NODELAY
debug3: packet_set_tos: set IP_TOS 0x10
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1
debug1: Sending environment.
debug3: Ignored env TERM_PROGRAM
debug3: Ignored env SHELL
debug3: Ignored env TERM
debug3: Ignored env TMPDIR
debug3: Ignored env Apple_PubSub_Socket_Render
debug3: Ignored env TERM_PROGRAM_VERSION
debug3: Ignored env TERM_SESSION_ID
debug3: Ignored env USER
debug3: Ignored env SSH_AUTH_SOCK
debug3: Ignored env __CF_USER_TEXT_ENCODING
debug3: Ignored env PATH
debug3: Ignored env MARKPATH
debug3: Ignored env PWD
debug1: Sending env LANG = en_US.UTF-8
debug2: channel 0: request env confirm 0
debug3: Ignored env XPC_FLAGS
debug3: Ignored env PS1
debug3: Ignored env XPC_SERVICE_NAME
debug3: Ignored env SHLVL
debug3: Ignored env HOME
debug3: Ignored env GREP_OPTIONS
debug3: Ignored env LOGNAME
debug3: Ignored env SCALA_HOME
debug3: Ignored env SECURITYSESSIONID
debug3: Ignored env _
debug2: channel 0: request shell confirm 1
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
Write failed: Broken pipe

连接速度不是特别慢,我的 shell 是 bash(我仍然可以通过控制台和其他网络 ssh 登录)。由于我看到公钥认证正在进行,所以似乎没有任何东西阻止 ssh 连接。

我不知道写入的哪个管道坏了。FWIW 我是从 OSX 连接的,但在飞往美国之前没有遇到任何问题。


auth.log尝试登录时显示的内容如下:

May 17 12:28:01 db1 CRON[24931]: pam_unix(cron:session): session opened for user root by (uid=0)
May 17 12:28:01 db1 CRON[24931]: pam_unix(cron:session): session closed for user root
May 17 12:28:02 db1 sshd[24955]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
May 17 12:28:04 db1 sshd[24955]: Accepted publickey for tomo from 24.210.28.151 port 63202 ssh2: DSA 3a:[redacted]
May 17 12:28:04 db1 sshd[24955]: pam_unix(sshd:session): session opened for user tomo by (uid=0)

连接尝试期间的 Tcpdump 捕获端口 22 的流量:

    $ sudo tcpdump -i en0 port 22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:00:40.917870 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [S], seq 3430788632, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 1286503697 ecr 0,sackOK,eol], length 0
19:00:41.211348 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [S.], seq 4135716624, ack 3430788633, win 28960, options [mss 1460,sackOK,TS val 898678531 ecr 1286503697,nop,wscale 8], length 0
19:00:41.211415 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 1, win 4117, options [nop,nop,TS val 1286503989 ecr 898678531], length 0
19:00:41.215051 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1:22, ack 1, win 4117, options [nop,nop,TS val 1286503992 ecr 898678531], length 21
19:00:41.484824 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 22, win 114, options [nop,nop,TS val 898678606 ecr 1286503992], length 0
19:00:41.488532 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1:42, ack 22, win 114, options [nop,nop,TS val 898678609 ecr 1286503992], length 41
19:00:41.488616 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 42, win 4116, options [nop,nop,TS val 1286504260 ecr 898678609], length 0
19:00:41.490182 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], seq 22:1470, ack 42, win 4116, options [nop,nop,TS val 1286504261 ecr 898678609], length 1448
19:00:41.490183 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1470:1614, ack 42, win 4116, options [nop,nop,TS val 1286504261 ecr 898678609], length 144
19:00:41.491254 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], seq 42:1490, ack 22, win 114, options [nop,nop,TS val 898678609 ecr 1286503992], length 1448
19:00:41.592287 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 1490, win 4096, options [nop,nop,TS val 1286504362 ecr 898678609], length 0
19:00:41.760341 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1490:1674, ack 22, win 114, options [nop,nop,TS val 898678676 ecr 1286504260], length 184
19:00:41.760401 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 1674, win 4090, options [nop,nop,TS val 1286504527 ecr 898678676], length 0
19:00:41.762375 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 1614, win 136, options [nop,nop,TS val 898678676 ecr 1286504261], length 0
19:00:41.762409 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1614:1638, ack 1674, win 4096, options [nop,nop,TS val 1286504529 ecr 898678676], length 24
19:00:42.027042 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1674:1826, ack 1638, win 136, options [nop,nop,TS val 898678743 ecr 1286504529], length 152
19:00:42.027103 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 1826, win 4091, options [nop,nop,TS val 1286504789 ecr 898678743], length 0
19:00:42.028104 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1638:1782, ack 1826, win 4096, options [nop,nop,TS val 1286504790 ecr 898678743], length 144
19:00:42.300304 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1826:2546, ack 1782, win 148, options [nop,nop,TS val 898678812 ecr 1286504790], length 720
19:00:42.300357 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2546, win 4073, options [nop,nop,TS val 1286505053 ecr 898678812], length 0
19:00:42.302441 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1782:1798, ack 2546, win 4096, options [nop,nop,TS val 1286505055 ecr 898678812], length 16
19:00:42.600776 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 1798, win 148, options [nop,nop,TS val 898678888 ecr 1286505055], length 0
19:00:42.600843 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1798:1850, ack 2546, win 4096, options [nop,nop,TS val 1286505349 ecr 898678888], length 52
19:00:42.857852 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 1850, win 148, options [nop,nop,TS val 898678952 ecr 1286505349], length 0
19:00:42.858552 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2546:2598, ack 1850, win 148, options [nop,nop,TS val 898678952 ecr 1286505349], length 52
19:00:42.858584 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2598, win 4094, options [nop,nop,TS val 1286505604 ecr 898678952], length 0
19:00:42.859131 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1850:1918, ack 2598, win 4096, options [nop,nop,TS val 1286505605 ecr 898678952], length 68
19:00:43.124310 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2598:2650, ack 1918, win 148, options [nop,nop,TS val 898679019 ecr 1286505605], length 52
19:00:43.124374 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2650, win 4094, options [nop,nop,TS val 1286505867 ecr 898679019], length 0
19:00:43.124473 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1918:2434, ack 2650, win 4096, options [nop,nop,TS val 1286505867 ecr 898679019], length 516
19:00:43.394690 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2650:2702, ack 2434, win 159, options [nop,nop,TS val 898679086 ecr 1286505867], length 52
19:00:43.394774 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2702, win 4094, options [nop,nop,TS val 1286506134 ecr 898679086], length 0
19:01:04.685580 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2434:2582, ack 2702, win 4096, options [nop,nop,TS val 1286527239 ecr 898679086], length 148
19:01:04.966270 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2702:2738, ack 2582, win 170, options [nop,nop,TS val 898684479 ecr 1286527239], length 36
19:01:04.966378 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2738, win 4094, options [nop,nop,TS val 1286527514 ecr 898684479], length 0
19:01:04.967018 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2582:2702, ack 2738, win 4096, options [nop,nop,TS val 1286527514 ecr 898684479], length 120
19:01:05.269214 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 2702, win 170, options [nop,nop,TS val 898684555 ecr 1286527514], length 0
19:01:06.027067 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2738:2790, ack 2702, win 170, options [nop,nop,TS val 898684744 ecr 1286527514], length 52
19:01:06.027144 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2790, win 4094, options [nop,nop,TS val 1286528563 ecr 898684744], length 0
19:01:06.027497 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286528563 ecr 898684744], length 460
19:01:06.603432 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286529135 ecr 898684744], length 460
19:01:07.552730 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286530077 ecr 898684744], length 460
19:01:09.250116 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286531762 ecr 898684744], length 460
19:01:12.442790 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286534930 ecr 898684744], length 460
19:01:18.634929 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286541067 ecr 898684744], length 460
19:01:24.068621 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286546451 ecr 898684744], length 460
19:01:34.714519 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286557019 ecr 898684744], length 460
19:01:45.384050 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286567587 ecr 898684744], length 460
19:01:56.051835 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286578155 ecr 898684744], length 460
19:02:06.715163 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286588723 ecr 898684744], length 460
19:02:17.355823 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286599291 ecr 898684744], length 460
19:02:28.042962 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286609859 ecr 898684744], length 460
19:02:38.690971 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [R.], seq 3162, ack 2790, win 4096, length 0

我还尝试过其他一些方法:

  • 减少服务器上的 mtu,可能是 pmtu 出现故障:sudo ip link set mtu 1280 dev eth0
  • 在 OS X 中将我的 wifi 接口的 mtu 降低到 1280
  • 将 ServerAliveInterval 降低到 30,此时连接仍然会超时,但不会出现管道断裂的情况
  • 使用“cat”而不是“bash”或 bash 运行 ssh 但没有加载配置文件/rc
  • 手动设置 OS X wifi 接口 IP 地址而不是 dhcp

答案1

在数据包跟踪中,我们看到在流程早期,最大尺寸的数据包在两个方向上进行交换。这并没有造成任何问题,因此没有任何迹象表明存在 MTU 问题。

稍后在连接过程中,我们看到从客户端发送到服务器的相对序列号为 2702:3162 的数据包从未收到来自服务器的 ACK。

我立即想到,这个数据包丢失是由中间盒故障(即 NAT、防火墙或类似设备)引起的。

我听说过 NAT 盒无法处理 TCP 连接期间 TOS 的更改。您的问题确实发生在客户端指示 TOS 已更改之后。但是,由于 tcpdump 不显示 TOS,因此我不能肯定这是否是问题发生的确切位置。

为了进行测试,您可以尝试使用-o ProxyCommand='nc %h %p'这样一种方式,即 ssh 客户端不直接控制 TCP 连接。您也可以尝试该IPQoS选项。如果 TOS 的更改是问题所在,则指定-o IPQoS=cs0-o IPQoS=0应该有效,但任何其他设置都会失败。这是因为 ssh 在身份验证期间使用 0 作为 QoS,然后在身份验证后切换到所选的 QoS。通过将 QoS 选择为 0,QoS 值不会发生任何变化,从而不会混淆中间件。

答案2

如果其他人遇到这种情况,我在使用 TP-Link Archer VR2600 路由器/调制解调器(带固件1.4.0 0.8.0 v0050.0 Build 160518 Rel.50944n)时也遇到过类似的问题。

按照@kasperd的建议,使用 运行-o IPQoS=0,提示我的路由器存在某种 QoS 问题。我在路由器设置中启用了我能找到的最接近的功能(先进的带宽控制将最大带宽设置为略低于我的线路上可用的带宽),假设在这种情况下路由器可能会开始关注相关标志。

这似乎有效,我的连接现在已接通。切换此选项可以可靠地控制我是否能接通。

答案3

您有用户 ssh 配置(~/.ssh/config)吗?

如果没有,请创建一个并尝试添加以下几行:

ServerAliveInterval 120 #ping the server every 120s
TCPKeepAlive no #do not set SO_KEEPALIVE on socket

答案4

遗憾的是,我在这里没有足够的声誉来投票或评论 Sam Mason 的上述回复,但我只想公开 +1 他所说的话。我也有 VR2600,我也有过同样的经历:

  1. 连接(ssh、sftp 等)已建立,但随后似乎挂了
  2. tshark 显示 TCP 虚假重传
  3. 从客户端设置 -o IPQoS=0(单独设置)没有任何作用
  4. 启用路由器上的“高级”->“带宽控制”设置(之前已禁用),具有尽可能高的限制(实际上无限制),似乎可以修复路由器,使其注意 IPQoS 标志
  5. tshark 不再显示 TCP 虚假重传,并且连接不再挂起(ssh、sftp 等客户端现在可以与 VR2600 路由器后面的服务器一起运行)。

这似乎表明 VR2600 路由器中存在一个重大错误。不幸的是(截至撰写本文时),我使用的是最新固件(1.4.0 0.8.0 v0050.0 Build 160518 Rel.50944n,与 Sam 相同),并且此路由器似乎与 DD-WRT 不兼容/未通过 DD-WRT 测试。

然而,除了上面讨论的内容之外,我还要补充一点:

  1. 执行完步骤 1 至 5 后,即使不指定“-o IPQoS=0”,我现在也可以成功连接

换句话说:

只需在路由器选项中打开“高级”->“带宽控制”选项(即使上限最高)似乎就足以让此路由器 NAT 按照预期运行。如果禁用带宽控制,则会出现 OP 中描述的问题(@malasa 对此进行了详细描述)。

不清楚解决方法是简单地启用此选项,还是也需要使用 -o 选项至少连接一次。无论如何,我可以确认,在启用此选项后,如果我禁用高级->带宽控制选项,那么我的 ssh/sftp/etc 就和以前一样坏了。如果我使能够启用高级->带宽控制选项后,一切似乎又按预期运行了。而且(启用此选项后)路由器重启后一切似乎也运行正常。

所以从我的角度来看,这是一个非常好的解决方法/修复,不需要客户端更改或维护(回答@leonardoborges 的担忧)

相关内容