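I've built an SSH tunneling service that runs in an Alpine-based container, following the approach outlined here: https://github.com/cagataygurturk/docker-ssh-tunnel
The service connects using IdentityFiles and sets up multiple ControlSockets and tunnels.
I'm testing against an Amazon Linux bastion and tunneling through to a PostgreSQL database.
SSH login and tunnel creation work correctly and the tunnels are usable, but something appears to be timing out somewhere.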
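- If the tunnel (or perhaps the overall connection to the target SSH server?) has been idle for 5 minutes, the next connection hangs for 30 seconds before it proceeds successfully.
- Tunnel connections after that first one are fast (sub-second).
- Leave the tunnel/server idle for 5 minutes and the 30-second delay returns.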
Evidence follows:
Client ssh-config
Host my-bastion
HostName 99.99.99.99
User ec2-user
IdentityFile ~/.ssh/key.pem
Host *
ControlMaster auto
ControlPath ~/.ssh/controlmasters/cp_%r_%h
ControlPersist yes
StrictHostKeyChecking no
ServerAliveCountMax 60
ServerAliveInterval 30
TCPKeepAlive no
ForkAfterAuthentication yes
StdinNull yes
ExitOnForwardFailure yes
IPQoS 0x00
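Test workflow
The tunnels were established beforehand using a ControlSocket.
Testing uses a psql request that fails authentication but still exercises the tunnel.
psql opens 2 connections through the tunnel during the test.
First access after idling for at least 5 minutes.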
# date && time psql "host=localhost port=5430 dbname=xxx user=UUU password=X"
Tue Mar 8 12:10:57 PST 2022
psql: error: FATAL: password authentication failed for user "UUU"
FATAL: password authentication failed for user "UUU"
real 0m32.497s - slow!
SSH client log (-vv)
1st psql request
[2022-03-08 20:10:57] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:10:57] debug1: channel 3: new [direct-tcpip]
30 sec Delay here
[2022-03-08 20:10:57] debug2: channel 3: open confirm rwindow 2097152 rmax 32768
[2022-03-08 20:11:29] debug2: channel 3: read<=0 rfd 7 len 0
2nd psql request
[2022-03-08 20:11:29] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:11:29] debug1: channel 4: new [direct-tcpip]
subsecond response on channel 4
[2022-03-08 20:11:29] debug2: channel 4: open confirm rwindow 2097152 rmax 32768
[2022-03-08 20:11:29] debug2: channel 4: read<=0 rfd 8 len 0
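The pause in the excerpt above can be located mechanically rather than by eye. The following is a minimal sketch, assuming the log keeps the "[YYYY-MM-DD HH:MM:SS] message" prefix shown here; the function name largest_gap and the SLOW_REQUEST sample are hypothetical names introduced for illustration, and the sample lines are copied from the first psql request.

```python
import re
from datetime import datetime

# Matches the "[YYYY-MM-DD HH:MM:SS] message" prefix used in the log excerpt.
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s*(.*)$")


def largest_gap(lines):
    """Return (seconds, message_before, message_after) for the widest pause
    between consecutive timestamped lines, or None if fewer than two parse."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((stamp, match.group(2)))
    best = None
    for (t0, msg0), (t1, msg1) in zip(events, events[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, msg0, msg1)
    return best


# Lines copied from the slow (first) psql request in the log above.
SLOW_REQUEST = """\
[2022-03-08 20:10:57] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:10:57] debug1: channel 3: new [direct-tcpip]
[2022-03-08 20:10:57] debug2: channel 3: open confirm rwindow 2097152 rmax 32768
[2022-03-08 20:11:29] debug2: channel 3: read<=0 rfd 7 len 0
""".splitlines()

gap, before, after = largest_gap(SLOW_REQUEST)
print(f"{gap:.0f}s between:\n  {before}\n  {after}")
```

Taking the logged timestamps at face value, the widest pause sits between "open confirm" and "read<=0" on channel 3. That would suggest the bastion opened the forwarded channel promptly and the wait happened while the far end of the tunnel was responding, though the log alone does not prove where the time was spent.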
Immediate access after the first attempt.
# date && time psql "host=localhost port=5430 dbname=xxx user=UUU password=X"
Tue Mar 8 12:11:41 PST 2022
psql: error: FATAL: password authentication failed for user "UUU"
FATAL: password authentication failed for user "UUU"
real 0m0.874s - fast!
user 0m0.021s
sys 0m0.016s
1st psql request
[2022-03-08 20:11:41] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:11:41] debug2: fd 7 setting TCP_NODELAY
[2022-03-08 20:11:41] debug2: fd 7 setting O_NONBLOCK
[2022-03-08 20:11:41] debug1: channel 3: new [direct-tcpip]
Subsecond response to request
[2022-03-08 20:11:41] debug2: channel 3: open confirm rwindow 2097152 rmax 32768
[2022-03-08 20:11:42] debug2: channel 3: read<=0 rfd 7 len 0
...
2nd psql request
[2022-03-08 20:11:42] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:11:42] debug1: channel 4: new [direct-tcpip]
[2022-03-08 20:11:42] debug2: channel 4: open confirm rwindow 2097152 rmax 32768
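I've searched for others with this problem but haven't found anyone discussing it. I tried the suggestions at https://jrs-s.net/2017/07/01/slow-ssh-logins/ and set IPQoS 0x00 to rule out any potential router issues.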
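Answer 1
The problem was with the Aurora PostgreSQL Serverless service I was tunneling to. The default setting for Serverless is to pause the cluster after 5 minutes of inactivity. When a new connection arrives, the service takes around 30 seconds to resume.
So the slow connection after 5 minutes was the serverless service resuming, not an SSH problem :-/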
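One way to confirm this diagnosis is to turn off the pause and repeat the idle test. The following is a rough sketch, assuming the cluster is Aurora Serverless v1 (the variant configured through ScalingConfiguration), that boto3 is installed, and that AWS credentials are already set up. The helper names autopause_request and apply_autopause and the cluster identifier are hypothetical; whether a partial ScalingConfiguration is accepted for your cluster is also an assumption worth checking against the RDS documentation.

```python
def autopause_request(cluster_id, auto_pause=False, seconds_until_pause=300):
    """Build keyword arguments for the RDS ModifyDBCluster call.

    seconds_until_pause is only sent when auto_pause is enabled; 300 seconds
    matches the 5-minute default described above.
    """
    scaling = {"AutoPause": auto_pause}
    if auto_pause:
        scaling["SecondsUntilAutoPause"] = seconds_until_pause
    return {
        "DBClusterIdentifier": cluster_id,
        "ScalingConfiguration": scaling,
    }


def apply_autopause(cluster_id, **options):
    """Send the change to AWS. Needs boto3 and credentials, so it is not
    called at import time."""
    import boto3  # third-party dependency, imported lazily

    rds = boto3.client("rds")
    return rds.modify_db_cluster(**autopause_request(cluster_id, **options))


# Hypothetical usage (cluster name is a placeholder):
#   apply_autopause("my-aurora-cluster")                      # disable pausing
#   apply_autopause("my-aurora-cluster", auto_pause=True,
#                   seconds_until_pause=3600)                 # pause after 1 hour
print(autopause_request("my-aurora-cluster"))
```

If the 30-second hang disappears with pausing disabled, the tunnel configuration can be left alone. Note that a cluster that never pauses keeps accruing capacity charges while idle.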