I am trying to set up a simple ppp tunnel over ssh. It works fine on several machines, but on one machine pppd gets "stuck":
> pgrep pppd | xargs ps up
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 4178 0.0 0.1 3020 1088 pts/1 Ds+ 05:28 0:00 /usr/sbin/pppd
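Any attempt to kill it (even sudo kill -9 4178) has no effect that I can see. strace -p 4178 just hangs as well. Some time after it is started, I begin to get dmesg messages like the ones at the end of this post.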
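For what it's worth, the D in the STAT column of the ps output above stands for uninterruptible sleep. Here is a minimal sketch of how such processes could be listed together with the kernel function each one is waiting in. It assumes procps ps (which accepts the wchan output field), and the helper names filter_dstate and list_dstate are made up for illustration:

```shell
# Keep the header line plus every row whose STAT field (column 2)
# starts with "D" (uninterruptible sleep). Reads ps output on stdin.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Hypothetical helper: list D-state processes along with WCHAN, the
# kernel function each one is sleeping in. Assumes procps ps.
list_dstate() {
    ps -eo pid,stat,wchan:32,comm | filter_dstate
}
```

Running list_dstate on the affected machine while pppd is stuck should show the pppd row; the filter is kept separate so it can also be applied to saved ps output.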
It is started from another machine with:
ssh -t root@server /usr/sbin/pppd passive noauth
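When I do this against one of the working machines, the remote pppd spews garbage/binary data to the console (as expected). When I do this against the failing machine, I get no output from pppd, but the ssh session does eventually time out. If I instead ssh to the machine and then run /usr/sbin/pppd passive noauth as a separate step, I get the expected binary output there as well.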
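I now have several questions:
- What might be wrong on the machine where pppd fails? I don't even know where to start looking...
- What is the difference between running ssh -t root@server /usr/sbin/pppd passive noauth in one step and running ssh root@server followed by /usr/sbin/pppd passive noauth in two steps?
- Why can't the process be killed, even with sudo kill -9? The only way I know of to get rid of it is to reboot.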
(I have tried searching for something similar but found nothing, so sorry that I don't have more clues.)
Any ideas?
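The problem machine runs Debian on VMware "hardware" (as do the working ones). The problem also shows up on a clone of the machine, and on both Debian lenny (original) and squeeze (after upgrading).
dmesg entries: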
[ 1198.727248] INFO: task pppd:4178 blocked for more than 120 seconds.
[ 1198.727507] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1198.727904] pppd D ece2dc9c 0 4178 4174 0x00000004
[ 1198.727908] 00000098 00000082 f2503520 ece2dc9c 0000b1e7 00000000 c148d1c0 c148d1c0
[ 1198.727913] f2a06100 f6e071c0 00000000 ece2dc18 f5cd07e0 00000000 ece2d400 ece2dc9d
[ 1198.727918] 00c52300 ece2dcbc f67bfef8 ec98e480 f291cec0 00000000 c10cf5b0 c10dfd21
[ 1198.727923] Call Trace:
[ 1198.727926] [<c10cf5b0>] ? nameidata_to_filp+0x37/0x41
[ 1198.727929] [<c10dfd21>] ? dput+0x21/0xb7
[ 1198.727932] [<c11cfecc>] ? tty_ldisc_ref_wait+0x5f/0x76
[ 1198.727935] [<c104de7a>] ? wake_up_bit+0x5c/0x5c
[ 1198.727938] [<c11cb91b>] ? tty_ioctl+0x85f/0x8ba
[ 1198.727941] [<c10fec18>] ? do_lock_file_wait+0x3d/0xd9
[ 1198.727944] [<c1162c97>] ? _copy_from_user+0x2b/0x102
[ 1198.727946] [<c11cb0bc>] ? tty_check_change+0xb9/0xb9
[ 1198.727949] [<c10dbeb7>] ? do_vfs_ioctl+0x485/0x4c7
[ 1198.727952] [<c10db59a>] ? do_fcntl+0x24f/0x3a2
[ 1198.727954] [<c10dbf3a>] ? sys_ioctl+0x41/0x58
[ 1198.727957] [<c12c6a1f>] ? sysenter_do_call+0x12/0x28
[ 1318.457225] INFO: task sshd:4174 blocked for more than 120 seconds.
[ 1318.457500] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1318.457896] sshd D f25024cc 0 4174 2393 0x00000000
[ 1318.457901] 00000098 00000086 f2a06940 f25024cc 0000b246 00000000 c148d1c0 c148d1c0
[ 1318.457906] f2503520 f6e071c0 00000000 3f056585 0000000f ece2d4bc 3f056585 f2503520
[ 1318.457911] ec98bb38 ec98bbdc 00000000 00000000 00000000 c12c09b5 f2503520 c10327cb
[ 1318.457916] Call Trace:
[ 1318.457926] [<c12c09b5>] ? schedule_hrtimeout_range_clock+0x3c/0xd9
[ 1318.457931] [<c10327cb>] ? try_to_wake_up+0x13f/0x13f
[ 1318.457935] [<c11cfecc>] ? tty_ldisc_ref_wait+0x5f/0x76
[ 1318.457940] [<c104de7a>] ? wake_up_bit+0x5c/0x5c
[ 1318.457943] [<c11c9ad3>] ? tty_poll+0x32/0x5e
[ 1318.457947] [<c10dd4d5>] ? do_select+0x2a1/0x42e
[ 1318.457950] [<c10dcb83>] ? poll_freewait+0x69/0x69
[ 1318.457953] [<c10dcc25>] ? __pollwait+0xa2/0xa2
[ 1318.457955] [<c10dcc25>] ? __pollwait+0xa2/0xa2
[ 1318.457958] [<c10dcc25>] ? __pollwait+0xa2/0xa2
[ 1318.457960] [<c10dcc25>] ? __pollwait+0xa2/0xa2
[ 1318.457963] [<c10dcc25>] ? __pollwait+0xa2/0xa2
[ 1318.457965] [<c10dcc25>] ? __pollwait+0xa2/0xa2
[ 1318.457968] [<c10dcc25>] ? __pollwait+0xa2/0xa2
[ 1318.457971] [<c10429c2>] ? lock_timer_base+0x19/0x35
[ 1318.457974] [<c1042eb5>] ? __mod_timer+0x10c/0x116
[ 1318.457977] [<c1042f89>] ? mod_timer+0x69/0x6e
[ 1318.457981] [<c121325d>] ? sk_reset_timer+0xc/0x16
[ 1318.457984] [<c1252f57>] ? tcp_event_new_data_sent+0x66/0x6b
[ 1318.457987] [<c1255b85>] ? tcp_write_xmit+0x7a7/0x86a
[ 1318.457990] [<c121760d>] ? __alloc_skb+0x50/0xfd
[ 1318.457994] [<c12c12bc>] ? _raw_spin_lock_bh+0x8/0x1e
[ 1318.457996] [<c1212e98>] ? release_sock+0x10/0xc4
[ 1318.457999] [<c124b543>] ? tcp_sendmsg+0x6dd/0x7b7
[ 1318.458003] [<c1162c97>] ? _copy_from_user+0x2b/0x102
[ 1318.458006] [<c10dd7a0>] ? core_sys_select+0x13e/0x1c3
[ 1318.458009] [<c12102a3>] ? sock_aio_write+0xc0/0xd4
[ 1318.458012] [<c10d0655>] ? do_sync_write+0xa0/0xe4
[ 1318.458016] [<c10b141c>] ? handle_mm_fault+0x222/0x238
[ 1318.458019] [<c10f6096>] ? fsnotify+0x1de/0x1f9
[ 1318.458022] [<c10dd9e8>] ? sys_select+0x6e/0x8f
[ 1318.458024] [<c10d105e>] ? sys_write+0x3c/0x63
[ 1318.458028] [<c12c6a1f>] ? sysenter_do_call+0x12/0x28