Some mornings, a Debian VM consumes a lot of CPU and becomes unresponsive

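Some mornings, usually between 6:30 and 8:30, my VM locks up and even causes collateral damage to the VMware Server host itself. When this happens, I cannot SSH into either the VM or the host.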

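I believe I have narrowed it down to the mlocate job in cron.daily. But of course there should be nothing wrong with that cron job, so I have a bigger problem on my hands that I cannot identify. For what it's worth, this machine has a very limited amount of RAM, only 384MB. That may be unrealistic, but it exceeds Debian's requirements, and I know the system isn't doing much else when the problem occurs.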

Here is some of what I am getting in the messages log:

Jul 18 08:30:02 core kernel: [607607.955528] updatedb.mloc D ddadc12f     0  3274   3270
Jul 18 08:30:02 core kernel: [607607.955615]        d746ece0 00000082 0011caef ddadc12f 000221d2 d746ee6c c1309fc0 00000000
Jul 18 08:30:02 core kernel: [607607.955692]        d60c3b4c 01142a38 07a53f31 00000000 01142a38 d60c3b4c 01142a38 c6ae3d3c
Jul 18 08:30:02 core kernel: [607607.955709]        c1309fc0 00f4f000 c6ae3d3c c1300e28 c02b9048 c6ae3d34 00000000 c0190d2e
Jul 18 08:30:02 core kernel: [607607.955723] Call Trace:
Jul 18 08:30:02 core kernel: [607607.956038]  [<c02b9048>] io_schedule+0x49/0x80
Jul 18 08:30:02 core kernel: [607607.956472]  [<c0190d2e>] sync_buffer+0x30/0x33
Jul 18 08:30:02 core kernel: [607607.956511]  [<c02b9236>] __wait_on_bit+0x33/0x58
Jul 18 08:30:02 core kernel: [607607.956515]  [<c0190cfe>] sync_buffer+0x0/0x33
Jul 18 08:30:02 core kernel: [607607.956524]  [<c0190cfe>] sync_buffer+0x0/0x33
Jul 18 08:30:02 core kernel: [607607.956527]  [<c02b92ba>] out_of_line_wait_on_bit+0x5f/0x67
Jul 18 08:30:02 core kernel: [607607.956533]  [<c0131a91>] wake_bit_function+0x0/0x3c
Jul 18 08:30:02 core kernel: [607607.956583]  [<c0190cca>] __wait_on_buffer+0x16/0x18
Jul 18 08:30:02 core kernel: [607607.956593]  [<d89b153d>] ext3_find_entry+0x37a/0x515 [ext3]
Jul 18 08:30:02 core kernel: [607607.957163]  [<c01bae24>] security_inode_alloc+0x16/0x17
Jul 18 08:30:02 core kernel: [607607.957192]  [<c0184900>] alloc_inode+0x12e/0x186
Jul 18 08:30:02 core kernel: [607607.957210]  [<c0184ce9>] iget_locked+0x5b/0x100
Jul 18 08:30:02 core kernel: [607607.957217]  [<d89b2bea>] ext3_lookup+0x21/0x9b [ext3]
Jul 18 08:30:02 core kernel: [607607.957228]  [<c017aac3>] do_lookup+0xb6/0x153
Jul 18 08:30:13 core kernel: [607607.957233]  [<c017c6c4>] __link_path_walk+0x726/0xb26
Jul 18 08:30:13 core kernel: [607607.957239]  [<c0186f4c>] mntput_no_expire+0x13/0xd9
Jul 18 08:30:13 core kernel: [607607.957243]  [<c017cafb>] path_walk+0x37/0x70
Jul 18 08:30:13 core kernel: [607607.957247]  [<c017cdaa>] do_path_lookup+0x122/0x184
Jul 18 08:30:13 core kernel: [607607.957251]  [<c017d607>] __user_walk_fd+0x29/0x3a
Jul 18 08:30:13 core kernel: [607607.957255]  [<c0177625>] vfs_lstat_fd+0x12/0x39
Jul 18 08:30:13 core kernel: [607607.957276]  [<c01776b9>] sys_lstat64+0xf/0x23
Jul 18 08:30:13 core kernel: [607607.957283]  [<c0103857>] sysenter_past_esp+0x78/0xb1
Jul 18 08:30:13 core kernel: [607607.957344]  =======================

More recently:

Jun 30 07:44:11 core kernel: [2065298.377450] ionice        D 299741d5     0 32588  32441
Jun 30 07:44:11 core kernel: [2065298.377515]        ce11a5e0 00000086 02a1416f 299741d5 000755a5 ce11a76c c1209fc0 00000000
Jun 30 07:44:11 core kernel: [2065298.377578]        c38d5f6c 058eebe6 003d2086 00000000 058eebe6 c38d5f6c 058eebe6 c3b9fd08
Jun 30 07:44:11 core kernel: [2065298.377598]        c1209fc0 00e4f000 c3b9fd08 c12001cc c02b9048 c3b9fd00 00000000 c0190d2e
Jun 30 07:44:11 core kernel: [2065298.377612] Call Trace:
Jun 30 07:44:11 core kernel: [2065298.378275]  [<c02b9048>] io_schedule+0x49/0x80
Jun 30 07:44:11 core kernel: [2065298.379280]  [<c0190d2e>] sync_buffer+0x30/0x33
Jun 30 07:44:11 core kernel: [2065298.379325]  [<c02b9236>] __wait_on_bit+0x33/0x58
Jun 30 07:44:11 core kernel: [2065298.379331]  [<c0190cfe>] sync_buffer+0x0/0x33
Jun 30 07:44:11 core kernel: [2065298.379338]  [<c0190cfe>] sync_buffer+0x0/0x33
Jun 30 07:44:11 core kernel: [2065298.379342]  [<c02b92ba>] out_of_line_wait_on_bit+0x5f/0x67
Jun 30 07:44:11 core kernel: [2065298.379348]  [<c0131a91>] wake_bit_function+0x0/0x3c
Jun 30 07:44:11 core kernel: [2065298.379399]  [<c0190cca>] __wait_on_buffer+0x16/0x18
Jun 30 07:44:12 core kernel: [2065298.379415]  [<d09af08d>] ext3_bread+0x44/0x5b [ext3]
Jun 30 07:44:12 core kernel: [2065298.379680]  [<d09b0f50>] dx_probe+0x3a/0x2ad [ext3]
Jun 30 07:44:12 core kernel: [2065298.379692]  [<c01e046c>] rb_insert_color+0x4c/0xad
Jun 30 07:44:12 core kernel: [2065298.379741]  [<d09b1280>] ext3_find_entry+0xbd/0x515 [ext3]
Jun 30 07:44:12 core kernel: [2065298.379753]  [<c01344ec>] hrtimer_start+0xf7/0x110
Jun 30 07:44:12 core kernel: [2065298.379760]  [<c01361e0>] getnstimeofday+0x37/0xbc
Jun 30 07:44:12 core kernel: [2065298.379765]  [<c0134658>] ktime_get_ts+0x22/0x49
Jun 30 07:44:12 core kernel: [2065298.379769]  [<c0155174>] delayacct_end+0x70/0x77
Jun 30 07:44:12 core kernel: [2065298.379788]  [<c0156aee>] sync_page+0x0/0x36
Jun 30 07:44:12 core kernel: [2065298.379803]  [<c0155249>] __delayacct_blkio_end+0x56/0x59
Jun 30 07:44:12 core kernel: [2065298.379810]  [<c02b9063>] io_schedule+0x64/0x80
Jun 30 07:44:12 core kernel: [2065298.379816]  [<d09b2bea>] ext3_lookup+0x21/0x9b [ext3]
Jun 30 07:44:12 core kernel: [2065298.379827]  [<c017aac3>] do_lookup+0xb6/0x153
Jun 30 07:44:12 core kernel: [2065298.379847]  [<c017c6c4>] __link_path_walk+0x726/0xb26
Jun 30 07:44:12 core kernel: [2065298.379852]  [<c0131a49>] __wake_up_bit+0x29/0x2e
Jun 30 07:44:12 core kernel: [2065298.379857]  [<c01621a6>] __do_fault+0x30e/0x34d
Jun 30 07:44:12 core kernel: [2065298.379863]  [<c017cafb>] path_walk+0x37/0x70
Jun 30 07:44:12 core kernel: [2065298.379867]  [<c017cdaa>] do_path_lookup+0x122/0x184
Jun 30 07:44:12 core kernel: [2065298.379872]  [<c017d78c>] __path_lookup_intent_open+0x42/0x72
Jun 30 07:44:12 core kernel: [2065298.379878]  [<c017d80b>] path_lookup_open+0xf/0x13
Jun 30 07:44:12 core kernel: [2065298.379882]  [<c0177c98>] open_exec+0x1d/0x94
Jun 30 07:44:12 core kernel: [2065298.379900]  [<c0164be3>] free_pgtables+0x86/0x93
Jun 30 07:44:12 core kernel: [2065298.379906]  [<c0182b46>] dput+0x25/0xbb
Jun 30 07:44:12 core kernel: [2065298.379912]  [<c0178d13>] do_execve+0x48/0x1c6
Jun 30 07:44:12 core kernel: [2065298.379917]  [<c010213b>] sys_execve+0x2a/0x4a
Jun 30 07:44:12 core kernel: [2065298.379944]  [<c0103857>] sysenter_past_esp+0x78/0xb1
Jun 30 07:44:12 core kernel: [2065298.379984]  =======================

I should point out that ionice is in fact used by the mlocate cron job.
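In case it helps anyone compare several of these dumps: the single letter after the command name in the header line is the task state, and D means uninterruptible sleep, which fits the io_schedule frames at the top of both traces. Below is a minimal Python sketch that pulls those header lines out of a log. The function name `blocked_tasks` is made up for illustration, and the field layout (hex value, a free-stack number, PID, parent PID) is an assumption based on how the header looks in the excerpts above; other kernel versions may print it differently.

```python
import re

# Matches the task header line printed before a call trace, for example:
# "... kernel: [607607.955528] updatedb.mloc D ddadc12f     0  3274   3270"
# Assumed layout: command, state letter, hex value, free stack, PID, parent PID.
TASK_LINE = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\]\s+(?P<comm>\S+)\s+(?P<state>[A-Z])\s+"
    r"[0-9a-f]+\s+\d+\s+(?P<pid>\d+)\s+(?P<ppid>\d+)\s*$"
)


def blocked_tasks(lines):
    """Return (uptime_seconds, command, pid, parent_pid) for each task shown in state D."""
    found = []
    for line in lines:
        match = TASK_LINE.search(line)
        if match and match.group("state") == "D":
            found.append((
                float(match.group("uptime")),
                match.group("comm"),
                int(match.group("pid")),
                int(match.group("ppid")),
            ))
    return found


if __name__ == "__main__":
    # Two lines copied from the log excerpts above; the second is a trace
    # frame and is skipped because it is not a task header.
    sample = [
        "Jun 30 07:44:11 core kernel: [2065298.377450] ionice        D 299741d5     0 32588  32441",
        "Jun 30 07:44:11 core kernel: [2065298.378275]  [<c02b9048>] io_schedule+0x49/0x80",
    ]
    for uptime, comm, pid, ppid in blocked_tasks(sample):
        print("%s (pid %d, parent %d) blocked at uptime %.0fs" % (comm, pid, ppid, uptime))
```

Feeding it the whole messages file (for example `blocked_tasks(open(path))`) would give a quick list of which commands were stuck and when, which makes it easier to see whether it is always updatedb.mlocate and ionice or whether other processes show up too.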

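Edit: The problem seems to be sporadic. It takes the machine down completely maybe once a week, but it also seems to get worse as uptime increases. I really don't want to blame the cron job, because I run Debian lenny on nearly every server I install and support, and nothing here is unusual. Could it be a memory leak? I say it "worsens" with uptime because I run Nagios on the VMware host, and typically after 4-6 days I start getting load warnings lasting a minute in the morning, then warnings lasting two minutes the following day. I keep trying to log in remotely while it is happening, but I just can't connect to the guest VM at that point to see what else is going on.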
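Since logging in during the event is not possible, one option is to leave a small sampler running inside the guest so there is something to read afterwards. The following is only a sketch under assumptions: the helper names (`parse_meminfo`, `parse_stat`, `snapshot`, `log_samples`) and the log path in the usage note are hypothetical, and it relies on the standard Linux /proc files (loadavg, meminfo, and each process's stat file). One caveat: if disk I/O itself is what is stuck, the write to the log file may block as well, so the last few samples could be missing.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> numeric value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def parse_stat(text):
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the last ")" rather than on whitespace.
    left, _, right = text.rpartition(")")
    pid, _, comm = left.partition(" (")
    return int(pid), comm, right.split()[0]


def snapshot(proc="/proc"):
    """Build one log line: timestamp, load averages, free memory/swap, tasks in state D."""
    with open(os.path.join(proc, "loadavg")) as f:
        load = f.read().split()[:3]
    with open(os.path.join(proc, "meminfo")) as f:
        mem = parse_meminfo(f.read())
    blocked = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if state == "D":
            blocked.append("%s(%d)" % (comm, pid))
    return "%s load=%s memfree_kb=%s swapfree_kb=%s blocked=%s" % (
        time.strftime("%Y-%m-%d %H:%M:%S"),
        "/".join(load),
        mem.get("MemFree"),
        mem.get("SwapFree"),
        ",".join(blocked) or "-",
    )


def log_samples(path, interval=30):
    """Append one snapshot line to `path` every `interval` seconds until interrupted."""
    while True:
        with open(path, "a") as f:
            f.write(snapshot() + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the line to disk before the next sleep
        time.sleep(interval)
```

Calling something like `log_samples("/var/log/morning-load.log")` (path chosen only as an example) before the 6:30 window would record how free memory and swap behave in the minutes leading up to the lockup. If free swap drops steadily day after day, that would support the memory-leak suspicion; if memory looks fine and only the blocked list grows, the storage side is the more likely suspect.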

Answer 1

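Perhaps mlocate is the symptom rather than the cause. Are there any other cron jobs on the server? Try removing them (if they are not really required) so that only mlocate remains, and see whether it happens again. Are any filesystems mounted on the server?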
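To answer both questions quickly, something like the following Python sketch could list the scripts in the usual Debian cron directories and the filesystems currently mounted. The function names are hypothetical, the directory list covers only the system-wide locations (per-user crontabs and /etc/crontab itself are not inspected), and the set of network filesystem types is an illustrative assumption, not an exhaustive list.

```python
import os

# System-wide cron locations on a typical Debian install.
CRON_DIRS = ["/etc/cron.d", "/etc/cron.hourly", "/etc/cron.daily",
             "/etc/cron.weekly", "/etc/cron.monthly"]

# Filesystem types that usually mean a remote mount (illustrative, not complete).
NETWORK_FS = {"nfs", "nfs4", "cifs", "smbfs"}


def parse_mounts(text):
    """Parse /proc/mounts text into a list of (device, mount_point, fs_type)."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[0], fields[1], fields[2]))
    return mounts


def cron_scripts(directory):
    """Return the sorted names of regular files in a cron directory, or [] if unreadable."""
    try:
        names = sorted(os.listdir(directory))
    except OSError:
        return []
    return [n for n in names if os.path.isfile(os.path.join(directory, n))]


def report():
    """Print cron scripts per directory, then mounted filesystems with remote ones flagged."""
    for directory in CRON_DIRS:
        print("%s: %s" % (directory, ", ".join(cron_scripts(directory)) or "(none)"))
    try:
        with open("/proc/mounts") as f:
            mounts = parse_mounts(f.read())
    except OSError:
        mounts = []
    for device, mount_point, fs_type in mounts:
        flag = "  <-- network filesystem" if fs_type in NETWORK_FS else ""
        print("%s on %s type %s%s" % (device, mount_point, fs_type, flag))


if __name__ == "__main__":
    report()
```

The reason for flagging remote mounts is that updatedb walks the directory tree, so a large or slow mounted filesystem that is not excluded from its scan could account for long I/O waits like the ones in the traces.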
