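I have a problem: two of my application servers take about an hour to reboot, every time. I suspect an fsck runs on each boot, but I can't find much in the logs to help debug it.
In dmesg I see the following: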
[ 8.081130] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[ 10.776403] tg3 0000:02:00.0 eth0: Link is up at 1000 Mbps, full duplex
[ 10.776409] tg3 0000:02:00.0 eth0: Flow control is off for TX and off for RX
[ 10.776413] tg3 0000:02:00.0 eth0: EEE is enabled
[ 10.776448] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 11.771117] tg3 0000:02:00.1 eth1: Link is up at 1000 Mbps, full duplex
[ 11.771124] tg3 0000:02:00.1 eth1: Flow control is off for TX and off for RX
[ 11.771127] tg3 0000:02:00.1 eth1: EEE is enabled
[ 11.771173] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 3171.529473] FS-Cache: Loaded
[ 3171.549333] RPC: Registered named UNIX socket transport module.
[ 3171.549334] RPC: Registered udp transport module.
[ 3171.549335] RPC: Registered tcp transport module.
[ 3171.549335] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 3171.566048] init: failsafe main process (1506) killed by TERM signal
[ 3171.576544] FS-Cache: Netfs 'nfs' registered for caching
[ 3171.685710] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[ 3171.871112] audit_printk_skb: 6 callbacks suppressed
The jump in the timestamps, from about 11 seconds to about 3171 seconds, matches how long the boot takes.
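For longer logs, the stall can be located mechanically instead of by eye. The following is a minimal sketch, assuming each line starts with the usual `[ seconds.microseconds]` dmesg prefix; the function name `largest_gap` and the `SAMPLE` list are illustrative and not part of any tool.

```python
import re

# Matches the leading "[ seconds.micros]" timestamp that dmesg prints.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the biggest jump.

    Lines without a timestamp are skipped. If fewer than two timestamped
    lines are seen, the result is (0.0, None, None).
    """
    best = (0.0, None, None)
    prev_time, prev_line = None, None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        now = float(match.group(1))
        if prev_time is not None and now - prev_time > best[0]:
            best = (now - prev_time, prev_line, line)
        prev_time, prev_line = now, line
    return best


# Two lines copied from the log above, used only as a demonstration.
SAMPLE = [
    "[ 11.771173] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready",
    "[ 3171.529473] FS-Cache: Loaded",
]

gap, before, after = largest_gap(SAMPLE)
print(f"largest gap: {gap:.1f} s ({gap / 60:.1f} min)")
print("before:", before)
print("after: ", after)
```

On the two sample lines this reports a gap of roughly 3160 seconds, a little under 53 minutes. The function accepts any iterable of strings, so an open file containing saved dmesg output works as well.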
Does anyone know where I should start investigating?
The machines are Dell PowerEdge R420s running Ubuntu 14.04.
lsblk:
NAME                   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                        8:0  0 558.4G  0 disk
├─sda1                     8:1  0   243M  0 part /boot
├─sda2                     8:2  0     1K  0 part
└─sda5                     8:5  0 558.1G  0 part
  ├─app2-root (dm-0)     252:0  0 542.2G  0 lvm  /
  └─app2-swap_1 (dm-1)   252:1  0  15.9G  0 lvm  [SWAP]
sr0                       11:0  1  1024M  0 rom
fstab:
proc /proc proc nodev,noexec,nosuid 0 0
/dev/mapper/app2-root / ext4 errors=remount-ro 0 1
UUID=8a2c24e5-17ba-4992-82a1-68b9609b6983 /boot ext2 defaults 0 2
/dev/mapper/hd1app2-swap_1 none swap sw 0 0
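Edit
It turned out this had nothing to do with fsck. The problem was that /tmp held gigabytes of data, which was being cleared on every boot. Note the gap between 11.771173 and 3171.529473 in the log. Unfortunately, dmesg shows nothing useful for this particular case.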
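Since dmesg gives no hint, one way to check for this condition before a reboot is to measure /tmp directly. Below is a minimal sketch, assuming the script can read the directories it visits; the helper name `tree_size` is hypothetical. It also reports the file count, because the time needed to clear a directory can depend on the number of entries as well as on the total size.

```python
import os
import stat


def tree_size(root):
    """Return (total_bytes, file_count) for regular files under root.

    Symbolic links are not followed, and entries that vanish or cannot
    be read while walking are skipped silently.
    """
    total, count = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file removed or unreadable; ignore it
            if stat.S_ISREG(info.st_mode):
                total += info.st_size
                count += 1
    return total, count


if __name__ == "__main__":
    size, files = tree_size("/tmp")
    print(f"/tmp: {size / 1024 ** 3:.2f} GiB in {files} files")
```

If the totals are large, clearing /tmp while the system is running, or finding out which application leaves the data there, should avoid paying that cost during boot.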