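My server went into a failed state because the database could not write to its partitions. I found that the partitions had gone into read-only mode. In the end I had to do a hard reboot to fix it.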
Linux 2.6.18-164.el5PAE #1 SMP Tue Aug 18 15:59:11 EDT 2009 i686 i686 i386 GNU/Linux
/var/log/messages
Oct 31 00:56:45 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network
Oct 31 00:57:05 ota3g1 Had[17275]: VCS CRITICAL V-16-1-50086 CPU usage on ota3g1.mtsallstream.com is 100%
Oct 31 01:01:47 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network
Oct 31 01:06:50 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network
Oct 31 01:11:52 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network
Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x2 received Data: x2 x20 x80000 x0 x0
Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1303 Link Up Event x3 received Data: x3 x1 x10 x1 x0 x0 0
Oct 31 01:12:12 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x4 received Data: x4 x20 x80000 x0 x0
Oct 31 01:12:40 ota3g1 kernel: rport-8:0-0: blocked FC remote port time out: saving binding
Oct 31 01:12:40 ota3g1 kernel: lpfc 0000:29:00.1: 1:(0):0203 Devloss timeout on WWPN 20:25:00:a0:b8:74:f5:65 NPort x0000e4 Data: x0 x7 x0
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 38617577
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283532153
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 90825
Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-16.
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 868841
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-10.
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37759889
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283349449
Oct 31 01:12:40 ota3g1 kernel: printk: 6 messages suppressed.
Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-12.
Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-12) in ext3_reserve_inode_write: Journal has aborted
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 1545
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-16
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 12745
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-10, logical block 1545
Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-16) in ext3_reserve_inode_write: Journal has aborted
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-10
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37749121
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 0
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-12
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-12) in ext3_dirty_inode: Journal has aborted
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37757897
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 1097
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-12
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283337089
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 0
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-16
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-16) in ext3_dirty_inode: Journal has aborted
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37749121
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 0
Oct 31 01:12:41 ota3g1 kernel: lost page write due to I/O error on dm-12
Oct 31 01:12:41 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:41 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283337089
Oct 31 01:12:41 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 0
Oct 31 01:12:41 ota3g1 kernel: lost page write due to I/O error on dm-16
Oct 31 01:12:41 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cciss-root
4.9G 730M 3.9G 16% /
/dev/mapper/cciss-home
9.7G 1.2G 8.1G 13% /home
/dev/mapper/cciss-var
9.7G 494M 8.8G 6% /var
/dev/mapper/cciss-usr
15G 2.6G 12G 19% /usr
/dev/mapper/cciss-tmp
3.9G 153M 3.6G 5% /tmp
/dev/sda1 996M 43M 902M 5% /boot
tmpfs 5.9G 0 5.9G 0% /dev/shm
/dev/mapper/cciss-product
25G 16G 7.4G 68% /product
/dev/mapper/cciss-opt
20G 4.5G 14G 25% /opt
/dev/mapper/dg_db1-vol_db1_system
18G 2.2G 15G 14% /database/OTADB/sys
/dev/mapper/dg_db1-vol_db1_undo
18G 5.8G 12G 35% /database/OTADB/undo
/dev/mapper/dg_db1-vol_db1_redo
8.9G 4.3G 4.2G 51% /database/OTADB/redo
/dev/mapper/dg_db1-vol_db1_sgbd
8.9G 654M 7.8G 8% /database/OTADB/admin
/dev/mapper/dg_db1-vol_db1_arch
98G 24G 69G 26% /database/OTADB/arch
/dev/mapper/dg_db1-vol_db1_indexes
240G 14G 214G 6% /database/OTADB/index
/dev/mapper/dg_db1-vol_db1_data
275G 47G 215G 18% /database/OTADB/data
/dev/mapper/dg_dbrman-vol_db_rman
8.9G 351M 8.1G 5% /database/RMAN
/dev/mapper/dg_app1-vol_app1
151G 113G 31G 79% /files/ota
/etc/fstab
/dev/cciss/root / ext3 defaults 1 1
/dev/cciss/home /home ext3 defaults 1 2
/dev/cciss/var /var ext3 defaults 1 2
/dev/cciss/usr /usr ext3 defaults 1 2
/dev/cciss/tmp /tmp ext3 defaults 1 2
LABEL=/boot /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
/dev/cciss/swap swap swap defaults 0 0
/dev/cciss/product /product ext3 defaults 1 2
/dev/cciss/opt /opt ext3 defaults 1 2
/dev/dg_db1/vol_db1_system /database/OTADB/sys ext3 defaults 1 2
/dev/dg_db1/vol_db1_undo /database/OTADB/undo ext3 defaults 1 2
/dev/dg_db1/vol_db1_redo /database/OTADB/redo ext3 defaults 1 2
/dev/dg_db1/vol_db1_sgbd /database/OTADB/admin ext3 defaults 1 2
/dev/dg_db1/vol_db1_arch /database/OTADB/arch ext3 defaults 1 2
/dev/dg_db1/vol_db1_indexes /database/OTADB/index ext3 defaults 1 2
/dev/dg_db1/vol_db1_data /database/OTADB/data ext3 defaults 1 2
/dev/dg_dbrman/vol_db_rman /database/RMAN ext3 defaults 1 2
/dev/dg_app1/vol_app1 /files/ota ext3 defaults 1 2
Thanks for your help, everyone.
Answer 1
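Your Emulex card (log line: lpfc 0000:29:00.1: 1:1305) is seeing disconnect events on its Fibre Channel port. The link dropped, the devloss timeout expired 30 seconds later, and every I/O to sdi then failed, so the ext3 filesystems on top of it (dm-10, dm-12, dm-16) could not commit their journals and aborted them. When the connection comes back up you will probably have to fsck them. As with all hard-disconnect events there is a risk of data loss, but it should be limited to dirty pages that had not been flushed.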
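To see the chain of events at a glance, something like the following could be run over the messages file. It is only a rough sketch: the regular expressions are written against the kernel lines quoted in the question, the function name `summarize` and the `SAMPLE` list are my own, and other kernel or driver versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns written against the kernel messages quoted in the question.
LINK_DOWN = re.compile(r"lpfc (\S+): .*Link Down Event")
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector \d+")
JOURNAL_ABORT = re.compile(r"Aborting journal on device ([\w-]+)")

# A few lines copied from the log in the question, used as sample input.
SAMPLE = [
    "Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x2 received Data: x2 x20 x80000 x0 x0",
    "Oct 31 01:12:12 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x4 received Data: x4 x20 x80000 x0 x0",
    "Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 38617577",
    "Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283532153",
    "Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-16.",
    "Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-10.",
]


def summarize(lines):
    """Count link-down events and I/O errors, and list devices whose journal aborted."""
    link_down = Counter()   # HBA PCI address -> number of link-down events
    io_errors = Counter()   # block device -> number of failed requests
    aborted = []            # device-mapper names, in order of first appearance
    for line in lines:
        if m := LINK_DOWN.search(line):
            link_down[m.group(1)] += 1
        elif m := IO_ERROR.search(line):
            io_errors[m.group(1)] += 1
        elif m := JOURNAL_ABORT.search(line):
            if m.group(1) not in aborted:
                aborted.append(m.group(1))
    return {
        "link_down": dict(link_down),
        "io_errors": dict(io_errors),
        "journal_aborted": aborted,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On the real system you would pass it the lines of /var/log/messages instead of `SAMPLE`. The dm-N names in the result can be matched to logical volumes by their device-mapper minor numbers.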
As for the Oracle environment (judging by the LVM naming scheme, it looks like Oracle to me), I'm not qualified to guess how much hot water you're in.
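Before deciding which filesystems need an fsck, it may help to check which ones the kernel currently has mounted read-only. The sketch below parses text in the /proc/mounts format (device, mount point, filesystem type, options). I am assuming /proc/mounts shows the kernel's real view after a journal abort, whereas /etc/mtab may still say rw. The function name `readonly_mounts` and the sample text are made up for illustration and are not output from your server.

```python
def readonly_mounts(mounts_text, fstype="ext3"):
    """Return (device, mountpoint) pairs of the given fstype that are mounted read-only."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fs, options = fields[:4]
        # Options are comma-separated; "ro" must be a whole option, not a substring.
        if fs == fstype and "ro" in options.split(","):
            result.append((device, mountpoint))
    return result


# Hypothetical sample in /proc/mounts format (not taken from the real machine).
SAMPLE_MOUNTS = """\
/dev/mapper/cciss-root / ext3 rw,data=ordered 0 0
/dev/mapper/dg_db1-vol_db1_redo /database/OTADB/redo ext3 ro,data=ordered 0 0
tmpfs /dev/shm tmpfs rw 0 0
"""

if __name__ == "__main__":
    for device, mountpoint in readonly_mounts(SAMPLE_MOUNTS):
        print(device, mountpoint)
```

On the server itself you would call `readonly_mounts(open("/proc/mounts").read())`. Anything it lists should be unmounted before running fsck on it.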