Recovering a Cisco PIX 515E from a failed-over state

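Our data center lost power last week. Since then, our dual PIX 515Es running IOS 7.0(8) (configured with a failover cable) have been in a failed-over state, with the secondary unit active and the primary unit in standby. I have tried "failover reset", "failover active", and "failover reload-standby", and I have reloaded both units in various orders, but they have not returned to primary/active, secondary/standby. The only thing I haven't tried is driving to the data center and doing a hard reboot, which I would hate to do.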

I have read "How Failover Works on the Cisco Secure PIX Firewall", and it looks like this should be very straightforward.

Output of "show failover" on the primary:

Failover On 
Cable status: Normal
Failover unit Primary
Failover LAN Interface: N/A - Serial-based failover enabled
Unit Poll frequency 15 seconds, holdtime 45 seconds
Interface Poll frequency 15 seconds
Interface Policy 1
Monitored Interfaces 2 of 250 maximum
Version: Ours 7.0(8), Mate 7.0(8)
Last Failover at: 02:52:05 UTC Mar 10 2010
        This host: Primary - Standby Ready 
                Active time: 0 (sec)
                Interface outside (x.x.x.165): Normal 
                Interface inside (y.y.y.3): Normal 
        Other host: Secondary - Active 
                Active time: 897045 (sec)
                Interface outside (x.x.x.164): Normal 
                Interface inside (y.y.y.4): Normal 

Stateful Failover Logical Update Statistics
        Link : Unconfigured.

Output of "show failover" on the secondary:

Failover On 
Cable status: Normal
Failover unit Secondary
Failover LAN Interface: N/A - Serial-based failover enabled
Unit Poll frequency 15 seconds, holdtime 45 seconds
Interface Poll frequency 15 seconds
Interface Policy 1
Monitored Interfaces 2 of 250 maximum
Version: Ours 7.0(8), Mate 7.0(8)
Last Failover at: 02:03:04 UTC Feb 28 2010
        This host: Secondary - Active 
                Active time: 896925 (sec)
                Interface outside (x.x.x.164): Normal 
                Interface inside (y.y.y.4): Normal 
        Other host: Primary - Standby Ready 
                Active time: 0 (sec)
                Interface outside (x.x.x.165): Normal 
                Interface inside (y.y.y.3): Normal 

Stateful Failover Logical Update Statistics
        Link : Unconfigured.
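
To compare the two outputs above without reading them line by line, the role and state of each unit can be pulled out with a short script. This is only a sketch: the function name `parse_show_failover` and the trimmed sample text are made up for illustration, and it assumes the "This host" / "Other host" lines always have the shape shown in the output above.

```python
import re

# Matches lines such as "This host: Primary - Standby Ready"
# (format assumed from the output pasted in the question).
HOST_LINE = re.compile(
    r"^\s*(This host|Other host):\s*(Primary|Secondary)\s*-\s*(.+?)\s*$"
)

def parse_show_failover(text):
    """Return {'This host': {...}, 'Other host': {...}} with role and state."""
    result = {}
    for line in text.splitlines():
        match = HOST_LINE.match(line)
        if match:
            which, role, state = match.groups()
            result[which] = {"role": role, "state": state}
    return result

# Trimmed copy of the primary's output from the question.
sample = """\
Failover On
Cable status: Normal
Failover unit Primary
        This host: Primary - Standby Ready
                Active time: 0 (sec)
        Other host: Secondary - Active
                Active time: 897045 (sec)
"""

states = parse_show_failover(sample)
for which, info in states.items():
    print(which, info["role"], info["state"])
```

Run against both units, this makes it easy to confirm that each side agrees on who is active and who is standby.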

I see the following in my syslogs:

Mar 10 03:05:00 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command. 
Mar 10 03:05:09 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reload-standby' command. 
Mar 10 03:05:12 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=20,my=Active,peer=Failed. 
Mar 10 03:05:12 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed. 
Mar 10 03:06:09 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=0,my=Active,peer=Failed. 
Mar 10 03:06:09 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down. 
Mar 10 03:06:09 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=1,my=Active,peer=Failed. 
Mar 10 03:06:10 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is up. 
Mar 10 03:06:10 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=411,op=2,my=Active,peer=Failed. 
Mar 10 03:06:23 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=80,my=Active,peer=Standby Ready. 
Mar 10 03:06:23 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready. 
Mar 10 03:06:24 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready. 
Mar 10 03:07:05 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command. 
Mar 10 03:07:31 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover active' command. 
Mar 10 03:08:04 fw1 %PIX-5-611103: User logged out: Uname: enable_1 
Mar 10 03:08:04 fw1 %PIX-6-315011: SSH session from admin1_int on interface inside for user "pix" terminated normally 
Mar 10 03:08:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=20,my=Active,peer=Failed. 
Mar 10 03:08:39 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed. 
Mar 10 03:09:10 fw1 %PIX-6-605005: Login permitted from admin1_int/36891 to inside:192.168.4.4/ssh for user "pix" 
Mar 10 03:09:23 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command. 
Mar 10 03:09:38 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=0,my=Active,peer=Failed. 
Mar 10 03:09:39 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down. 
Mar 10 03:09:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=1,my=Active,peer=Failed. 
Mar 10 03:09:39 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is up. 
Mar 10 03:09:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=411,op=2,my=Active,peer=Failed. 
Mar 10 03:09:52 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=80,my=Active,peer=Standby Ready. 
Mar 10 03:09:52 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready. 
Mar 10 03:09:53 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready.
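
One way to make the log above easier to read is to keep only the "My state" and "Peer state" lines, which show how each unit reports the pair's status over time. The sketch below assumes the line format is exactly as pasted above; the names `STATE_RE` and `state_changes` are hypothetical, and the script does not interpret the event or op codes.

```python
import re

# Matches the "HA status callback: My/Peer state ..." lines from the log above.
STATE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s%PIX-\d-\d+:\s"
    r"\((?P<unit>VPN-\w+)\)\sHA status callback:\s"
    r"(?P<who>My|Peer) state (?P<state>.+?)\.\s*$"
)

def state_changes(lines):
    """Return (timestamp, host, unit, who, state) for each state report."""
    found = []
    for line in lines:
        match = STATE_RE.match(line)
        if match:
            found.append(match.group("ts", "host", "unit", "who", "state"))
    return found

# Four lines copied from the log in the question.
sample = [
    "Mar 10 03:05:12 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed.",
    "Mar 10 03:06:09 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down.",
    "Mar 10 03:06:23 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready.",
    "Mar 10 03:06:24 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready.",
]

changes = state_changes(sample)
for change in changes:
    print(*change)
```

In the pasted log, this filter would show the peer going from Failed to Standby Ready after each reset or reload, with no line in which the primary reports itself as Active.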

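I'm not sure how to interpret that syslog data. The primary doesn't even seem to try to become active. When I reload the units individually, my connections are retained, so there doesn't appear to be a real hardware failure. Is there anything I can query (IOS or SNMP) to check for hardware problems?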

Any ideas? My IOS skills are weak.

Thanks for any help,
Aaron

Answer 1

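Please do not use the "no failover" command mentioned by natacado. Instead, use "no failover active" on the secondary (currently active) firewall. The first command turns failover off; the second hands the active role to the other firewall in the HA pair. If you do run "failover active", run it on the primary (currently standby) firewall.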
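
The rule in this answer comes down to which unit you are logged in to. As a minimal sketch (the helper name `switchover_command` is hypothetical, and it only encodes the advice given above, not any Cisco API), it can be written as:

```python
def switchover_command(local_state):
    """Pick the command to run on the local unit, per the advice above.

    local_state is the state shown on the "This host" line of
    "show failover", for example "Active" or "Standby Ready".
    """
    state = local_state.strip().lower()
    if state == "active":
        # Active unit: give up the active role to the peer.
        return "no failover active"
    if state.startswith("standby"):
        # Standby unit: take over the active role.
        return "failover active"
    raise ValueError(f"unexpected failover state: {local_state!r}")

# Secondary is currently active, primary is currently standby.
print("on secondary:", switchover_command("Active"))
print("on primary:", switchover_command("Standby Ready"))
```

Either command should be followed by "show failover" on both units to check that the roles actually swapped.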

I don't believe the PIX offers a feature that allows automatic preemption once the primary firewall is ready to handle traffic again.

Answer 2

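Please post your failover configuration ("show run failover"). Alternatively, try enabling preemption (you will need to specify manually which unit is primary and which is secondary).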

Answer 3

At least on ASA5500-series devices, you need to run the following command on VPN-Primary:

no failover

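This should also work on a PIX with a relatively recent OS. Essentially, think of "failover" as a command that tells the unit to try to make the secondary unit active; like many configuration commands, "no failover" removes that action.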

Answer 4

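For what it's worth, the only way we were able to resolve this was to power down both firewalls and then bring them back up in the correct order. None of the suggestions above fixed my problem. Still, thank you all for your time and help.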