ActiveMQ - 每2秒出现一次“无法发送,通道已失败”?

ActiveMQ - 每2秒出现一次“无法发送,通道已失败”?
  • ActiveMQ 5.7.0
  • CentOS 版本 6.3
  • Java 版本“1.6.0_26”
  • 应用程序服务器:Tomcat

在中activemq.log,我每 2 秒就会看到一次此异常:

2013-11-05 13:00:52,374 | DEBUG | Transport Connection to: tcp://127.0.0.1:37501 failed:
org.apache.activemq.transport.InactivityIOException: 
Cannot send, channel has already failed: tcp://127.0.0.1:37501 | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler
org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://127.0.0.1:37501
        at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:282)
        at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271)
        at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85)
        at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104)
        at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
        at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312)
        at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838)
        at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873)
        at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
        at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

由于这个关键词InactivityIOException,我首先想到的是不活动监视器,但奇怪的是MaxInactivityDuration=30000

2013-11-05 13:11:02,672 | DEBUG | Sending: WireFormatInfo { version=9, properties={MaxFrameSize=9223372036854775807, 
CacheSize=1024, 
CacheEnabled=true, 
SizePrefixDisabled=false, 
MaxInactivityDurationInitalDelay=10000, 
TcpNoDelayEnabled=true, 
MaxInactivityDuration=30000, 
TightEncodingEnabled=true, 
StackTraceEnabled=true}, 
magic=[A,c,t,i,v,e,M,Q]} |    org.apache.activemq.transport.WireFormatNegotiator | ActiveMQ BrokerService[localhost] Task-2

而且,我还没有看到这样的事情:

自上次阅读后未收到任何消息检查...

或者:

频道处于非活动状态的时间过长(>30000)

执行 a netstat,我看到这些连接处于 TIME_WAIT 状态:

tcp        0      0 127.0.0.1:38545             127.0.0.1:61616             TIME_WAIT   -
tcp        0      0 127.0.0.1:38544             127.0.0.1:61616             TIME_WAIT   -
tcp        0      0 127.0.0.1:38522             127.0.0.1:61616             TIME_WAIT   -

请注意,“PID/程序名称”列仅包含破折号字符。

运行时的输出如下tcpdump

Internet Protocol Version 4, Src: 127.0.0.1 (127.0.0.1), Dst: 127.0.0.1 (127.0.0.1)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
    Total Length: 296
    Identification: 0x7b6a (31594)
    Flags: 0x02 (Don't Fragment)
        0... .... = Reserved bit: Not set
        .1.. .... = Don't fragment: Set
        ..0. .... = More fragments: Not set
    Fragment offset: 0
    Time to live: 64
    Protocol: TCP (6)
    Header checksum: 0xc063 [correct]
        [Good: True]
        [Bad: False]
    Source: 127.0.0.1 (127.0.0.1)
    Destination: 127.0.0.1 (127.0.0.1)

Transmission Control Protocol, Src Port: 61616 (61616), Dst Port: 54669 (54669), Seq: 1, Ack: 2, Len: 244
    Source port: 61616 (61616)
    Destination port: 54669 (54669)
    [Stream index: 11]
    Sequence number: 1    (relative sequence number)
    [Next sequence number: 245    (relative sequence number)]
    Acknowledgement number: 2    (relative ack number)
    Header length: 32 bytes
    Flags: 0x018 (PSH, ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgement: Set
        .... .... 1... = Push: Set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
    Window size value: 256
    [Calculated window size: 32768]
    [Window size scaling factor: 128]
    Checksum: 0xff1c [validation disabled]
        [Good Checksum: False]
        [Bad Checksum: False]
    Options: (12 bytes)
        No-Operation (NOP)
        No-Operation (NOP)
        Timestamps: TSval 2304161892, TSecr 2304161891
            Kind: Timestamp (8)
            Length: 10
            Timestamp value: 2304161892
            Timestamp echo reply: 2304161891
    [SEQ/ACK analysis]
        [Bytes in flight: 244]

Constrained Application Protocol, TID: 240, Length: 244
    00.. .... = Version: 0
    ..00 .... = Type: Confirmable (0)
    .... 0000 = Option Count: 0
    Code: Unknown (0)
    Transaction ID: 240
    Payload Content-Type: text/plain (default), Length: 240, offset: 4
        Line-based text data: text/plain
            [truncated] \001ActiveMQ\000\000\000\t\001\000\000\000<DE>\000\000\000\t\000\fMaxFrameSize\006\177<FF><FF><FF><FF>
<FF><FF><FF>\000\tCacheSize\005\000\000\004\000\000\fCacheEnabled\001\001\000\022SizePrefixDisabled\001\000\000 MaxInactivityDurationInitalDelay\006\

这很可能是tcp 端口检查。这是我从另一台主机尝试 telnet 时看到的内容:

2013-11-05 16:12:41,071 | DEBUG | Transport Connection to: tcp://10.8.20.9:46775 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///10.8.20.9:46775@61616
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:275)
        at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:229)
        at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:221)
        at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:204)
        at java.lang.Thread.run(Thread.java:662)
2013-11-05 16:12:41,071 | DEBUG | Transport Connection to: tcp://10.8.20.9:46775 failed: org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler

org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://10.8.20.9:46775
        at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:282)
        at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271)
        at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85)
        at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104)
        at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
        at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312)
        at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838)
        at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873)
        at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
        at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
2013-11-05 16:12:41,071 | DEBUG | Unregistering MBean org.apache.activemq:BrokerName=localhost,Type=Connection,ConnectorName=ope
nwire,ViewType=address,Name=tcp_//10.8.20.9_46775 | org.apache.activemq.broker.jmx.ManagementContext | ActiveMQ Transport: tcp:/
//10.8.20.9:46775@61616
2013-11-05 16:12:41,073 | DEBUG | Stopping connection: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | 
ActiveMQ BrokerService[localhost] Task-5
2013-11-05 16:12:41,073 | DEBUG | Stopping transport tcp:///10.8.20.9:46775@61616 | org.apache.activemq.transport.tcp.TcpTranspo
rt | ActiveMQ BrokerService[localhost] Task-5
2013-11-05 16:12:41,073 | DEBUG | Initialized TaskRunnerFactory[ActiveMQ Task] using ExecutorService: java.util.concurrent.Threa
dPoolExecutor@23cc2a28 | org.apache.activemq.thread.TaskRunnerFactory | ActiveMQ BrokerService[localhost] Task-5
2013-11-05 16:12:41,074 | DEBUG | Closed socket Socket[addr=/10.8.20.9,port=46775,localport=61616] | org.apache.activemq.transpo
rt.tcp.TcpTransport | ActiveMQ Task-1
2013-11-05 16:12:41,074 | DEBUG | Forcing shutdown of ExecutorService: java.util.concurrent.ThreadPoolExecutor@23cc2a28 | org.apache.activemq.util.ThreadPoolUtils | ActiveMQ BrokerService[localhost] Task-5
2013-11-05 16:12:41,074 | DEBUG | Stopped transport: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5
2013-11-05 16:12:41,074 | DEBUG | Connection Stopped: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5
2013-11-05 16:12:41,902 | DEBUG | Sending: WireFormatInfo { version=9, properties={MaxFrameSize=9223372036854775807, CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true, MaxInactivityDuration=30000, TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]} | org.apache.activemq.transport.WireFormatNegotiator | ActiveMQ BrokerService[localhost] Task-5

所以问题是:我如何找出每 2 秒尝试连接我的 ActiveMQ(从本地主机)的进程?


更新时间:2013 年 11 月 7 日星期四 10:44:09 ICT

我的同事已经跟我谈过罪魁祸首了,那就是......保持活跃

vrrp_script chk_SC_ACTIVEMQ {
        script "</dev/tcp/127.0.0.1/61616"
        interval 2                      # check every 2 seconds
        weight 2                        # add 2 points of prio if OK
}

但我的问题仍然存在:假设我知道某人(在本地主机上)每 2 秒打开一次与我的 ActiveMQ 的 TCP 连接:

tcp        0      0 127.0.0.1:59408         127.0.0.1:61616         TIME_WAIT   -               
tcp        0      0 127.0.0.1:59398         127.0.0.1:61616         TIME_WAIT   -               
tcp        0      0 127.0.0.1:59399         127.0.0.1:61616         TIME_WAIT   -    

我该如何找出肇事者?我刚刚想到了一个办法,那就是扫描/etc(甚至整个系统)以找出所有包含“61616”的文件:

# grep -lr 61616 /etc

有没有更好的办法?

相关内容