Analyzing TCP performance by interpreting "netstat -s"

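I ran netstat -s on a dedicated server running Debian. I would like to interpret the results because I am having TCP connection problems, but I do not know how to read them. Can anyone help?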

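Background: this is a public TCP server with clients from all over the world, most of them on 3G/UMTS networks. Sockets stay open for about an hour on average. Some TCP connections stall for 10-60 seconds roughly every 10 minutes. The TCP server is a custom Java program that I run.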

Below is the output of netstat -s. Does it show any obvious connection problems?

    Ip:
        33780786 total packets received
        0 forwarded
        0 incoming packets discarded
        33780059 incoming packets delivered
        33577363 requests sent out
        1 outgoing packets dropped
        1442 reassemblies required
        715 packets reassembled ok
    Icmp:
        4675 ICMP messages received
        98 input ICMP message failed.
        ICMP input histogram:
            destination unreachable: 2901
            timeout in transit: 152
            echo requests: 1334
            echo replies: 226
        2109 ICMP messages sent
        0 ICMP messages failed
        ICMP output histogram:
            destination unreachable: 550
            echo request: 225
            echo replies: 1334
    IcmpMsg:
            InType0: 226
            InType3: 2901
            InType8: 1334
            InType11: 152
            OutType0: 1334
            OutType3: 550
            OutType8: 225
    Tcp:
        8752 active connections openings
        287296 passive connection openings
        58164 failed connection attempts
        74065 connection resets received
        30 connections established
        32997886 segments received
        32357425 segments send out
        438184 segments retransmited
        587 bad segments received.
        75868 resets sent
    Udp:
        777245 packets received
        550 packets to unknown port received.
        0 packet receive errors
        779944 packets sent
    TcpExt:
        28674 invalid SYN cookies received
        56570 resets received for embryonic SYN_RECV sockets
        998 packets pruned from receive queue because of socket buffer overrun
        9 ICMP packets dropped because they were out-of-window
        27402 packets rejects in established connections because of timestamp
        1266543 delayed acks sent
        1399 delayed acks further delayed because of locked socket
        Quick ack mode was activated 143367 times
        1556 times the listen queue of a socket overflowed
        1556 SYNs to LISTEN sockets dropped
        25884635 packets directly queued to recvmsg prequeue.
        785180902 bytes directly in process context from backlog
        1800599695 bytes directly received in process context from prequeue
        2879633 packet headers predicted
        7627605 packets header predicted and directly queued to user
        3218508 acknowledgments not containing data payload received
        14774120 predicted acknowledgments
        52 times recovered from packet loss due to fast retransmit
        24519 times recovered from packet loss by selective acknowledgements
        4 bad SACK blocks received
        Detected reordering 146 times using FACK
        Detected reordering 77 times using SACK
        Detected reordering 2239 times using time stamp
        3548 congestion windows fully recovered without slow start
        15840 congestion windows partially recovered using Hoe heuristic
        8832 congestion windows recovered without slow start by DSACK
        127403 congestion windows recovered without slow start after partial ack
        12080 TCP data loss events
        TCPLostRetransmit: 3
        179 timeouts after reno fast retransmit
        21328 timeouts after SACK recovery
        1481 timeouts in loss state
        32373 fast retransmits
        5349 forward retransmits
        26402 retransmits in slow start
        230593 other TCP timeouts
        4 classic Reno fast retransmits failed
        2367 SACK retransmits failed
        563 times receiver scheduled too late for direct processing
        243774 packets collapsed in receive queue due to low socket buffer
        151068 DSACKs sent for old packets
        45306 DSACKs sent for out of order packets
        238987 DSACKs received
        14 DSACKs for out of order packets received
        27627 connections reset due to unexpected data
        4045 connections reset due to early user close
        4992 connections aborted due to timeout
    IpExt:

Answer 1

1 outgoing packets dropped

Almost no packets are being dropped, which is good, but we have no latency data. At first glance, I would say you are using the wrong tool.

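Is a database involved? Is there some recurring job that slows the system down every 10 minutes or so? Does the machine run only this TCP server, or does it serve other resources as well?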

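Netstat is not a good source of metrics for what you are doing. To make sure your web application behaves as expected, you need an infrastructure with the following characteristics: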

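  • Hooks into your application to make sure you get the right metrics. You are the developer, so you can do this, and it will make your work much easier. By hooks I mean instrumentation for collecting diagnostic and performance data, coded directly into your application.
  • A graphing/monitoring infrastructure. Cacti and Nagios are the examples I am familiar with, but there are many more.
  • A plan. What are you trying to achieve? What level of service do you want to give your users? Build diagnostics and performance metrics in while you develop the application. If you are so inspired, this can grow into something big, so make it scalable. *Really* scalable.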
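
As one way to picture the first bullet, here is a minimal sketch of an in-application hook that records how long each connection goes without receiving data and flags gaps above a threshold. The server in the question is written in Java; Python is used here only to keep the sketch short. The `StallTracker` class, its method names, and the 10-second threshold are all assumptions made for illustration, not part of the original application.

```python
import time
from collections import defaultdict


class StallTracker:
    """Hypothetical hook: records gaps between successive reads per connection."""

    def __init__(self, threshold_s=10.0, clock=time.monotonic):
        self.threshold_s = threshold_s    # gaps at or above this count as stalls (assumed value)
        self.clock = clock                # injectable so the tracker can be tested
        self.last_seen = {}               # connection id -> time of the last read
        self.stalls = defaultdict(list)   # connection id -> list of stall lengths

    def on_data(self, conn_id):
        """Call from the read path each time data arrives on a connection."""
        now = self.clock()
        previous = self.last_seen.get(conn_id)
        if previous is not None and now - previous >= self.threshold_s:
            self.stalls[conn_id].append(now - previous)
        self.last_seen[conn_id] = now

    def on_close(self, conn_id):
        """Forget the connection's timestamp; recorded stalls are kept."""
        self.last_seen.pop(conn_id, None)

    def report(self):
        """Return {connection id: (stall count, longest stall in seconds)}."""
        return {cid: (len(gaps), max(gaps)) for cid, gaps in self.stalls.items()}
```

The output of `report()` is the kind of number that could be exported to a graphing system. Note that a client which is simply idle looks the same as a stalled one here, so this only helps if clients are expected to send data (or a heartbeat) more often than the threshold.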

Answer 2

The following questions may help you understand the problem:

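  • How does your receiving program handle incoming network connections? Is it multithreaded? How does it handle clients? Is a timeout being hit?
  • How did you test the server code? Have you run it on a local machine and tested how many connections it can sustain? Have you tested the effect of long sessions?
  • Try running "netstat -p" or "lsof -i TCP" to see what is going on. What does the send queue look like? Run "ps auxwww": what state is the server process in?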
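
To make the send-queue check in the last bullet concrete, here is a small sketch that picks out TCP connections with a non-empty Send-Q from netstat's per-connection listing. For established sockets, Send-Q is the number of bytes not yet acknowledged by the remote host, so a value that stays high during a stall points at the network path or the client rather than the server process. The function name and the sample text are hypothetical; the sample only mimics the column layout that Linux `netstat -tn` prints, and the addresses are documentation addresses, not data from the question.

```python
def parse_send_queues(netstat_output):
    """Return (remote address, Send-Q bytes, state) for TCP lines with queued data."""
    backlog = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        # (with -p, a PID/Program column follows and is ignored here).
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if not fields[2].isdigit():
            continue
        send_q = int(fields[2])
        if send_q > 0:
            backlog.append((fields[4], send_q, fields[5]))
    # Largest backlog first.
    return sorted(backlog, key=lambda row: row[1], reverse=True)


# Hypothetical sample in the layout of `netstat -tn` on Linux.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:9000         203.0.113.5:41200       ESTABLISHED
tcp        0  17520 192.0.2.10:9000         203.0.113.77:52311      ESTABLISHED
tcp        0   2896 192.0.2.10:9000         198.51.100.4:60022      ESTABLISHED
"""

for remote, send_q, state in parse_send_queues(SAMPLE):
    print(f"{remote:<24} {send_q:>8} {state}")
```

Running this against real output every few seconds while a stall is happening, and comparing which remote addresses keep a backlog, is one way to tell whether the stalls cluster on particular clients.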
