Our configuration

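For some reason, nginx cannot keep up with ordinary static-file traffic. The only workaround we have found so far is to add more nodes, and we are now up to 23 VMs. Yet none of the VMs is busy, and nginx still cannot fully cope.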

Our HTTP and HTTPS service involves no JavaScript. It only serves petabytes of files, open data for omics content, that can be accessed from anywhere in the world.

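We have no money for licences. Everything is open source, and so is the data. I am here hoping someone can tell us that we have made one big mistake, so that we can fix it fairly quickly and stop throwing more servers at this HTTP service.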

I will show our configuration with the load spread across 23 servers, then concentrate the load on a single server, and then set it back to 23.

Our configuration


[root@hlvlpxfer-http-ebi-002 nginx]# uname -a
Linux hlvlpxfer-http-ebi-002.ebi.ac.uk 5.4.17-2136.300.7.el8uek.x86_64 #2 SMP Fri Oct 8 16:23:01 PDT 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@hlvlpxfer-http-ebi-002 nginx]# lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: OracleServer
Description:    Oracle Linux Server release 8.5
Release:        8.5
Codename:       n/a
[root@hlvlpxfer-http-ebi-002 nginx]# nginx -V
nginx version: nginx/1.14.1
built by gcc 8.2.1 20180905 (Red Hat 8.2.1-3.0.1) (GCC)
built with OpenSSL 1.1.1 FIPS  11 Sep 2018 (running with OpenSSL 1.1.1k  FIPS 25 Mar 2021)
TLS SNI support enabled
configure arguments: --prefix=/usr/share/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --pid-path=/run/nginx.pid --lock-path=/run/lock/subsys/nginx --user=nginx --group=nginx --with-file-aio --with-ipv6 --with-http_ssl_module --with-http_v2_module --with-http_realip_module --with-http_addition_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_degradation_module --with-http_slice_module --with-http_stub_status_module --with-http_perl_module=dynamic --with-http_auth_request_module --with-mail=dynamic --with-mail_ssl_module --with-pcre --with-pcre-jit --with-stream=dynamic --with-stream_ssl_module --with-debug --with-cc-opt='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-E'
[root@hlvlpxfer-http-ebi-002 nginx]#

nginx.conf and conf.d/common*

[root@hlvlpxfer-http-ebi-002 nginx]# cat /etc/nginx/nginx.conf
user nginx;

worker_processes auto;
worker_rlimit_nofile 999999;

#error_log /var/log/nginx/error-debug.log debug;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 32768;
    use epoll;
    multi_accept on;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format compression '$http_x_real_ip - $remote_user [$time_local] '
    '"$request" $status $body_bytes_sent '
    '"$http_referer" "$gzip_ratio"';

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"';

    log_format ebi-logs '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" ';

    access_log /var/log/nginx/access.log main;

    chunked_transfer_encoding off;
    client_body_buffer_size 32k;
    client_body_timeout 11;

    # Gzip
    gunzip on;
    gzip on;
    gzip_buffers 16 8k;
    gzip_comp_level 6;
    gzip_disable "msie6";
    gzip_http_version 1.0;
    gzip_min_length 10240;
    gzip_proxied any;
    gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/vnd.ms-fontobject application/x-font-ttf font/opentype image/svg+xml image/x-icon;
    gzip_vary on;

    keepalive_requests 100000;
    keepalive_timeout 30;
    open_file_cache max=200000 inactive=20s;
    open_file_cache_errors on;
    open_file_cache_min_uses 2;
    open_file_cache_valid 30s;
    reset_timedout_connection on;
    send_timeout 11;
    sendfile off;
    sendfile_max_chunk 512k;
    server_names_hash_bucket_size 256;
    server_tokens off;
    tcp_nodelay on;

    server {
        # SRA
        listen 80;
        listen 443 ssl;
        server_name ftp.sra.ebi.ac.uk;

        access_log /var/log/nginx/xfer-ftp.sra.ebi.ac.uk.log ebi-logs;
        include /etc/nginx/conf.d/common-server.conf;
    }

    server {
        # Default
        listen 80 default_server;
        listen 443 default_server ssl;
        server_name localhost;

        access_log /var/log/nginx/xfer-ftp.ebi.ac.uk.log ebi-logs;
        include /etc/nginx/conf.d/common-server.conf;
    }

}
[root@hlvlpxfer-http-ebi-002 nginx]# cat /etc/nginx/conf.d/common-server.conf
    root /xfer/public/;

    ssl_certificate      /etc/pki/tls/certs/ftp.ebi.ac.uk.crt;
    ssl_certificate_key  /etc/pki/tls/private/ftp.ebi.ac.uk.key;
    ssl_protocols        SSLv3 TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers          ALL:!aNULL:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;

    location /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }

    location ~* "\.(json|txt)z$" {
        add_header Content-Encoding gzip;
        gzip off;
        types {
            application/json jsonz;
        }
    }

    location / {
        root /xfer/public/;
        autoindex on;

        max_ranges 30;
        sendfile_max_chunk 512k;
        sendfile on;

        add_header 'Access-Control-Allow-Origin' '*';
        if ($request_method = 'OPTIONS') {
            add_header 'Access-Control-Allow-Origin' '*';
            add_header 'Access-Control-Allow-Methods' 'GET,OPTIONS';
            #
            # Custom headers and headers various browsers *should* be OK with but aren't
            #
            add_header 'Access-Control-Allow-Headers' 'Authorization,Origin,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Accept';
            #
            # Tell client that this pre-flight info is valid for 20 days
            #
            add_header 'Access-Control-Max-Age' 1728000;
            return 200;
        }
    }

Load (or lack of it)


[root@hlvlpxfer-http-ebi-002 nginx]# uptime
 20:54:26 up  3:56,  1 user,  load average: 0.04, 0.10, 0.09
[root@hlvlpxfer-http-ebi-002 nginx]# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           4
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      GenuineIntel
CPU family:          6
Model:               58
Model name:          Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
BIOS Model name:     Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Stepping:            0
CPU MHz:             2297.339
BogoMIPS:            4594.67
Hypervisor vendor:   VMware
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            46080K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cpuid_fault pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust smep arat md_clear flush_l1d arch_capabilities
[root@hlvlpxfer-http-ebi-002 nginx]# ps axu|grep nginx
root        9122  0.0  0.0 104148  2344 ?        Ss   17:37   0:00 nginx: master process /usr/sbin/nginx
nginx       9123  0.1  0.0 151264 26948 ?        S    17:37   0:12 nginx: worker process
nginx       9124  0.2  0.0 150544 26456 ?        S    17:37   0:24 nginx: worker process
nginx       9125  0.1  0.0 151388 27272 ?        S    17:37   0:21 nginx: worker process
nginx       9126  0.1  0.0 150656 26540 ?        S    17:37   0:12 nginx: worker process
root       39099  0.0  0.0 221924  1136 pts/0    R+   20:54   0:00 grep --color=auto nginx

nginx_status

[root@hlvlpxfer-http-ebi-002 nginx]# curl 127.0.0.1/nginx_status
Active connections: 14
server accepts handled requests
 11524 11524 33976
Reading: 0 Writing: 9 Waiting: 5
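
To make the stub_status samples in this post easier to compare, the page can be reduced to a handful of integers. The following is a minimal sketch only: it is not the `nginx_status.py` script that shows up in a later process listing (its contents are not shown here), and `parse_stub_status` is a hypothetical helper name. The sample text is copied from the output above.

```python
import re

# Sample copied verbatim from the stub_status output above.
SAMPLE = """Active connections: 14
server accepts handled requests
 11524 11524 33976
Reading: 0 Writing: 9 Waiting: 5
"""


def parse_stub_status(text):
    """Parse the plain-text stub_status page into a dict of integers."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    counters = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.M)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and counters and states):
        raise ValueError("text does not look like stub_status output")
    accepts, handled, requests = (int(v) for v in counters.groups())
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


stats = parse_stub_status(SAMPLE)
# accepts - handled > 0 would mean connections were dropped after accept.
dropped = stats["accepts"] - stats["handled"]
# Average number of requests carried by each handled connection.
requests_per_conn = stats["requests"] / stats["handled"]
print(f"dropped={dropped} requests_per_connection={requests_per_conn:.2f}")
```

In this sample `accepts` equals `handled`, which the nginx documentation describes as the usual case unless a resource limit such as `worker_connections` has been reached, and each connection has carried roughly three requests on average.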

uptime and free

[root@hlvlpxfer-http-ebi-002 nginx]# uptime
 20:54:57 up  3:57,  1 user,  load average: 0.02, 0.09, 0.08
[root@hlvlpxfer-http-ebi-002 nginx]# free
              total        used        free      shared  buff/cache   available
Mem:       32560712     4267664      236280        9160    28056768    27854040
Swap:       2097148       12044     2085104
[root@hlvlpxfer-http-ebi-002 nginx]# free -m
              total        used        free      shared  buff/cache   available
Mem:          31797        4175         250           8       27371       27192
Swap:          2047          11        2036
[root@hlvlpxfer-http-ebi-002 nginx]#

Before applying load

[root@hlvlpxfer-http-ebi-002 nginx]# # No load
[root@hlvlpxfer-http-ebi-002 nginx]# echo "START"; date ; curl 127.0.0.1/nginx_status ; uptime ; free -m ; ps axu|grep nginx ; echo "END " ; date
START
Fri Jul 29 21:05:36 BST 2022
Active connections: 52
server accepts handled requests
 12161 12161 37508
Reading: 0 Writing: 16 Waiting: 36
 21:05:38 up  4:08,  1 user,  load average: 0.16, 0.20, 0.13
              total        used        free      shared  buff/cache   available
Mem:          31797        4136         228           8       27432       27231
Swap:          2047          11        2036
root        9122  0.0  0.0 104148  2344 ?        Ss   17:37   0:00 nginx: master process /usr/sbin/nginx
nginx       9123  0.1  0.0 151264 26948 ?        S    17:37   0:13 nginx: worker process
nginx       9124  0.2  0.0 150544 26456 ?        S    17:37   0:26 nginx: worker process
nginx       9125  0.1  0.0 151388 27272 ?        S    17:37   0:21 nginx: worker process
nginx       9126  0.1  0.0 150656 26540 ?        S    17:37   0:13 nginx: worker process
root       40755  0.0  0.0 221924  1152 pts/0    S+   21:05   0:00 grep --color=auto nginx
END
Fri Jul 29 21:05:38 BST 2022
[root@hlvlpxfer-http-ebi-002 nginx]#

5 minutes after applying load (load balancer pointing at only one server)

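After a few minutes, the load balancer starts to see the server as "not responding" and begins sending "connection reset" to clients. The load balancer is standard hardware and health-checks the robots.txt file; because nginx cannot return the file quickly enough, the check times out.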
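
The health check described here can be approximated from any machine with a small timing probe. This is a rough sketch under assumptions: the real F5 monitor's interval and timeout are not given in this post, so the 5 second default below is a placeholder, and `probe` is a hypothetical helper name.

```python
import http.client
import sys
import time
import urllib.request


def probe(url, timeout_s=5.0):
    """Fetch url once and return (ok, elapsed_seconds, detail).

    timeout_s is an assumed value; the load balancer's real timeout is unknown.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            resp.read()  # include body transfer time, as a file check would
            status = resp.status
        return status == 200, time.monotonic() - start, f"HTTP {status}"
    except (OSError, http.client.HTTPException) as exc:
        # URLError, HTTPError, socket timeouts and resets are all OSError.
        return False, time.monotonic() - start, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    if len(sys.argv) > 1:
        ok, elapsed, detail = probe(sys.argv[1])
        print(f"ok={ok} elapsed={elapsed:.3f}s detail={detail}")
    else:
        print("usage: probe.py URL")
```

Running it in a loop against `/robots.txt` on a single backend while the load is concentrated there would show whether small-file latency grows before the load balancer marks the server down.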

[root@hlvlpxfer-http-ebi-002 nginx]# sleep 240 ; echo "START"; date ; curl 127.0.0.1/nginx_status ; uptime ; free -m ; ps axu|grep nginx ; echo "END " ; date
START
Fri Jul 29 21:11:45 BST 2022
Active connections: 1588
server accepts handled requests
 15069 15069 38953
Reading: 0 Writing: 109 Waiting: 1479
 21:13:13 up  4:15,  1 user,  load average: 0.15, 0.11, 0.09
              total        used        free      shared  buff/cache   available
Mem:          31797        4264         232           8       27300       27102
Swap:          2047          11        2036
root        9122  0.0  0.0 104148  2344 ?        Ss   17:37   0:00 nginx: master process /usr/sbin/nginx
nginx       9123  0.1  0.0 151264 26948 ?        S    17:37   0:13 nginx: worker process
nginx       9124  0.2  0.0 150976 26712 ?        S    17:37   0:26 nginx: worker process
nginx       9125  0.1  0.0 151388 27272 ?        S    17:37   0:22 nginx: worker process
nginx       9126  0.1  0.0 150656 26540 ?        S    17:37   0:13 nginx: worker process
root       41198  0.0  0.0 301520 16796 ?        S    21:08   0:00 python3 nginx_status.py
root       41343  0.0  0.0 301520 16860 ?        S    21:09   0:00 python3 nginx_status.py
root       41492  0.0  0.0 301520 16704 ?        S    21:10   0:00 python3 nginx_status.py
root       41927  0.0  0.0 301520 16740 ?        S    21:13   0:00 python3 nginx_status.py
root       41933  0.0  0.0 221924  1128 pts/0    S+   21:13   0:00 grep --color=auto nginx
END
Fri Jul 29 21:13:13 BST 2022
[root@hlvlpxfer-http-ebi-002 nginx]#
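
Putting the two samples side by side makes the change easier to see. The sketch below only does arithmetic on numbers copied from the two outputs above; the dictionary layout and names are illustrative. It assumes the gap between the `date` line and the `uptime` line is dominated by the `curl` call that runs between them, since nothing else runs in between.

```python
from datetime import datetime

FMT = "%H:%M:%S"


def seconds_between(start, end):
    """Seconds from start to end, both HH:MM:SS on the same day."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Values copied from the "before" and "after" outputs in this post.
before = {"date": "21:05:36", "uptime": "21:05:38",
          "active": 52, "accepts": 12161, "requests": 37508}
after = {"date": "21:11:45", "uptime": "21:13:13",
         "active": 1588, "accepts": 15069, "requests": 38953}

# Upper bound on how long the local stub_status curl took in each run.
status_latency_before = seconds_between(before["date"], before["uptime"])
status_latency_after = seconds_between(after["date"], after["uptime"])

# Counter growth between the two samples.
new_connections = after["accepts"] - before["accepts"]
new_requests = after["requests"] - before["requests"]

print(f"stub_status fetch: {status_latency_before:.0f}s -> {status_latency_after:.0f}s")
print(f"new connections={new_connections} new requests={new_requests}")
print(f"requests per new connection={new_requests / new_connections:.2f}")
```

Taken at face value, the local status request went from about 2 seconds to about 88 seconds, while the counters show 2908 new connections but only 1445 new requests, all with a load average that stays near zero.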

Thank you for your time.

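As requested, the VMs are set up as follows:

  • 20 VMs run Kubernetes, with traefik > nginx
  • 3 VMs run nginx directly (OEL 8.5)
  • Every nginx instance has the same configuration everywhere.
  • The load balancer is an F5, doing round-robin distribution across all of them.

The 23 VMs run on 3 VMware hypervisors. The 3 OEL 8.5 VMs have affinity rules so that one of them runs on each hypervisor.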

The VMware cluster is not busy:

Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz

72 cores

VMware hypervisor utilisation is low

The hypervisors' network does not appear to be the problem:

(Three screenshots were attached here; the images are not available.)
