NGINX internal load balancing + PHP-FPM upstream causing random double requests/submissions

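We have a very serious problem: at seemingly random times, our application processes the same request twice. Typically, a user submits a form and its contents are occasionally saved twice.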

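We have ruled out a JS-driven double submission. A network analyzer shows that only one request was sent, yet we have also proven that the PHP application is executed in full twice. After a thorough investigation, we found no logic in the application that would cause this duplicate save.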

Edit: We have removed the "keepalive 8;" line from the NGINX conf and we no longer get duplicate submissions. Instead, we get a 504 during the problematic requests.

If anyone can look at the configuration below and spot anything that stands out, I would be very grateful. Thanks!

Our NGINX and PHP-FPM configuration is as follows:

/etc/nginx/nginx.conf

user nginx;
worker_processes  1;
worker_rlimit_nofile 10240;

error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
    multi_accept on;
    use epoll;
}

http {

    server_tokens off;
    add_header 'Access-Control-Allow-Origin' http://$host;
    add_header 'Access-Control-Allow-Methods' 'GET, POST';
    add_header 'X-Powered-By' 'smartCMS';

    upstream php_fpm {
        least_conn;
        server 127.0.0.1:9000 max_fails=3 fail_timeout=15s;
        keepalive 8;
    }

    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_not_found off;
    access_log    /var/log/nginx/access.log combined buffer=16k;

    open_file_cache max=200000 inactive=20s;
    open_file_cache_valid 30s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;

    keepalive_requests 200;
    keepalive_timeout  65;

    gzip  on;
    gzip_static  on;
    gzip_http_version 1.0;
    gzip_comp_level 6;
    gzip_proxied any;
    gzip_types application/javascript application/x-javascript application/xhtml+xml application/xml application/xml+rss image/svg+xml text/css text/javascript text/plain text/xml;
    gzip_vary on;
    gzip_disable "MSIE [1-6].(?!.*SV1)";

    client_max_body_size 12m;
    client_body_buffer_size 128k;
    client_body_timeout 60;
    client_header_timeout 10;
    large_client_header_buffers 4 16k;
    send_timeout 60;

    server_names_hash_bucket_size 64;

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

/etc/php-fpm.conf

;;;;;;;;;;;;;;;;;;;;;
; FPM Configuration ;
;;;;;;;;;;;;;;;;;;;;;

; All relative paths in this configuration file are relative to PHP's install
; prefix.

; Include one or more files. If glob(3) exists, it is used to include a bunch of
; files from a glob(3) pattern. This directive can be used everywhere in the
; file.
include=/etc/php-fpm.d/pools/*.conf

 ;;;;;;;;;;;;;;;;;;
 ;  PHP INI  ;
 ;;;;;;;;;;;;;;;;;;
 php_admin_value[upload_max_filesize] = 10M;
 php_admin_value[post_max_size] = 12M;
 php_admin_value[max_execution_time] = 60;
 php_admin_value[expose_php] = Off;

 ;;;;;;;;;;;;;;;;;;
 ; Global Options ;
 ;;;;;;;;;;;;;;;;;;

 [global]
 ; Pid file
 ; Default Value: none
 pid = /var/run/php-fpm/php-fpm.pid

 ; Error log file
 ; Default Value: /var/log/php-fpm.log
 error_log = /var/log/php-fpm/error.log

 ; Log level
 ; Possible Values: alert, error, warning, notice, debug
 ; Default Value: notice
 log_level = warning

 ; If this number of child processes exit with SIGSEGV or SIGBUS within the time
 ; interval set by emergency_restart_interval then FPM will restart. A value
 ; of '0' means 'Off'.
 ; Default Value: 0
 emergency_restart_threshold = 1

 ; Interval of time used by emergency_restart_interval to determine when
 ; a graceful restart will be initiated.  This can be useful to work around
 ; accidental corruptions in an accelerator's shared memory.
 ; Available Units: s(econds), m(inutes), h(ours), or d(ays)
 ; Default Unit: seconds
 ; Default Value: 0
 emergency_restart_interval = 1m

 ; Time limit for child processes to wait for a reaction on signals from master.
 ; Available units: s(econds), m(inutes), h(ours), or d(ays)
 ; Default Unit: seconds
 ; Default Value: 0
 process_control_timeout = 60s

 ; Send FPM to background. Set to 'no' to keep FPM in foreground for debugging.
 ; Default Value: yes
 daemonize = yes

 ;;;;;;;;;;;;;;;;;;;;
 ; Pool Definitions ;
 ;;;;;;;;;;;;;;;;;;;;

 ; See /etc/php-fpm.d/pools/*.conf

/etc/php-fpm.d/pools/www0.conf

; Start a new pool named 'www0'.
[www0]

; pool_id0php_fpm_service_namephp-fpmtemplatepool.conf.erbnamewwwenabletrue
; The address on which to accept FastCGI requests.
; Valid syntaxes are:
;   'ip.add.re.ss:port'    - to listen on a TCP socket to a specific address on
;                            a specific port;
;   'port'                 - to listen on a TCP socket to all addresses on a
;                            specific port;
;   '/path/to/unix/socket' - to listen on a unix socket.
; Note: This value is mandatory.
listen = 127.0.0.1:9000

; Set listen(2) backlog. A value of '-1' means unlimited.
; Default Value: -1
listen.backlog = 4096

; List of ipv4 addresses of FastCGI clients which are allowed to connect.
; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original
; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address
; must be separated by a comma. If this value is left blank, connections will be
; accepted from any ip address.
; Default Value: any
listen.allowed_clients = 127.0.0.1

; Set permissions for unix socket, if one is used. In Linux, read/write
; permissions must be set in order to allow connections from a web server. Many
; BSD-derived systems allow connections regardless of permissions.
; Default Values: user and group are set as the running user
;                 mode is set to 0666
;listen.owner = nobody
;listen.group = nobody
;listen.mode = 0666

listen.owner = nginx
listen.group = nginx
listen.mode = 0660

; Unix user/group of processes
; Note: The user is mandatory. If the group is not set, the default user's group
;       will be used.
; RPM: apache Choosed to be able to access some dir as httpd
user = nginx
; RPM: Keep a group allowed to write in log dir.
group = nginx

; Choose how the process manager will control the number of child processes.
; Possible Values:
;   static  - a fixed number (pm.max_children) of child processes;
;   dynamic - the number of child processes are set dynamically based on the
;             following directives:
;             pm.max_children      - the maximum number of children that can
;                                    be alive at the same time.
;             pm.start_servers     - the number of children created on startup.
;             pm.min_spare_servers - the minimum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is less than this
;                                    number then some children will be created.
;             pm.max_spare_servers - the maximum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is greater than this
;                                    number then some children will be killed.
; Note: This value is mandatory.
pm = static

; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes to be created when pm is set to 'dynamic'.
; This value sets the limit on the number of simultaneous requests that will be
; served. Equivalent to the ApacheMaxClients directive with mpm_prefork.
; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP
; CGI.
; Note: Used when pm is set to either 'static' or 'dynamic'
; Note: This value is mandatory.
pm.max_children = 48


; The number of requests each child process should execute before respawning.
; This can be useful to work around memory leaks in 3rd party libraries. For
; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS.
; Default Value: 0
pm.max_requests = 10000

; The URI to view the FPM status page. If this value is not set, no URI will be
; recognized as a status page. By default, the status page shows the following
; information:
;   accepted conn    - the number of request accepted by the pool;
;   pool             - the name of the pool;
;   process manager  - static or dynamic;
;   idle processes   - the number of idle processes;
;   active processes - the number of active processes;
;   total processes  - the number of idle + active processes.
; The values of 'idle processes', 'active processes' and 'total processes' are
; updated each second. The value of 'accepted conn' is updated in real time.
; Example output:
;   accepted conn:   12073
;   pool:             www
;   process manager:  static
;   idle processes:   35
;   active processes: 65
;   total processes:  100
; By default the status page output is formatted as text/plain. Passing either
; 'html' or 'json' as a query string will return the corresponding output
; syntax. Example:
;   http://www.foo.bar/status
;   http://www.foo.bar/status?json
;   http://www.foo.bar/status?html
; Note: The value must start with a leading slash (/). The value can be
;       anything, but it may not be a good idea to use the .php extension or it
;       may conflict with a real PHP file.
; Default Value: not set
;pm.status_path = /status

; The ping URI to call the monitoring page of FPM. If this value is not set, no
; URI will be recognized as a ping page. This could be used to test from outside
; that FPM is alive and responding, or to
; - create a graph of FPM availability (rrd or such);
; - remove a server from a group if it is not responding (load balancing);
; - trigger alerts for the operating team (24/7).
; Note: The value must start with a leading slash (/). The value can be
;       anything, but it may not be a good idea to use the .php extension or it
;       may conflict with a real PHP file.
; Default Value: not set
;ping.path = /ping

; This directive may be used to customize the response of a ping request. The
; response is formatted as text/plain with a 200 response code.
; Default Value: pong
;ping.response = pong

; The timeout for serving a single request after which the worker process will
; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
request_terminate_timeout = 60s

; The timeout for serving a single request after which a PHP backtrace will be
; dumped to the 'slowlog' file. A value of '0s' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
request_slowlog_timeout = 20s

; The log file for slow requests
; Default Value: not set
; Note: slowlog is mandatory if request_slowlog_timeout is set
slowlog = /var/log/php-fpm/www-slow.log

; Set open file descriptor rlimit.
; Default Value: system defined value
;rlimit_files = 1024

; Set max core size rlimit.
; Possible Values: 'unlimited' or an integer greater or equal to 0
; Default Value: system defined value
;rlimit_core = 0

; Chroot to this directory at the start. This value must be defined as an
; absolute path. When this value is not set, chroot is not used.
; Note: chrooting is a great security feature and should be used whenever
;       possible. However, all PHP paths will be relative to the chroot
;       (error_log, sessions.save_path, ...).
; Default Value: not set
;chroot =

; Chdir to this directory at the start. This value must be an absolute path.
; Default Value: current directory or / when chroot
;chdir = /var/www

; Redirect worker stdout and stderr into main error log. If not set, stdout and
; stderr will be redirected to /dev/null according to FastCGI specs.
; Default Value: no
;catch_workers_output = yes

; Limits the extensions of the main script FPM will allow to parse. This can
; prevent configuration mistakes on the web server side. You should only limit
; FPM to .php extensions to prevent malicious users to use other extensions to
; execute php code.
; Note: set an empty value to allow all extensions.
; Default Value: .php
;security.limit_extensions = .php .php3 .php4 .php5

; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from
; the current environment.
; Default Value: clean env
;env[HOSTNAME] = $HOSTNAME
;env[PATH] = /usr/local/bin:/usr/bin:/bin
;env[TMP] = /tmp
;env[TMPDIR] = /tmp
;env[TEMP] = /tmp

; Additional php.ini defines, specific to this pool of workers. These settings
; overwrite the values previously defined in the php.ini. The directives are the
; same as the PHP SAPI:
;   php_value/php_flag             - you can set classic ini defines which can
;                                    be overwritten from PHP call 'ini_set'.
;   php_admin_value/php_admin_flag - these directives won't be overwritten by
;                                     PHP call 'ini_set'
; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.

; Defining 'extension' will load the corresponding shared extension from
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not
; overwrite previously defined php.ini values, but will append the new value
; instead.

; Default Value: nothing is defined by default except the values in php.ini and
;                specified at startup with the -d argument
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected]
php_flag[display_errors] = off
php_admin_value[error_log] = /var/log/php-fpm/www-error.log
php_admin_flag[log_errors] = on
php_admin_value[memory_limit] = 256M

; Set session path to a directory owned by process user
;php_value[session.save_handler] = files
;php_value[session.save_path] = /var/lib/php/session

Answer 1

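You are using the least_conn load balancing method in nginx for the PHP-FPM upstream. This means that requests from one user at one IP address may be served by different PHP-FPM processes.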

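If those PHP-FPM processes do not share all the necessary state about the user, strange things can happen. For example, if the session state is local to a PHP-FPM node, a logged-in user is logged out as soon as a request lands on another server.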

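To avoid this, replace least_conn with ip_hash. This ensures that all connections from one IP address are sent to the same PHP-FPM node. In theory this makes the load balancing slightly less even, but in practice it makes no difference.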
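As a minimal sketch of that change, the upstream block from the question would look like the following with ip_hash in place of least_conn. The server line and keepalive value are copied from the question. Note that with a single server entry, as here, nginx has only one peer to choose from, so the balancing method should only start to matter once more server lines are added.

```nginx
upstream php_fpm {
    # Replaces least_conn: requests from the same client address
    # are always sent to the same server entry.
    ip_hash;
    server 127.0.0.1:9000 max_fails=3 fail_timeout=15s;
    # keepalive must stay below the balancing method directive.
    keepalive 8;
}
```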

However, this is probably not the cause of the problem you are seeing.

Answer 2

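I think removing keepalive has exposed the underlying problem, which I suspect is a combination of your configured timeouts and how responsive your backend is under load.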
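One way to check this suspicion is to log what nginx did with each upstream attempt. The sketch below is an assumption, not part of the original configuration: the format name upstream_debug and the log file path are made up for illustration, while the variables are standard nginx ones.

```nginx
# Goes inside the http { } block.
# "upstream_debug" and the log path are hypothetical names.
log_format upstream_debug '$remote_addr [$time_local] "$request" $status '
                          'rt=$request_time '
                          'ua="$upstream_addr" us="$upstream_status" '
                          'urt="$upstream_response_time"';

access_log /var/log/nginx/upstream_debug.log upstream_debug;
```

If nginx contacts the backend more than once for a single client request, $upstream_addr, $upstream_status and $upstream_response_time each hold several comma-separated values, so a duplicated form save should show up as a single log line with two entries in those fields.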

More specifically, I think this is your problem:

upstream php_fpm {
    least_conn;
    server 127.0.0.1:9000 max_fails=3 fail_timeout=15s;
    keepalive 8;
}

I would try the following:

upstream php_fpm {
    least_conn;
    server 127.0.0.1:9000 max_fails=3 fail_timeout=60s;
    keepalive 8;
}

I think what is happening is that PHP-FPM is set to terminate processing after 60 seconds, but nginx considers the request failed after 15 seconds.
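The timeouts involved can also be set where the request is handed to PHP-FPM. The location block below is a hypothetical sketch, since the question does not show the files under /etc/nginx/sites-enabled/; the 75s value is an assumption chosen only to exceed request_terminate_timeout = 60s. In nginx, the time a single request may wait on the backend is set by fastcgi_read_timeout (60s by default), and fastcgi_next_upstream controls whether a failed attempt is retried.

```nginx
# Hypothetical location block; the real one lives in a file under
# /etc/nginx/sites-enabled/ that the question does not show.
location ~ \.php$ {
    include        fastcgi_params;
    fastcgi_param  SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass   php_fpm;

    # Needed for the upstream "keepalive" directive to apply to FastCGI.
    fastcgi_keep_conn on;

    # How long nginx waits between reads from PHP-FPM before giving up
    # with a 504. 75s is an assumed value, longer than the pool's
    # request_terminate_timeout of 60s.
    fastcgi_read_timeout 75s;

    # Do not resend a request after a failed attempt, so that a form
    # POST cannot be executed twice by an nginx-level retry.
    fastcgi_next_upstream off;
}
```

With retries switched off, a slow request should fail once with an error instead of being run a second time, which makes the underlying slowness easier to see in the logs.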

https://nginx.org/en/docs/http/ngx_http_upstream_module.html#server

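fail_timeout=time sets the time during which the specified number of unsuccessful attempts to communicate with the server should happen to consider the server unavailable; and the period of time the server will be considered unavailable. By default, the parameter is set to 10 seconds.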

You may also want to check what your peak load looks like and look at scaling your backend to absorb it.
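A rough way to watch the pool under load is the FPM status page that is commented out in www0.conf. The sketch below assumes pm.status_path = /status has been uncommented in the pool file and that the location is added to the server block that fronts PHP-FPM; the access rules are an assumption for local-only checks.

```nginx
# Hypothetical location for the server { } block that fronts PHP-FPM.
# Requires "pm.status_path = /status" to be uncommented in www0.conf.
location = /status {
    allow 127.0.0.1;   # assumed: only local checks, e.g. curl from the host
    deny  all;
    access_log off;

    include        fastcgi_params;
    fastcgi_param  SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass   php_fpm;
}
```

If "active processes" sits at 48 (the configured pm.max_children) while the problematic requests occur, the pool is likely saturated and new requests are waiting for a free worker.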
