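I have a server running Django behind Gunicorn and Nginx, and last night it had a brief outage. After I restarted Nginx and Gunicorn the server came back, but I can't work out what caused the outage. The logs show roughly where things went wrong. First, there are about 100 lines like this: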
2014/03/04 15:48:47 [emerg] 21790#0: *19536658 posix_memalign(16, 4096) failed (12: Cannot allocate memory), client: xx.xx.xx.xx, server: www.mysite.com, request: "GET /static/images/loading.gif HTTP/1.1", host: "www.mysite.com"
Then the actual errors start. There are about 300 of these:
2014/03/04 15:49:04 [error] 21790#0: *19532341 connect() failed (110: Connection timed out) while connecting to upstream, client: xx.xx.xx.xx, server: www.mysite.com, request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:8081/", host: "www.mysite.com"
...and about 100 of these:
2014/03/04 15:51:32 [error] 21789#0: *19529583 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: xx.xx.xx.xx, server: www.mysite.com, request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:8081/", host: "www.mysite.com"
...along with a few of these scattered through the log:
2014/03/04 15:51:22 [emerg] 21791#0: *19539287 malloc(1024) failed (12: Cannot allocate memory) while waiting for request, client: xx.xx.xx.xx, server: 0.0.0.0:80
As for the Django error log, I got a lot of errors with the following message:
OperationalError: could not fork new process for connection: Cannot allocate memory
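This looks like some kind of out-of-memory error, but my swap was completely empty at the time, and I don't see anything in any log indicating that a process was killed. Can anyone explain what might have happened here?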
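In case it helps anyone reproduce the rough counts above, here is a minimal sketch of how the Nginx error-log lines could be tallied by log level, failing call, and errno. It assumes the lines follow the same `call(...) failed (errno: message)` shape as the excerpts quoted above; the function name `tally_failures` and the regular expression are my own illustration, not part of Nginx or any library.

```python
import re
from collections import Counter

# Assumed line shape, based only on the excerpts quoted above:
#   ... [level] pid#tid: *conn call(args) failed (errno: message) ...
FAILURE_RE = re.compile(
    r"\[(?P<level>\w+)\].*?"
    r"(?P<call>\w+)\([^)]*\) failed "
    r"\((?P<errno>\d+): (?P<msg>[^)]+)\)"
)


def tally_failures(lines):
    """Count log lines per (level, call, errno, message).

    Lines that do not match the assumed shape are skipped.
    """
    counts = Counter()
    for line in lines:
        m = FAILURE_RE.search(line)
        if m:
            key = (m["level"], m["call"], int(m["errno"]), m["msg"])
            counts[key] += 1
    return counts
```

The function accepts any iterable of strings, so an open file object for the error log can be passed directly, and `most_common()` on the result lists the failure types from most to least frequent.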