I'm seeing very high CPU usage on a Ruby on Rails application (see the stack below) and have been trying to diagnose the cause, to no avail.

Stack:

  • ruby 1.9.3
  • rails 3.2.6
  • Apache/2.2.21(Debian)
  • Phusion Passenger 3.0.11

Whenever I strace the Rack process PID that is spiking (see the excerpt below), I get a huge number of stat("/etc/localtime") and clock_gettime(CLOCK_REALTIME) calls, and I don't know how to stop them.


Excerpt from top showing the running PIDs:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11674 www-user  20   0  313m 182m 5076 R   99  2.3  63:04.60 Rack: /var/www/my_rails_app/current
11634 www-user  20   0  411m 216m 5144 S   10  2.7 197:55.63 Rack: /var/www/my_rails_app/current


The strace snippet is as follows:

[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 141474018}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 141577456}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 143073982}) = 0
[pid 11674] poll([{fd=15, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 11674] write(15, "b\0\0\0\3SELECT `images`.* FROM `ima"..., 102) = 102
[pid 11674] read(15, "\1\0\0\1\0229\0\0\2\3def\23myappy_productio"..., 16384) = 2063
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 144138035}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
...
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 154076443}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 154189429}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 157185700}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 157298770}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 165076003}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 165212572}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 167542679}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 167683436}) = 0
....
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 62052248}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 62182486}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 62919948}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 63057266}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 63751707}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 73730686}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 75874687}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 76077133}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 78205019}) = 0
...
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 89370879}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 89583247}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 91637614}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 91782149}) = 0
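
To see how lopsided a trace like this is, the calls can be tallied per syscall. Below is a minimal sketch, assuming the strace output was saved in the same "[pid N] call(...)" format as the excerpt above. The function name count_syscalls and the file name strace.log are hypothetical.

```shell
# Tally syscalls in saved strace output whose lines look like
# "[pid 11674] stat(...) = 0". Lines without a call (such as "...")
# do not match the pattern and are skipped.
count_syscalls() {
  sed -n 's/^ *\[pid *[0-9]*] *\([a-z_0-9]*\)(.*/\1/p' "$1" \
    | sort | uniq -c | sort -rn
}
```

One way to produce such a file is `strace -f -p 11674 2> strace.log`, followed by `count_syscalls strace.log`. Alternatively, `strace -c -f -p 11674` has strace print its own per-syscall summary when it detaches.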


I Googled this and found plenty of suggestions, but none of the ones I tried worked.


Things tried so far:

  1. Tried setting the timezone as suggested here.
    It made no difference; the problem persists.
    Contents of my /etc/localtime:

    TZif2UTCTZif2UTC
    UTC0

  2. Tried the suggested leap second bug fix:

    date -s "`date`"


Still no joy.


I'm out of ideas, so any help or suggestions on how to diagnose or resolve this would be much appreciated.


Answer 1

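Exporting TZ=:/etc/localtime also works: the file is then read once at startup and never again, which means that if you change the contents of this file you need to restart the daemon.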

But, like you, we run UTC on all our servers, so it never changes.
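
On the Debian/Apache stack from the question, one place the variable could be exported is /etc/apache2/envvars, which Debian's apache2ctl sources before starting Apache. This is only a sketch, and it rests on the assumption (not confirmed in this thread) that the Rack processes spawned by Passenger inherit Apache's environment.

```shell
# /etc/apache2/envvars (Debian layout; the path may differ on other distros)
# With glibc, a leading colon means the rest of the value is the name of
# a time zone file to load.
export TZ=:/etc/localtime
```

Apache has to be restarted afterwards so that newly spawned processes start with the variable set.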

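I can't help with the clock_gettime calls. I will say, though, that on virtual machines we found time() to be very expensive, so we have a daemon that allocates some shared memory and stores the time in it; every process that wants to know the time attaches to the shared memory and reads it instead of calling time().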

Answer 2

I found that the excessive stat calls on /etc/localtime were caused by a missing environment variable.

Try this:

echo $TZ

If it is empty, set the variable in the right place (i.e. /home/apache/.bash_profile). You need to set it for the user that runs the web server, and then reload the daemon (apachectl graceful, etc.).

TZ='Europe/London'; export TZ

Or the correct timezone for your region (http://en.wikipedia.org/wiki/List_of_tz_database_time_zones).
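
Note that echo $TZ only reports the current shell's value, which may differ from what the running Rack process was started with. A minimal sketch for checking a live process on Linux follows; the helper name proc_tz is hypothetical, and reading another user's /proc entry typically requires root.

```shell
# Print the TZ setting a running process was started with, or "TZ unset".
# /proc/<pid>/environ holds the process's initial environment as
# NUL-separated KEY=value entries (Linux only).
proc_tz() {
  tr '\0' '\n' < "/proc/$1/environ" | grep '^TZ=' || echo "TZ unset"
}
```

For the process in the question this would be run as `proc_tz 11674`, from a root shell if needed.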
