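I'm seeing very high CPU usage in a Ruby on Rails application (stack listed below) and have been trying to diagnose the cause, without success.
Stack:
- Ruby 1.9.3
- Rails 3.2.6
- Apache/2.2.21 (Debian)
- Phusion Passenger 3.0.11
Whenever I strace the PID of a Rack process that is spiking (see the excerpt below), I see a huge number of stat("/etc/localtime") and clock_gettime(CLOCK_REALTIME) calls, and I don't know how to stop them.
Excerpt from top showing the running PIDs: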
PID   USER     PR NI VIRT RES  SHR  S %CPU %MEM TIME+     COMMAND
11674 www-user 20 0  313m 182m 5076 R 99   2.3  63:04.60  Rack: /var/www/my_rails_app/current
11634 www-user 20 0  411m 216m 5144 S 10   2.7  197:55.63 Rack: /var/www/my_rails_app/current
strace excerpt:
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 141474018}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 141577456}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 143073982}) = 0
[pid 11674] poll([{fd=15, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 11674] write(15, "b\0\0\0\3SELECT `images`.* FROM `ima"..., 102) = 102
[pid 11674] read(15, "\1\0\0\1\0229\0\0\2\3def\23myappy_productio"..., 16384) = 2063
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 144138035}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
...
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 154076443}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 154189429}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 157185700}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 157298770}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 165076003}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 165212572}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 167542679}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354058955, 167683436}) = 0
....
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 62052248}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 62182486}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 62919948}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 63057266}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 63751707}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 73730686}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 75874687}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 76077133}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 78205019}) = 0
...
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 89370879}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 89583247}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 91637614}) = 0
[pid 11674] clock_gettime(CLOCK_REALTIME, {1354060036, 91782149}) = 0
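To put numbers on an excerpt like the one above, it can help to tally which syscalls dominate a saved trace. The following is only a rough sketch: `tally_syscalls` is a hypothetical helper name, and it assumes each line starts with an optional `[pid N]` or bare PID prefix followed by the syscall name, as strace prints with `-f`.

```shell
#!/bin/sh
# Hypothetical helper: tally syscall names in saved strace output.
# Reads the files given as arguments, or standard input when none are given.
tally_syscalls() {
    # Drop an optional "[pid N] " or "N " prefix, keep the name before "(",
    # then count occurrences and list the most frequent syscall first.
    sed -nE 's/^(\[pid +[0-9]+\] +|[0-9]+ +)?([a-z_0-9]+)\(.*/\2/p' "$@" |
        sort | uniq -c | sort -rn
}
```

For instance, after capturing a trace with `strace -f -p 11674 -o rack.strace` and stopping it with Ctrl-C, running `tally_syscalls rack.strace` lists the busiest syscalls first. strace's own `-c` option prints a similar per-syscall summary without any post-processing.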
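I've searched Google and found plenty of suggestions, but none of the ones I tried worked.
Things tried so far:
Tried setting the time zone as suggested here.
It made no difference; the problem persists.
Contents of my /etc/localtime: TZif2UTCTZif2UTC
UTC0
Tried the suggested leap-second bug fix:
date -s "$(date)"
Still no joy.
I'm out of ideas, so any help or advice on how to diagnose or fix this would be much appreciated.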
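Answer 1
Exporting TZ=:/etc/localtime also works. The file is then read once at startup and never again, which means that if you change its contents you need to restart the daemon.
However, like you, we run all our servers on UTC, so it never changes.
I can't help with the clock_gettime problem. I will say, though, that we found time() to be very expensive on virtual machines, so we run a daemon that allocates some shared memory and stores the current time in it. Every process that wants the time can then attach to the shared memory and read it instead of calling time().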
Answer 2
I found that the excessive stat calls on /etc/localtime were caused by a missing environment variable.
Try this:
echo $TZ
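If it is empty, set the variable in the appropriate place (for example /home/apache/.bash_profile). You need to set it for the user that runs the web server, then reload the daemon (apachectl graceful, etc.).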
TZ='Europe/London'; export TZ
Or use the correct time zone for your region (http://en.wikipedia.org/wiki/List_of_tz_database_time_zones).
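As a rough sketch of how the two answers might be combined, the snippet below sets TZ only when it is not already present, so an existing setting is left alone. Where it should live is an assumption about this particular setup: on Debian, /etc/apache2/envvars is read by apache2ctl and may be a suitable place, but neither answer confirms that for Passenger.

```shell
#!/bin/sh
# Sketch: give TZ a value only when the environment does not already have one.
# ':/etc/localtime' is the value from the first answer; a zone name such as
# 'Europe/London' from the second answer would be used the same way.
if [ -z "${TZ:-}" ]; then
    TZ=':/etc/localtime'
fi
# Export unconditionally so child processes (the web server) inherit it.
export TZ
echo "TZ is now: $TZ"
```

After reloading the web server, re-running strace against a Rack PID should show whether the repeated stat("/etc/localtime") calls have gone away.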