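We have an AWS Auto Scaling group (ASG) containing several EC2 instances whose CPU utilization never goes above 60%, and we don't know why. We have other servers with similar functionality and builds that don't have this problem.
All of the instances are M4 large, run a Rails application, and process Sidekiq jobs like most of our other servers. We compared the important configuration (nginx, puma, sidekiq, rails) between the instances that use the full CPU and this group that doesn't, looking for ideas, but we're still at a loss. This ASG runs Kafka while the others don't, but that is the only major difference.
I apologize in advance, because I know this is a very vague question. Here is the htop output: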
$
1 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 50.3%] Tasks: 40, 72 thr; 3 running
2 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 58.2%] Load average: 1.99 1.54 0.91
Mem[||||||||||||||||||||||||||| 1180/7984MB] Uptime: 00:19:02
Swp[ 0/0MB]
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
1153 ubuntu 20 0 1322M 634M 7732 S 104. 8.0 13:23.69 sidekiq 4.2.9 company-cs [19 of 20 busy]
7821 ubuntu 20 0 1322M 634M 7732 S 20.9 8.0 0:23.71 sidekiq 4.2.9 company-cs [19 of 20 busy]
7337 ubuntu 20 0 1322M 634M 7732 S 16.9 8.0 0:25.71 sidekiq 4.2.9 company-cs [19 of 20 busy]
15548 ubuntu 20 0 1322M 634M 7732 S 12.2 8.0 0:04.56 sidekiq 4.2.9 company-cs [19 of 20 busy]
7351 ubuntu 20 0 1322M 634M 7732 S 10.1 8.0 0:23.19 sidekiq 4.2.9 company-cs [19 of 20 busy]
1573 ubuntu 20 0 1322M 634M 7732 S 10.1 8.0 0:39.57 sidekiq 4.2.9 company-cs [19 of 20 busy]
1578 ubuntu 20 0 1322M 634M 7732 S 9.5 8.0 0:38.50 sidekiq 4.2.9 company-cs [19 of 20 busy]
1570 ubuntu 20 0 1322M 634M 7732 S 4.1 8.0 0:41.56 sidekiq 4.2.9 company-cs [19 of 20 busy]
1579 ubuntu 20 0 1322M 634M 7732 R 4.1 8.0 0:39.75 sidekiq 4.2.9 company-cs [19 of 20 busy]
1390 ubuntu 20 0 1155M 177M 6888 S 3.4 2.2 0:41.17 puma: cluster worker 1: 1078 [company-cs]
7455 ubuntu 20 0 1322M 634M 7732 S 3.4 8.0 0:24.58 sidekiq 4.2.9 company-cs [19 of 20 busy]
1333 ubuntu 20 0 622M 31324 2632 S 2.7 0.4 0:29.24 kafka_consumer-c=/home/ubuntu/company-cs/config/kafka.yml -d
8491 ubuntu 20 0 1322M 634M 7732 S 2.0 8.0 0:20.92 sidekiq 4.2.9 company-cs [19 of 20 busy]
13051 ubuntu 20 0 1322M 634M 7732 S 2.0 8.0 0:11.14 sidekiq 4.2.9 company-cs [19 of 20 busy]
1580 ubuntu 20 0 1322M 634M 7732 S 2.0 8.0 0:38.74 sidekiq 4.2.9 company-cs [19 of 20 busy]
1428 ubuntu 20 0 622M 31324 2632 S 2.0 0.4 0:06.85 kafka_consumer-c=/home/ubuntu/company-cs/config/kafka.yml -d
1584 ubuntu 20 0 1322M 634M 7732 S 1.4 8.0 0:38.88 sidekiq 4.2.9 company-cs [19 of 20 busy]
1386 ubuntu 20 0 1220M 185M 6948 S 1.4 2.3 0:41.74 puma: cluster worker 0: 1078 [company-cs]
7830 ubuntu 20 0 1322M 634M 7732 S 1.4 8.0 0:24.69 sidekiq 4.2.9 company-cs [19 of 20 busy]
1575 ubuntu 20 0 1322M 634M 7732 S 1.4 8.0 0:38.60 sidekiq 4.2.9 company-cs [19 of 20 busy]
1582 ubuntu 20 0 1322M 634M 7732 S 1.4 8.0 0:39.32 sidekiq 4.2.9 company-cs [19 of 20 busy]
13166 ubuntu 20 0 1322M 634M 7732 S 1.4 8.0 0:10.51 sidekiq 4.2.9 company-cs [19 of 20 busy]
15549 ubuntu 20 0 1322M 634M 7732 S 1.4 8.0 0:04.15 sidekiq 4.2.9 company-cs [19 of 20 busy]
1571 ubuntu 20 0 1322M 634M 7732 S 1.4 8.0 0:39.60 sidekiq 4.2.9 company-cs [19 of 20 busy]
17297 ubuntu 20 0 1155M 177M 6888 S 1.4 2.2 0:00.36 puma: cluster worker 1: 1078 [company-cs]
17618 ubuntu 20 0 1155M 177M 6888 S 1.4 2.2 0:00.25 puma: cluster worker 1: 1078 [company-cs]
14791 ubuntu 20 0 1220M 185M 6948 S 1.4 2.3 0:01.75 puma: cluster worker 0: 1078 [company-cs]
1567 ubuntu 20 0 1322M 634M 7732 S 0.7 8.0 0:41.17 sidekiq 4.2.9 company-cs [19 of 20 busy]
15347 ubuntu 20 0 1155M 177M 6888 S 0.7 2.2 0:01.93 puma: cluster worker 1: 1078 [company-cs]
15348 ubuntu 20 0 1220M 185M 6948 S 0.7 2.3 0:01.37 puma: cluster worker 0: 1078 [company-cs]
1431 www-data 20 0 86436 3368 1760 S 0.7 0.0 0:02.04 nginx: worker process
1408 ubuntu 20 0 622M 31324 2632 S 0.7 0.4 0:06.92 kafka_consumer-c=/home/ubuntu/company-cs/config/kafka.yml -d
1412 root 20 0 33936 26736 5012 S 0.0 0.3 0:12.75 /usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml -path.home /usr/share/filebeat -path.config /etc/filebeat -path.data /var/lib/filebeat -path.logs /var/log/filebeat
1522 root 20 0 33936 26736 5012 S 0.0 0.3 0:02.33 /usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml -path.home /usr/share/filebeat -path.config /etc/filebeat -path.data /var/lib/filebeat -path.logs /var/log/filebeat
1425 root 20 0 33936 26736 5012 S 0.0 0.3 0:01.21 /usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml -path.home /usr/share/filebeat -path.config /etc/filebeat -path.data /var/lib/filebeat -path.logs /var/log/filebeat
16111 ubuntu 20 0 1220M 185M 6948 S 0.0 2.3 0:00.85 puma: cluster worker 0: 1078 [company-cs]
17809 ubuntu 20 0 25852 3352 1448 R 0.0 0.0 0:00.09 htop
1394 ubuntu 20 0 622M 31324 2632 S 0.0 0.4 0:06.94 kafka_consumer-c=/home/ubuntu/company-cs/config/kafka.yml -d
1396 ubuntu 20 0 622M 31324 2632 S 0.0 0.4 0:07.73 kafka_consumer-c=/home/ubuntu/company-cs/config/kafka.yml -d
1 root 20 0 33620 2896 1484 S 0.0 0.0 0:02.89 /sbin/init
1424 root 20 0 33936 26736 5012 S 0.0 0.3 0:02.55 /usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml -path.home /usr/share/filebeat -path.config /etc/filebeat -path.data /var/lib/filebeat -path.logs /var/log/filebeat
1523 root 20 0 33936 26736 5012 S 0.0 0.3 0:02.35 /usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml -path.home /usr/share/filebeat -path.config /etc/filebeat -path.data /var/lib/filebeat -path.logs /var/log/filebeat
1295 ubuntu 20 0 1322M 634M 7732 S 0.0 8.0 0:00.18 sidekiq 4.2.9 company-cs [19 of 20 busy]
1592 ubuntu 20 0 1220M 185M 6948 S 0.0 2.3 0:00.40 puma: cluster worker 0: 1078 [company-cs]
1415 root 20 0 33936 26736 5012 S 0.0 0.3 0:01.67 /usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml -path.home /usr/share/filebeat -path.config /etc/filebeat -path.data /var/lib/filebeat -path.logs /var/log/filebeat
421 root 20 0 19488 652 460 S 0.0 0.0 0:00.09 upstart-udev-bridge --daemon
427 root 20 0 50000 1964 1000 S 0.0 0.0 0:00.23 /lib/systemd/systemd-udevd --daemon
549 root 20 0 15272 400 192 S 0.0 0.0 0:00.03 upstart-socket-bridge --daemon
597 root 20 0 10232 2412 116 S 0.0 0.0 0:00.00 dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0
833 messagebu 20 0 39224 1216 848 S 0.0 0.0 0:00.02 dbus-daemon --system --fork
886 root 20 0 43460 1764 1424 S 0.0 0.0 0:00.00 /lib/systemd/systemd-logind
F1Help F2Setup F3SearchF4FilterF5Tree F6SortByF7Nice -F8Nice +F9Kill F10Quit
Update
$ iostat -x
Linux 3.13.0-100-generic (ip-172-31-16-77) 11/27/2017 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
25.58 0.01 0.66 0.18 0.11 73.46
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.05 3.08 0.89 2.07 18.23 121.32 94.07 0.03 11.01 9.70 11.58 1.31 0.39
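A caveat on the figures above: when iostat is run without an interval argument, its first report is averaged over the time since boot, so the avg-cpu line is not a snapshot of the current load. To sample utilization over a short window instead, something like the following could be used. This is only a minimal sketch, assuming a Linux host where /proc/stat is readable; the function names (parse_cpu_lines, utilization, sample) are made up for illustration and are not from any of our tooling.

```python
import time


def parse_cpu_lines(stat_text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} from /proc/stat text."""
    result = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("cpu"):
            continue
        values = [int(v) for v in fields[1:]]
        # Field order: user nice system idle iowait irq softirq steal guest guest_nice
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        # Guest time is already included in user/nice, so only sum the first 8 fields.
        total = sum(values[:8])
        result[fields[0]] = (total - idle, total)
    return result


def utilization(before_text, after_text):
    """Return {cpu_name: percent busy} between two /proc/stat snapshots."""
    before = parse_cpu_lines(before_text)
    after = parse_cpu_lines(after_text)
    usage = {}
    for name, (busy1, total1) in after.items():
        busy0, total0 = before.get(name, (0, 0))
        delta_total = total1 - total0
        usage[name] = 100.0 * (busy1 - busy0) / delta_total if delta_total else 0.0
    return usage


def sample(interval_seconds=5, path="/proc/stat"):
    """Read /proc/stat twice, interval_seconds apart, and return utilization."""
    with open(path) as f:
        first = f.read()
    time.sleep(interval_seconds)
    with open(path) as f:
        second = f.read()
    return utilization(first, second)
```

Calling sample(5) on one of the affected instances returns a dictionary keyed by "cpu" (the aggregate), "cpu0" and "cpu1", which can be compared against the per-core meters in htop. Here iowait is counted as idle time, which is a choice made in this sketch, not the only reasonable one.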
Update 2