Slow-running processes on a Dell R720 server

We bought a Dell R720 server and installed CentOS 6.3 on it. We also have an older Dell server, also running CentOS 6.3. When we ran a simple disk benchmark, the old server came out about 10 times faster than the new one. The benchmark writes something to disk and flushes it in a loop. We want to find out why the new server is slow. The new server has two disks, which we configured as RAID-0. df -h produces the following:

[Older  server]

[xxx@xxx ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 97G 28G 64G 31% /
tmpfs 1.9G 11M 1.9G 1% /dev/shm
/dev/sda2 193G 103G 80G 57% /home

[New server]

[root@snap ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       116G  664M  109G   1% /
tmpfs            12G     0   12G   0% /dev/shm
/dev/sda1       7.7G  195M  7.2G   3% /boot
/dev/sdb2        77G  192M   73G   1% /home
/dev/sdb1       154G  232M  146G   1% /tmp
/dev/sda3        77G  2.4G   71G   4% /usr

How can we find out what is making the new server 10 times slower, and how can we fix it? Thanks.

Edit: adding lshw output.

[Older Server]
[duminda@snapoffice src]$ sudo ./lshw -class storage
[sudo] password for duminda: 
  *-storage               
       description: Serial Attached SCSI controller
       product: SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
       vendor: LSI Logic / Symbios Logic
       physical id: 0
       bus info: pci@0000:05:00.0
       logical name: scsi0
       version: 03
       width: 64 bits
       clock: 33MHz
       capabilities: storage pm pciexpress vpd msi msix bus_master cap_list rom
       configuration: driver=mpt2sas latency=0
       resources: irq:16 ioport:fc00(size=256) memory:df2b0000-df2bffff memory:df2c0000-df2fffff memory:df100000-df1fffff(prefetchable)
  *-ide:0
       description: IDE interface
       product: 5 Series/3400 Series Chipset 4 port SATA IDE Controller
       vendor: Intel Corporation
       physical id: 1f.2
       bus info: pci@0000:00:1f.2
       version: 05
       width: 32 bits
       clock: 66MHz
       capabilities: ide pm bus_master cap_list
       configuration: driver=ata_piix latency=0
       resources: irq:20 ioport:eca0(size=8) ioport:ec90(size=4) ioport:eca8(size=8) ioport:ec94(size=4) ioport:ecc0(size=16) ioport:ecd0(size=16)
  *-ide:1
       description: IDE interface
       product: 5 Series/3400 Series Chipset 2 port SATA IDE Controller
       vendor: Intel Corporation
       physical id: 1f.5
       bus info: pci@0000:00:1f.5
       logical name: scsi3
       version: 05
       width: 32 bits
       clock: 66MHz
       capabilities: ide pm bus_master cap_list emulated
       configuration: driver=ata_piix latency=0
       resources: irq:21 ioport:ecb0(size=8) ioport:ec98(size=4) ioport:ecb8(size=8) ioport:ec9c(size=4) ioport:ece0(size=16) ioport:ecf0(size=16)

[Newer Server]
[root@Snap src]# ./lshw -class storage
  *-storage               
       description: RAID bus controller
       product: MegaRAID SAS 2208 [Thunderbolt]
       vendor: LSI Logic / Symbios Logic
       physical id: 0
       bus info: pci@0000:03:00.0
       logical name: scsi0
       version: 05
       width: 64 bits
       clock: 33MHz
       capabilities: storage pm pciexpress vpd msi msix bus_master cap_list rom
       configuration: driver=megaraid_sas latency=0
       resources: irq:42 ioport:fc00(size=256) memory:ddffc000-ddffffff memory:ddf80000-ddfbffff memory:dd000000-dd01ffff(prefetchable)
  *-storage
       description: SATA controller
       product: C600/X79 series chipset 6-Port SATA AHCI Controller
       vendor: Intel Corporation
       physical id: 1f.2
       bus info: pci@0000:00:1f.2
       logical name: scsi5
       version: 05
       width: 32 bits
       clock: 66MHz
       capabilities: storage msi pm ahci_1.0 bus_master cap_list emulated
       configuration: driver=ahci latency=0
       resources: irq:124 ioport:ece8(size=8) ioport:ecf8(size=4) ioport:ecf0(size=8) ioport:ecfc(size=4) ioport:ecc0(size=32) memory:df8ff000-df8ff7ff

Edit: more information about the disks:

[Older Server]
[duminda@snapoffice ~]$ find /sys/ -type f -name "model"
/sys/devices/pci0000:00/0000:00:05.0/0000:05:00.0/host0/target0:1:0/0:1:0:0/model
/sys/devices/pci0000:00/0000:00:05.0/0000:05:00.0/host0/port-0:0/end_device-0:0/target0:0:0/0:0:0:0/model
/sys/devices/pci0000:00/0000:00:05.0/0000:05:00.0/host0/port-0:1/end_device-0:1/target0:0:1/0:0:1:0/model
/sys/devices/pci0000:00/0000:00:1f.5/host3/target3:0:0/3:0:0:0/model
[duminda@snapoffice ~]$ cat /sys/devices/pci0000:00/0000:00:05.0/0000:05:00.0/host0/target0:1:0/0:1:0:0/model
Virtual Disk    
[duminda@snapoffice ~]$ cat /sys/devices/pci0000:00/0000:00:05.0/0000:05:00.0/host0/port-0:0/end_device-0:0/target0:0:0/0:0:0:0/model
ST500NM0001     
[duminda@snapoffice ~]$ cat /sys/devices/pci0000:00/0000:00:05.0/0000:05:00.0/host0/port-0:1/end_device-0:1/target0:0:1/0:0:1:0/model
ST500NM0001     
[duminda@snapoffice ~]$ cat /sys/devices/pci0000:00/0000:00:1f.5/host3/target3:0:0/3:0:0:0/model
DVD+-RW TS-L633J

Googling ST500NM0001 turns up:

Storage Capacity: 500 GB
Maximum External Data Transfer Rate: 600 MBps (4.7 Gbps)
Rotational Speed: 7200 rpm
Buffer: 64 MB
Drive Interface: SAS
Drive Interface Standard: 6Gb/s SAS
Drive Type: Internal
Drive Width: 3.5"
Height: 1"
Width: 4"
Depth: 5.8"
Weight (Approximate): 1.34 lb
Limited Warranty: 3 Year

The newer server, however, reports this:

[Newer Server]
[root@Snap ~]# find /sys/ -type f -name "model"
/sys/devices/pci0000:00/0000:00:02.2/0000:03:00.0/host0/target0:2:0/0:2:0:0/model
/sys/devices/pci0000:00/0000:00:02.2/0000:03:00.0/host0/target0:2:1/0:2:1:0/model
/sys/devices/pci0000:00/0000:00:1f.2/host5/target5:0:0/5:0:0:0/model
[root@Snap ~]# cat /sys/devices/pci0000:00/0000:00:02.2/0000:03:00.0/host0/target0:2:0/0:2:0:0/model
PERC H710       
[root@Snap ~]# cat /sys/devices/pci0000:00/0000:00:02.2/0000:03:00.0/host0/target0:2:1/0:2:1:0/model
PERC H710       
[root@Snap ~]# cat /sys/devices/pci0000:00/0000:00:1f.2/host5/target5:0:0/5:0:0:0/model
DVD+-RW DS-8A9SH

Edit: the new server has 2 of these drives:

300GB  15K RPM, 6Gbps   SAS 3.5" Hot Plug Hard Drive

Edit: changing the I/O scheduler on the new server:

[snap@Snap ~]$ cat /sys/block/sda/queue/scheduler 
[noop] anticipatory deadline cfq 
[snap@Snap ~]$ cat /sys/block/sdb/queue/scheduler 
[noop] anticipatory deadline cfq 
[snap@Snap ~]$ time ./test_depth 

real    0m0.990s
user    0m0.239s
sys 0m0.352s
[snap@Snap ~]$ cat /sys/block/sda/queue/scheduler 
noop [anticipatory] deadline cfq 
[snap@Snap ~]$ cat /sys/block/sdb/queue/scheduler 
noop [anticipatory] deadline cfq 
[snap@Snap ~]$ time ./test_depth 

real    0m1.031s
user    0m0.172s
sys 0m0.444s
[snap@Snap ~]$ cat /sys/block/sda/queue/scheduler 
noop anticipatory [deadline] cfq 
[snap@Snap ~]$ cat /sys/block/sdb/queue/scheduler 
noop anticipatory [deadline] cfq 
[snap@Snap ~]$ time ./test_depth 

real    0m0.998s
user    0m0.150s
sys 0m0.448s
[snap@Snap ~]$ cat /sys/block/sda/queue/scheduler 
noop anticipatory deadline [cfq] 
[snap@Snap ~]$ cat /sys/block/sdb/queue/scheduler 
noop anticipatory deadline [cfq] 
[snap@Snap ~]$ time ./test_depth 

real    0m1.078s
user    0m0.228s
sys 0m0.350s
[snap@Snap ~]$ 

One run per scheduler is probably not enough, but there does not seem to be much of a difference.

Edit:

Reinstalled CentOS without logical volumes, using only plain ext4 partitions. Performance still did not improve.

Edit: the benchmark program. It is very simple.

  (run with these env vars)
  export GLOG_logbufsecs=0
  export GLOG_log_dir=/tmp

  ====================

  #include <glog/logging.h>
  #include <iostream>


  int main(int argc, char **argv)
  {
      google::InitGoogleLogging(argv[0]);

      for (int i = 0; i <100000; ++i)
      {
          DLOG(INFO) << "TEST";
      }

      return 0;
  }

CPU info

==========================================

[Old server CPU]
[duminda@snapoffice mdata]$ cat /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 30
model name  : Intel(R) Xeon(R) CPU           X3430  @ 2.40GHz
stepping    : 5
cpu MHz     : 2393.786
cache size  : 8192 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 4
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips    : 4787.57
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

...... 3 more like this

================================================
[New server CPUs]
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 62
model name  : Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
stepping    : 4
cpu MHz     : 1999.988
cache size  : 20480 KB
physical id : 0
siblings    : 16
core id     : 0
cpu cores   : 8
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c rdrand lahf_lm ida arat xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips    : 3999.97
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

........ 31 more like this

========================

bonnie++ output

[Old server]

[root@snapoffice bonnie++-1.03e]# time ./bonnie++ -n 0 -d /tmp/duminda -r 512 -b -u duminda
Using uid:511, gid:511.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
snapoffice       1G 54925  97 105195  24 123526   8 65593  99 +++++ +++ 384.3   0
snapoffice,1G,54925,97,105195,24,123526,8,65593,99,+++++,+++,384.3,0,,,,,,,,,,,,,

real    1m20.473s
user    0m33.528s
sys 0m4.819s

[New server]

[root@snap ~]# time bonnie++ -n 0 -d /tmp -r 512 -u snap -b
Using uid:500, gid:500.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
snap.R720        1G 86507  99 217958  31 187624  20 88467  99 +++++ +++ +++++ +++
snap.R720,1G,86507,99,217958,31,187624,20,88467,99,+++++,+++,+++++,+++,,,,,,,,,,,,,

real    0m40.172s
user    0m22.907s
sys 0m4.516s

============================================

Memory

[Old server]
[duminda@snapoffice mdata]$ cat /proc/meminfo 
MemTotal:        3913604 kB
MemFree:         1272208 kB
Buffers:          196168 kB
Cached:          1459716 kB
SwapCached:        73752 kB
Active:           867288 kB
Inactive:        1396600 kB
Active(anon):     325104 kB
Inactive(anon):   293588 kB
Active(file):     542184 kB
Inactive(file):  1103012 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       8191992 kB
SwapFree:        7683524 kB
Dirty:                80 kB
Writeback:             0 kB
AnonPages:        549976 kB
Mapped:            48912 kB
Shmem:             10684 kB
Slab:             247592 kB
SReclaimable:      86080 kB
SUnreclaim:       161512 kB
KernelStack:        7024 kB
PageTables:        79016 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    10148792 kB
Committed_AS:    7679752 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      285540 kB
VmallocChunk:   34359445552 kB
HardwareCorrupted:     0 kB
AnonHugePages:    204800 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        6756 kB
DirectMap2M:     4177920 kB

[New server]
[root@snap bonnie++-1.03e]# cat /proc/meminfo 
MemTotal:       24554684 kB
MemFree:        23312840 kB
Buffers:          217960 kB
Cached:           523140 kB
SwapCached:            0 kB
Active:           346236 kB
Inactive:         414888 kB
Active(anon):      20208 kB
Inactive(anon):       28 kB
Active(file):     326028 kB
Inactive(file):   414860 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      20479992 kB
SwapFree:       20479992 kB
Dirty:                 8 kB
Writeback:             0 kB
AnonPages:         20032 kB
Mapped:            14532 kB
Shmem:               220 kB
Slab:             163140 kB
SReclaimable:      86032 kB
SUnreclaim:        77108 kB
KernelStack:        6320 kB
PageTables:         3544 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    32757332 kB
Committed_AS:     120740 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      338928 kB
VmallocChunk:   34346663592 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        5056 kB
DirectMap2M:     2045952 kB
DirectMap1G:    23068672 kB

Edit: adding a bounty

I would like to know why my test program runs slowly on the new server and how to fix it (without removing the use of glog, since our programs use it heavily). Perhaps Matthew's answer is pointing in the right direction?

Let me know if you need any more information...

Answer 1

The benchmark writes something to disk and flushes it in a loop

Well, no. Compiling and running it shows that it does not flush that data at all, so something is off there.

Using the options you specified and the environment variables you used, I get the following from strace.

open("/tmp/glog.home.localdomain.matthew.log.INFO.20140213-224037.24470", O_WRONLY|O_CREAT|O_EXCL, 0664) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
fcntl(3, F_GETFL)                       = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fcntl(3, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
...

None of the flags on that open indicate that it will flush anything to disk...

What about the writes?

write(3, "I0213 22:40:37.370820 24470 glog"..., 46) = 46
fadvise64(3, 0, 4096, POSIX_FADV_DONTNEED) = 0
gettid()                                = 24470
write(3, "I0213 22:40:37.370925 24470 glog"..., 46) = 46
fadvise64(3, 0, 4096, POSIX_FADV_DONTNEED) = 0
gettid()                                = 24470
write(3, "I0213 22:40:37.370987 24470 glog"..., 46) = 46
fadvise64(3, 0, 4096, POSIX_FADV_DONTNEED) = 0
gettid()                                = 24470
...

No flushing there either.

It may be that on the new system, the kernel actually writes the dirty pages out to disk when given POSIX_FADV_DONTNEED, whereas the old system does not. That needs to be eliminated to make the test fairer and to avoid any possible issues with how the two systems handle their disks.

Note that using fadvise this way is silly, and asking for the thread ID on every write instead of saving it is silly too, but I digress..

Now, to be honest, I expect the results to be similar to what you saw before; but this test actually flushes to disk, unlike what your application does.

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <string.h>
#include <sysexits.h>
#include <err.h>
#include <limits.h>
#include <unistd.h>

/* Deliberate choice here to force writes 4 times to the same sector */
#define BUFSZ 128

int main() {
  char buf[BUFSZ];
  char path[PATH_MAX];
  char name[NAME_MAX];
  char *home = NULL;
  int fd, i;
  memset(name, 0, NAME_MAX);
  memset(path, 0, PATH_MAX);
  memset(buf, 'A', BUFSZ);
  buf[BUFSZ-1] = '\n';

  /* Figure out some useful path to put this */
  home = getenv("HOME");
  if (!home)
    errx(EX_SOFTWARE, "No HOME environment variable set. I give in!");

  /* Dont use this without using open with O_EXCL! */
  strcpy(name, "writethis.bin.XXXXXX");
  mktemp(name);

  snprintf(path, PATH_MAX, "%s/%s", home, name);

  /* Open the file with flushy flags */
  fd = open(path, O_WRONLY|O_APPEND|O_EXCL|O_CREAT|O_SYNC,
                                          S_IRUSR|S_IWUSR);
  if (fd < 0)
    err(EX_OSERR, "Cannot open file");

  /* Just need an inode, dont want it showing up in VFS.. */
  if (unlink(path) < 0)
    err(EX_OSERR, "Unlink failed. Something horrible probably happened");

  /* Dont pollute cache */
  if (posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED))
    err(EX_OSERR, "Fadvise failed?!");

  /* Write */    
  for (i=0; i < 1000; i++) {
    if (write(fd, buf, BUFSZ) < 0)
      err(EX_OSERR, "Cannot write to file");
  }

  close(fd);
}

Please run this program under time and post the results.

Answer 2

Can you provide more details about the benchmark? Does it reflect a real application usage pattern? If not, it may not be the best way to measure disk performance. That said, there are other things to consider...

  • Filesystem choice and tuning.
  • Mount options (noatime, nobarrier, etc.).
  • Raw partitions versus LVM.
  • Partition alignment.
  • The I/O scheduler.
  • RAID controller cache settings.
  • Disk type.

On the filesystem and tuning front, EL6.2+ has the tuned-adm framework, which helps you set some basic I/O performance parameters for your hardware. That includes setting the deadline I/O scheduler and disabling write barriers where it makes sense. On your new system, you would run yum install tuned tuned-utils and then tuned-adm profile enterprise-storage.

As for your partitioning scheme, the old server has fixed partition sizes while the new system uses LVM. LVM is another layer of abstraction and may make a difference in this case.

Also make sure your RAID cache is configured correctly. You typically want the cache biased toward writes.

It would help to know which disks are in the new server... but it probably does not matter much, since the old server's disks are among the slowest enterprise nearline SAS disks available. The new system likely has drives whose performance is at or above that of the old server's disks.

Answer 3

In almost every case, performance problems on our Dell servers have come down to the RAID card in use. Some of the cards they sell have extremely poor Linux performance.

See http://hwraid.le-vert.net/wiki/LSIMegaRAIDSAS versus http://hwraid.le-vert.net/wiki/LSIFusionMPTSAS2

Compare the two cards in these machines. The latter is a low-end card with probably no hardware acceleration at all, while the former is an excellent high-end card with good onboard hardware.

Answer 4

As many others have said, you probably need to make sure your test actually measures what you think it does. The problem may well be in the test itself, and the perceived latency or slowness may come from I/O getting tied up in the kernel/OS before it ever reaches the disk.

You might look at Dell's documentation on performance tuning via BIOS settings for 12th-generation servers. Most people do not know it exists, but you would be surprised at the difference it can make.

http://en.community.dell.com/cfs-file.ashx/__key/telligent-evolution-components-attachments/13-4491-00-00-20-24-87-40/12g_5F00_bios_5F00_tuning_5F00_for_5F00_performance_5F00_power.pdf
