How to determine the cause of poor CPU performance - AMD 75F3?


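I have two identically configured machines, but one of them performs much worse than the other. After some digging, I found that the CPUs in one machine are slower than the CPUs in the other. What could be causing this?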

Machine X performs well, even though it has a lot of other things running on it.

Two 75F3 CPUs with hyperthreading:

# cat /proc/cpuinfo | grep 'AMD EPYC' | uniq
model name  : AMD EPYC 75F3 32-Core Processor
# cat /proc/cpuinfo | grep 'AMD EPYC' | wc -l
128

Ubuntu 20.04:

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:    20.04
Codename:   focal
# uname -a
Linux X 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

CPU operating frequencies:

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | uniq
performance
# for i in {0..127} ; do cpufreq-info -f -c $i; done | tr '\n' ' '
3842008 3842393 3810377 3847464 3762634 3724354 3856476 3866478 3853858 3876690 3110999 3775277 3897576 3897856 3866237 3890866 3112874 3890274 3848609 3887705 3891703 3891721 3891739 3889757 3891764 3891630 3891859 3891608 3891414 3891912 3890534 3889927 3893343 3801748 3851380 3892385 3893213 3893485 3846021 3890994 3893815 3814882 3845462 3873715 3890898 3863111 3888041 3874408 3886703 3886258 3840979 3888331 3878444 3765557 3874062 3879276 3885420 3878773 3884834 3817770 3864404 3878672 3879196 3879807 3888213 3889747 3882999 3886457 3885751 3889096 3883811 3881295 3850021 3856605 3857041 3852233 3864256 3882260 3844967 3883135 3886052 3885600 3869050 3883402 3855283 3857940 3883602 3887286 3879753 3880053 3882848 3882790 3874810 3880445 3857427 3874503 3670686 3889217 3886244 3818280 3841524 3792650 3855806 3818744 3856255 3613503 3866995 3882701 3878329 3867638 3886501 3799640 3884642 3883070 3870675 3881828 3861174 3632788 3701190 3734291 3866216 3843889 3877786 3859997 3838879 3880956 3891517 3885391
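The 128 raw values above are hard to compare by eye. A minimal sketch for condensing such a list into count, minimum, maximum and mean, assuming `awk` is available; `summarize_freqs` is a hypothetical helper name, not part of cpufrequtils:

```shell
#!/usr/bin/env bash
# summarize_freqs: hypothetical helper (not part of cpufrequtils).
# Reads whitespace-separated frequencies in kHz on stdin and prints
# the sample count, minimum, maximum and mean.
summarize_freqs() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                f = $i + 0
                if (n == 0 || f < min) min = f
                if (n == 0 || f > max) max = f
                sum += f
                n++
            }
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "n=%d min=%.0f max=%.0f mean=%.0f kHz\n", n, min, max, sum / n
        }'
}

# Example with three of the values reported above for Machine X:
echo "3842008 3110999 3897856" | summarize_freqs
```

On a live machine the same function could be fed from the loop used above, for example `for i in {0..127} ; do cpufreq-info -f -c $i; done | summarize_freqs`.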

It performs well:

# sysbench --threads=128 cpu run | grep 'events per second'
    events per second: 291531.81
# sysbench --threads=64 cpu run | grep 'events per second'
    events per second: 219865.14
# sysbench --threads=32 cpu run | grep 'events per second'
    events per second: 131701.15
# sysbench --threads=16 cpu run | grep 'events per second'
    events per second: 68681.65
# sysbench --threads=1 cpu run | grep 'events per second'
    events per second:  4392.76

And the results are very consistent:

# sysbench --threads=1 cpu run | grep 'events per second'
    events per second:  4389.98
# sysbench --threads=1 cpu run | grep 'events per second'
    events per second:  4348.62
# sysbench --threads=1 cpu run | grep 'events per second'
    events per second:  4420.06

Full details:

# sysbench --threads=1 cpu run 
sysbench 1.0.18 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:  4437.62

General statistics:
    total time:                          10.0003s
    total number of events:              44380

Latency (ms):
         min:                                    0.22
         avg:                                    0.23
         max:                                    0.40
         95th percentile:                        0.23
         sum:                                 9994.09

Threads fairness:
    events (avg/stddev):           44380.0000/0.00
    execution time (avg/stddev):   9.9941/0.00

Machine A performs poorly, even though nothing is running on it.

Two 75F3 CPUs with hyperthreading:

# cat /proc/cpuinfo | grep 'AMD EPYC' | uniq
model name  : AMD EPYC 75F3 32-Core Processor
# cat /proc/cpuinfo | grep 'AMD EPYC' | wc -l
128

Ubuntu 20.04 (although with a slightly newer kernel):

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:    20.04
Codename:   focal
# uname -a
Linux A 5.4.0-109-generic #123-Ubuntu SMP Fri Apr 8 09:10:54 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

The cores are also running at high frequencies:

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | uniq
performance
# for i in {0..127} ; do cpufreq-info -f -c $i; done | tr '\n' ' '
3792504 3792625 3792921 3792063 3789019 3788851 3792581 3792453 3789152 3790125 3792243 3792418 3792430 3790784 3792419 3791023 3789898 3792592 3789838 3792114 3792516 3792412 3792426 3792738 3792582 3791124 3791205 3788344 3792380 3792250 3792582 3791047 3811723 3809352 3812555 3809227 3817034 3815969 3810839 3804632 3810456 3807646 3809023 3808747 3806460 3808462 3814759 3809964 3820911 3810358 3814496 3812186 3813329 3814405 3814068 3810848 3813043 3809750 3813568 3813902 3809526 3812241 3815648 3808690 3790942 3789219 3791206 3792433 3792086 3791228 3792924 3789743 3790897 3790870 3792924 3789337 3789202 3790756 3792417 3792425 3792185 3792919 3792504 3792626 3792758 3792641 3792502 3792734 3792095 3792924 3792587 3792587 3792505 3792508 3792760 3792585 3814533 3809134 3813383 3809371 3810692 3807309 3815974 3813425 3814072 3816579 3814508 3814330 3812141 3808752 3815138 3814503 3812601 3814500 3812525 3812979 3813312 3813668 3814139 3812830 3813202 3817025 3818386 3810813 3813120 3805907 3812057 3811185

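Its performance is much worse. In the 128-thread test it is about the same, but I think that is only because Machine X really does have a lot of other things running on it. In the tests with fewer threads, Machine A is consistently much slower: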

# sysbench --threads=128 cpu run | grep 'events per second'
    events per second: 297930.31
# sysbench --threads=64 cpu run | grep 'events per second'
    events per second: 139065.38
# sysbench --threads=32 cpu run | grep 'events per second'
    events per second: 72346.08
# sysbench --threads=16 cpu run | grep 'events per second'
    events per second: 32984.01
# sysbench --threads=1 cpu run | grep 'events per second'
    events per second:  1896.85
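To put the gap in numbers, the ratio of Machine X to Machine A throughput can be computed at each thread count. A small sketch, assuming `awk` is available; `compare_throughput` is a hypothetical helper, and the values are copied from the two sets of sysbench results in this post:

```shell
#!/usr/bin/env bash
# compare_throughput: hypothetical helper for this post.
# Each input line is: <threads> <events/s on X> <events/s on A>.
compare_throughput() {
    awk 'NF == 3 { printf "%d threads: X/A = %.2f\n", $1, $2 / $3 }'
}

# Values copied from the sysbench runs above (Machine X, then Machine A).
compare_throughput <<'EOF'
128 291531.81 297930.31
64  219865.14 139065.38
32  131701.15 72346.08
16  68681.65  32984.01
1   4392.76   1896.85
EOF
```

The ratio grows from roughly 0.98 at 128 threads to roughly 2.3 at a single thread, which is the pattern described above: the fewer threads are used, the further Machine A falls behind.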

And it is consistently slow:

# sysbench --threads=1 cpu run | grep 'events per second'
    events per second:  1901.41
# sysbench --threads=1 cpu run | grep 'events per second'
    events per second:  1939.84
# sysbench --threads=1 cpu run | grep 'events per second'
    events per second:  1805.10

Full details:

# sysbench --threads=1 cpu run 
sysbench 1.0.18 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:  1933.15

General statistics:
    total time:                          10.0018s
    total number of events:              19336

Latency (ms):
         min:                                    0.10
         avg:                                    0.31
         max:                                    1.07
         95th percentile:                        0.64
         sum:                                 6016.19

Threads fairness:
    events (avg/stddev):           19336.0000/0.00
    execution time (avg/stddev):   6.0162/0.00
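One detail in the two full outputs may be worth quantifying: the share of the roughly 10 s run that the single worker thread spent inside events, taken as the latency `sum` divided by `total time`. A minimal sketch, assuming `awk` is available; `busy_fraction` is a hypothetical helper, and the inputs are the figures printed above:

```shell
#!/usr/bin/env bash
# busy_fraction: hypothetical helper for this post.
# $1 = sysbench "total time" in seconds
# $2 = sysbench latency "sum" in milliseconds
# Prints the latency sum as a percentage of the total run time.
busy_fraction() {
    awk -v total_s="$1" -v sum_ms="$2" \
        'BEGIN { printf "%.1f%%\n", 100 * sum_ms / (total_s * 1000) }'
}

busy_fraction 10.0003 9994.09   # Machine X
busy_fraction 10.0018 6016.19   # Machine A
```

For Machine X this comes to about 99.9%, while for Machine A it is only about 60%. The sysbench output alone does not show where the remaining time went on Machine A, so this is a pointer for further investigation rather than a diagnosis.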

Thanks for any help.
