intel_pstate locks the CPU at 400 MHz on an Intel Xeon E5-2650 v4 running CoreOS


Hardware:

  • 4 Intel HNS2600TPR nodes in one chassis, with 2 power cords,
  • each with 1 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz,
  • 128 GB RAM.
[root@sigma02 Linux_X64]# dmidecode
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.
80 structures occupying 4366 bytes.
Table at 0x7A4EC000.

Handle 0x0000, DMI type 133, 12 bytes
OEM-specific Type
        Header and Data:
                85 0C 00 00 00 B0 67 7B 00 40 00 00

Handle 0x0001, DMI type 0, 24 bytes
BIOS Information
        Vendor: Intel Corporation
        Version: SE5C610.86B.01.01.2024.041020181059
        Release Date: 04/10/2018
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 16 MB
        Characteristics:
                PCI is supported
                PNP is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                LS-120 boot is supported
                ATAPI Zip drive boot is supported
                BIOS boot specification is supported
                Function key-initiated network boot is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 0.0
        Firmware Revision: 0.0

Handle 0x0002, DMI type 1, 27 bytes
System Information
        Manufacturer: Intel Corporation
        Product Name: S2600TPR
        Version: HNS2600TPR
        Serial Number: BQTP94490080
        UUID: 80c1fd42-f1ec-e811-906e-0017a4403562
        Wake-up Type: Power Switch
        SKU Number: SKU Number
        Family: Family

Handle 0x0003, DMI type 2, 17 bytes
Base Board Information
        Manufacturer: Intel Corporation
        Product Name: S2600TPR
        Version: H26989-274
        Serial Number: BQTP84500150
        Asset Tag: Base Board Asset Tag
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: Part Component
        Chassis Handle: 0x0000
        Type: Motherboard
        Contained Object Handles: 0

Handle 0x0004, DMI type 3, 24 bytes
Chassis Information
        Manufacturer: ...............................
        Type: Rack Mount Chassis
        Lock: Not Present
        Version: ..................
        Serial Number: ..................
        Asset Tag: ....................
        Boot-up State: Safe
        Power Supply State: Safe
        Thermal State: Safe
        Security Status: None
        OEM Information: 0x00000000
        Height: Unspecified
        Number Of Power Cords: Unspecified
        Contained Elements: 0
        SKU Number: Not Specified

Handle 0x000A, DMI type 11, 5 bytes
OEM Strings
        String 1: To Be Filled By O.E.M.

Handle 0x000C, DMI type 13, 22 bytes
BIOS Language Information
        Language Description Format: Abbreviated
        Installable Languages: 1
                enUS
        Currently Installed Language: enUS

Handle 0x000D, DMI type 27, 15 bytes
Cooling Device
        Temperature Probe Handle: 0x000B
        Type: Fan
        Status: OK
        Cooling Unit Group: 1
        OEM-specific Information: 0x00000000
        Nominal Speed: Unknown Or Non-rotating
        Description: Not Specified

Handle 0x000E, DMI type 28, 22 bytes
Temperature Probe
        Description: LM78A
        Location: System Management Module
        Status: <OUT OF SPEC>
        Maximum Value: Unknown
        Minimum Value: Unknown
        Resolution: Unknown
        Tolerance: Unknown
        Accuracy: Unknown
        OEM-specific Information: 0x00000000
        Nominal Value: Unknown

Handle 0x000F, DMI type 32, 11 bytes
System Boot Information
        Status: No errors detected

Handle 0x0010, DMI type 34, 11 bytes
Management Device
        Description: UNKNOWN
        Type: Unknown
        Address: 0x00000000
        Address Type: Unknown

Handle 0x0011, DMI type 35, 11 bytes
Management Device Component
        Description: To Be Filled By O.E.M.
        Management Device Handle: 0x000D
        Component Handle: 0x000A
        Threshold Handle: 0x000F

Handle 0x0012, DMI type 36, 16 bytes
Management Device Threshold Data

Handle 0x0014, DMI type 24, 5 bytes
Hardware Security
        Power-On Password Status: Not Implemented
        Keyboard Password Status: Not Implemented
        Administrator Password Status: Disabled
        Front Panel Reset Status: Disabled

Handle 0x0018, DMI type 39, 22 bytes
System Power Supply
        Power Unit Group: 1
        Location: To Be Filled By O.E.M.
        Name: To Be Filled By O.E.M.
        Manufacturer: To Be Filled By O.E.M.
        Serial Number: To Be Filled By O.E.M.
        Asset Tag: To Be Filled By O.E.M.
        Model Part Number: To Be Filled By O.E.M.
        Revision: To Be Filled By O.E.M.
        Max Power Capacity: Unknown
        Status: Present, Unknown
        Type: Unknown
        Input Voltage Range Switching: Unknown
        Plugged: Yes
        Hot Replaceable: No
        Input Voltage Probe Handle: 0x0000
        Cooling Device Handle: 0x000A
        Input Current Probe Handle: 0x0000

Handle 0x0019, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L1-Cache
        Configuration: Enabled, Not Socketed, Level 1
        Operational Mode: Write Back
        Location: Internal
        Installed Size: 768 kB
        Maximum Size: 768 kB
        Supported SRAM Types:
                Synchronous
        Installed SRAM Type: Synchronous
        Speed: Unknown
        Error Correction Type: Single-bit ECC
        System Type: Instruction
        Associativity: 8-way Set-associative

Handle 0x001A, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L2-Cache
        Configuration: Enabled, Not Socketed, Level 2
        Operational Mode: Varies With Memory Address
        Location: Internal
        Installed Size: 3072 kB
        Maximum Size: 3072 kB
        Supported SRAM Types:
                Synchronous
        Installed SRAM Type: Synchronous
        Speed: Unknown
        Error Correction Type: Single-bit ECC
        System Type: Unified
        Associativity: 8-way Set-associative

Handle 0x001B, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L3-Cache
        Configuration: Enabled, Not Socketed, Level 3
        Operational Mode: Varies With Memory Address
        Location: Internal
        Installed Size: 30720 kB
        Maximum Size: 30720 kB
        Supported SRAM Types:
                Synchronous
        Installed SRAM Type: Synchronous
        Speed: Unknown
        Error Correction Type: Single-bit ECC
        System Type: Unified
        Associativity: 20-way Set-associative

Handle 0x001C, DMI type 4, 48 bytes
Processor Information
        Socket Designation: CPU1
        Type: Central Processor
        Family: Xeon
        Manufacturer: Intel(R) Corporation
        ID: F1 06 04 00 FF FB EB BF
        Signature: Type 0, Family 6, Model 79, Stepping 1
        Flags:
                FPU (Floating-point unit on-chip)
                VME (Virtual mode extension)
                DE (Debugging extension)
                PSE (Page size extension)
                TSC (Time stamp counter)
                MSR (Model specific registers)
                PAE (Physical address extension)
                MCE (Machine check exception)
                CX8 (CMPXCHG8 instruction supported)
                APIC (On-chip APIC hardware supported)
                SEP (Fast system call)
                MTRR (Memory type range registers)
                PGE (Page global enable)
                MCA (Machine check architecture)
                CMOV (Conditional move instruction supported)
                PAT (Page attribute table)
                PSE-36 (36-bit page size extension)
                CLFSH (CLFLUSH instruction supported)
                DS (Debug store)
                ACPI (ACPI supported)
                MMX (MMX technology supported)
                FXSR (FXSAVE and FXSTOR instructions supported)
                SSE (Streaming SIMD extensions)
                SSE2 (Streaming SIMD extensions 2)
                SS (Self-snoop)
                HTT (Multi-threading)
                TM (Thermal monitor supported)
                PBE (Pending break enabled)
        Version: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
        Voltage: 1.8 V
        External Clock: 100 MHz
        Max Speed: 4000 MHz
        Current Speed: 2200 MHz
        Status: Populated, Enabled
        Upgrade: Socket LGA2011-3
        L1 Cache Handle: 0x0019
        L2 Cache Handle: 0x001A
        L3 Cache Handle: 0x001B
        Serial Number:
        Asset Tag:
        Part Number:
        Core Count: 12
        Core Enabled: 12
        Thread Count: 24
        Characteristics:
                64-bit capable
                Multi-Core
                Hardware Thread
                Execute Protection
                Enhanced Virtualization
                Power/Performance Control

Handle 0x001D, DMI type 4, 48 bytes
Processor Information
        Socket Designation: CPU2
        Type: Central Processor
        Family: <OUT OF SPEC>
        Manufacturer: Not Specified
        ID: 00 00 00 00 00 00 00 00
        Version: Not Specified
        Voltage: Unknown
        External Clock: Unknown
        Max Speed: 4000 MHz
        Current Speed: Unknown
        Status: Unpopulated
        Upgrade: Socket LGA2011-3
        L1 Cache Handle: Not Provided
        L2 Cache Handle: Not Provided
        L3 Cache Handle: Not Provided
        Serial Number: Not Specified
        Asset Tag: Not Specified
        Part Number: Not Specified
        Characteristics: None

Handle 0x001E, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Single-bit ECC
        Maximum Capacity: 128 GB
        Error Information Handle: Not Provided
        Number Of Devices: 4

Handle 0x001F, DMI type 19, 31 bytes
Memory Array Mapped Address
        Starting Address: 0x00000000000
        Ending Address: 0x00FFFFFFFFF
        Range Size: 64 GB
        Physical Array Handle: 0x001E
        Partition Width: 4

Handle 0x0020, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x001E
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 32 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM_A1
        Bank Locator: NODE 1
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2400 MT/s
        Manufacturer: Kingston
        Serial Number: 042424BA
        Asset Tag:
        Part Number: 9965640-035.C00G
        Rank: 2
        Configured Memory Speed: 2400 MT/s
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: Unknown

Handle 0x0021, DMI type 20, 35 bytes
Memory Device Mapped Address
        Starting Address: 0x00000000000
        Ending Address: 0x007FFFFFFFF
        Range Size: 32 GB
        Physical Device Handle: 0x0020
        Memory Array Mapped Address Handle: 0x001F
        Partition Row Position: 1

Handle 0x0022, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x001E
        Error Information Handle: Not Provided
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: DIMM
        Set: None
        Locator: DIMM_A2
        Bank Locator: NODE 1
        Type: DDR4
        Type Detail: Synchronous
        Speed: Unknown
        Manufacturer: NO DIMM
        Serial Number: NO DIMM
        Asset Tag:
        Part Number: NO DIMM
        Rank: Unknown
        Configured Memory Speed: Unknown
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: Unknown

Handle 0x0023, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x001E
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 32 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM_B1
        Bank Locator: NODE 1
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2400 MT/s
        Manufacturer: Kingston
        Serial Number: 042418D8
        Asset Tag:
        Part Number: 9965640-035.C00G
        Rank: 2
        Configured Memory Speed: 2400 MT/s
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: Unknown

Handle 0x0024, DMI type 20, 35 bytes
Memory Device Mapped Address
        Starting Address: 0x00800000000
        Ending Address: 0x00FFFFFFFFF
        Range Size: 32 GB
        Physical Device Handle: 0x0023
        Memory Array Mapped Address Handle: 0x001F
        Partition Row Position: 2

Handle 0x0025, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x001E
        Error Information Handle: Not Provided
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: DIMM
        Set: None
        Locator: DIMM_B2
        Bank Locator: NODE 1
        Type: DDR4
        Type Detail: Synchronous
        Speed: Unknown
        Manufacturer: NO DIMM
        Serial Number: NO DIMM
        Asset Tag:
        Part Number: NO DIMM
        Rank: Unknown
        Configured Memory Speed: Unknown
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: Unknown

[...]

Handle 0x004A, DMI type 9, 17 bytes
System Slot Information
        Designation: Riser 4, slot 1
        Type: x16 PCI Express 3 x16
        Current Usage: Available
        Length: Long
        ID: 0
        Characteristics:
                3.3 V is provided
                PME signal is supported
        Bus Address: 0000:80:02.0

Handle 0x004B, DMI type 10, 14 bytes
On Board Device 1 Information
        Type: Video
        Status: Enabled
        Description: ServerEngines Pilot III
On Board Device 2 Information
        Type: Ethernet
        Status: Enabled
        Description: Intel I350
On Board Device 3 Information
        Type: SATA Controller
        Status: Enabled
        Description: PCH Integrated SATA Controller
On Board Device 4 Information
        Type: SATA Controller
        Status: Enabled
        Description: PCH Integrated sSATA Controller
On Board Device 5 Information
        Type: Ethernet
        Status: Enabled
        Description: Intel I350

Handle 0x000B, DMI type 12, 5 bytes
System Configuration Options
        Option 1: J7C2: Close to boot with MFG mode
        Option 2: J7B2 2-3: Close for BIOS Image swap
        Option 3: J5D2 2-3: ME force update
        Option 4: J7A7 2-3: Recovery BIOS
        Option 5: J7A6 2-3: Clear password
        Option 6: J7A3 2-3: Clear CMOS register
        Option 7: J7A2 2-3: Force BMC update
        Option 8: J7B3 1-2: PLD program enable
        Option 9: J6C2: RAID key

Handle 0x0015, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: ServerEngines Pilot III
        Type: Video
        Status: Enabled
        Type Instance: 1
        Bus Address: 0000:05:00.0

Handle 0x0016, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: Intel I350
        Type: Ethernet
        Status: Enabled
        Type Instance: 1
        Bus Address: 0000:06:00.0

Handle 0x0017, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: PCH Integrated SATA Controller
        Type: SATA Controller
        Status: Enabled
        Type Instance: 1
        Bus Address: 0000:00:1f.2

Handle 0x004C, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: PCH Integrated sSATA Controller
        Type: SATA Controller
        Status: Enabled
        Type Instance: 2
        Bus Address: 0000:00:11.4

Handle 0x004D, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: Intel I350
        Type: Ethernet
        Status: Enabled
        Type Instance: 2
        Bus Address: 0000:06:00.1

Handle 0x004E, DMI type 148, 48 bytes
OEM-specific Type
        Header and Data:
                94 30 4E 00 02 01 02 03 04 05 06 07 08 09 0A 0B
                0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B
                1C 1D 1E 1F 20 21 22 23 00 00 00 00 00 00 00 00
        Strings:
                SE5C610.86B.01.01.2024.041020181059
                 1.81.11142
                3.1.3.43
                SDR Package 1.17
                FRU Ver 1.00
                SDR File 1.17
                M6F306F2_0000003C
                MEF406F1_0B00002A
                N/A
                BF176FCB
                19A708C8
                4.3.0
                4.3.0
                0.9.77
                N/A
                N/A
                N/A
                N/A
                N/A
                72DACB44
                8E0AAD10
                2F3F5F52
                80C5846D
                6AF7E746
                N/A
                3E8576C0
                N/A
                291CD502
                04BD7E53
                46D789BD
                AA029E92
                N/A
                N/A
                N/A
                N/A

Handle 0xFEFF, DMI type 127, 4 bytes
End Of Table

Software:

  • Various versions of CoreOS:

2512.2.0 Release date: May 19, 2020 Kernel: 4.19.123 rkt: 1.30.0 docker: 18.06.3 etcd: 3.3.20 systemd: 241 Ignition: 0.34.0

2345.3.0 Release date: March 2, 2020 Kernel: 4.19.106 rkt: 1.30.0 docker: 18.06.3 etcd: 3.3.18 systemd: 241 Ignition: 0.33.0

  • Kubernetes 1.17.0

Occasionally, all CPU cores on some of the nodes drop to 400 MHz, as shown below:

sigma01 sigma # cat /proc/cpuinfo
processor       : 23
vendor_id       : GenuineIntel
cpu family      : 6
model           : 79
model name      : Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
stepping        : 1
microcode       : 0xb000038
cpu MHz         : 412.535
cache size      : 30720 KB
physical id     : 0
siblings        : 24
core id         : 13
cpu cores       : 12
apicid          : 27
initial apicid  : 27
fpu             : yes
fpu_exception   : yes
cpuid level     : 20
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts md_clear flush_l1d
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit
bogomips        : 4389.81
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:
Every 2.0s: cat /proc/cpuinfo | grep MHz                                                                                  sigma01: Fri May 22 13:44:33 2020

cpu MHz         : 422.084
cpu MHz         : 413.291
cpu MHz         : 420.521
cpu MHz         : 421.059
cpu MHz         : 417.286
cpu MHz         : 417.869
cpu MHz         : 419.568
cpu MHz         : 413.913
cpu MHz         : 416.606
cpu MHz         : 416.767
cpu MHz         : 418.188
cpu MHz         : 422.938
cpu MHz         : 413.258
cpu MHz         : 414.553
cpu MHz         : 409.921
cpu MHz         : 407.358
cpu MHz         : 410.833
cpu MHz         : 413.726
cpu MHz         : 417.325
cpu MHz         : 414.957
cpu MHz         : 411.737
cpu MHz         : 415.100
cpu MHz         : 413.458
cpu MHz         : 411.024
sigma03 sigma # ls /sys/devices/system/cpu/cpufreq/policy0/
affected_cpus  cpuinfo_max_freq  cpuinfo_min_freq  cpuinfo_transition_latency  related_cpus  scaling_available_governors  scaling_cur_freq  scaling_driver  scaling_governor  scaling_max_freq  scaling_min_freq  scaling_setspeed

sigma03 sigma # cat /sys/devices/system/cpu/cpufreq/policy0/scaling_driver
intel_pstate

sigma03 sigma # cat /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed
<unsupported>
[root@sigma01 ~]# cpupower frequency-info
sh: modprobe: command not found
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 1.20 GHz - 2.90 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 1.20 GHz and 2.90 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 426 MHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes
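
One detail in the output above is that the reported frequency (426 MHz) is below the advertised hardware minimum (1.20 GHz), which suggests that something other than ordinary P-state selection is holding the clock down. A minimal sketch for spotting this on every cpufreq policy, assuming the standard sysfs layout; the helper name below_min_policies and its optional directory argument are hypothetical:

```shell
#!/bin/sh
# List cpufreq policies whose current frequency is below the advertised
# hardware minimum. below_min_policies is a hypothetical helper name;
# sysfs reports these values in kHz.
below_min_policies() {
    base=${1:-/sys/devices/system/cpu/cpufreq}
    for p in "$base"/policy[0-9]*; do
        [ -r "$p/scaling_cur_freq" ] && [ -r "$p/cpuinfo_min_freq" ] || continue
        cur=$(cat "$p/scaling_cur_freq" 2>/dev/null)
        min=$(cat "$p/cpuinfo_min_freq" 2>/dev/null)
        [ -n "$cur" ] && [ -n "$min" ] || continue
        if [ "$cur" -lt "$min" ]; then
            echo "${p##*/}: cur=${cur} kHz < min=${min} kHz"
        fi
    done
}

# With no argument the live sysfs tree is inspected.
below_min_policies
```

No output means every policy is at or above its minimum; any listed policy is running slower than the driver should ever request.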

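In the BIOS, power management for the HNS2600TPR is set to Performance, and the fans are also set to Performance. SpeedStep is enabled.

Running either of the following commands for a while does not fix the problem, but the CPU MHz value then settles at 411 without much fluctuation.

echo $(seq 1 24) | xargs -P 24 -n 1 sh -c 'while :;do :; done'

or

stress --cpu 24

Without load, the CPU MHz value fluctuates between 400 and 430.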
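
To compare the idle and loaded readings without eyeballing 24 lines, the per-core values can be condensed into one line. This is only a sketch; mhz_summary is a hypothetical helper name, and it reads /proc/cpuinfo-style text from standard input:

```shell
#!/bin/sh
# Summarise the "cpu MHz" lines of /proc/cpuinfo-style input as min/avg/max.
# mhz_summary is a hypothetical helper name, not a standard tool.
mhz_summary() {
    awk -F: '
        /^cpu MHz/ {
            v = $2 + 0
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            sum += v; n++
        }
        END {
            if (n > 0)
                printf "cores=%d min=%.0f avg=%.0f max=%.0f\n", n, min, sum / n, max
        }'
}

# Run against the live system only if the file is present.
if [ -r /proc/cpuinfo ]; then
    mhz_summary < /proc/cpuinfo
fi
```

Wrapping the last call in watch, as in the captures above, gives a running one-line view per node.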

Then I disabled intel_idle and switched intel_pstate to passive mode using these kernel parameters:

set linux_append="$linux_append intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=passive"

The driver is now intel_cpufreq:

[root@sigma02 ~]# cpupower frequency-info
sh: modprobe: command not found
analyzing CPU 0:
  driver: intel_cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 20.0 us
  hardware limits: 1.20 GHz - 2.90 GHz
  available cpufreq governors: performance
  current policy: frequency should be within 1.20 GHz and 2.00 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 1.20 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes

The C-state counters (the number of times each sleep state was entered) are now missing from /sys/devices/cpu/.... This did not improve anything.

intel_pstate is set correctly:

sigma04 sigma # cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
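
max_perf_pct is only one of several tunables in that directory, so it may be worth dumping the others as well. A minimal sketch, assuming the usual intel_pstate sysfs file names; dump_pstate is a hypothetical helper, and files that do not exist on a given kernel or mode are skipped:

```shell
#!/bin/sh
# Print every readable intel_pstate tunable as "name: value".
# dump_pstate is a hypothetical helper; the directory argument defaults to
# the standard sysfs location and can be overridden.
dump_pstate() {
    dir=${1:-/sys/devices/system/cpu/intel_pstate}
    for name in status no_turbo min_perf_pct max_perf_pct turbo_pct num_pstates; do
        f="$dir/$name"
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$name" "$(cat "$f")"
        fi
    done
}

dump_pstate
```

If all of these look sane on a node that is still at 400 MHz, the limit is probably being imposed outside the driver.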

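When the kernel parameters are changed and the servers are rebooted, they run at the correct CPU frequency, around 2500 MHz, for a short time, but in an unpredictable manner.

I'm not sure this information is enough for anyone to solve the problem, but some hints on how to dig deeper systematically and look for clues would be useful.

Currently, 3 of the 4 nodes run at a normal CPU clock of about 2.5 GHz and 1 is stuck at 400 MHz. At other times, other nodes get stuck at 400 MHz.

When the CPUs are at 400 MHz, their temperature is around 25-30 °C.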
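
At these temperatures thermal throttling seems unlikely, and the kernel's per-CPU throttle counters are one way to check. A rough sketch, assuming the x86 thermal_throttle sysfs directory is present; throttle_total is a hypothetical helper name:

```shell
#!/bin/sh
# Sum the per-CPU thermal throttle event counters under a sysfs-style tree.
# throttle_total is a hypothetical helper; the base directory can be overridden.
throttle_total() {
    base=${1:-/sys/devices/system/cpu}
    total=0
    for f in "$base"/cpu[0-9]*/thermal_throttle/core_throttle_count; do
        [ -r "$f" ] || continue
        total=$((total + $(cat "$f")))
    done
    echo "$total"
}

echo "core throttle events: $(throttle_total)"
```

A count that stays flat while the clock is pinned at 400 MHz would point away from the CPU's own thermal protection and toward an external cause.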

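Update

I found something. This just happened:

  • All 4 nodes were throttled to 400 MHz.
  • I issued sudo reboot on two nodes at the same time.
  • The remaining two nodes went to maximum speed, as shown here: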

Every 2.0s: cat /proc/cpuinfo | grep MHz                                                                                                                                                                    sigma02: Fri May 22 20:16:53 2020

cpu MHz         : 2494.244
cpu MHz         : 2507.850
cpu MHz         : 2502.095
cpu MHz         : 2494.222
cpu MHz         : 2501.193
cpu MHz         : 2494.445
[...]
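  • When the two nodes came back online, all nodes were throttled to 400 MHz.

As mentioned above, these nodes are "in one chassis, with 2 power cords". Does this mean one of the power supplies is failing? Why does this happen every 1-2 months? Why is it so erratic? It is certainly affected by whether the other nodes are online, but it is not clear how. For example, right now 2 nodes are online at the maximum speed of 2500 MHz, while 1 is booting and 1 is half-throttled: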

Every 2.0s: cat /proc/cpuinfo | grep MHz                                                                                  sigma03: Fri May 22 20:22:11 2020

cpu MHz         : 1197.514
cpu MHz         : 1197.706
cpu MHz         : 1197.370
cpu MHz         : 1197.358

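Also, all nodes drop to 400 MHz at the same time, regardless of each node's current load, state, or uptime.

Could something else be wrong? A network fault?

Thank you very much for your help!

Answer 1

If you have already fiddled with the BIOS settings, intel_pstate, and intel_idle and it still doesn't work, there may be a deeper problem.

You may want to check the power supplies. Most chassis provide a backup power supply, and the primary may not be delivering enough power to the nodes even while the backup sits on standby.

Unplugging the primary power supply solved the problem. All nodes now run at 2500 MHz. After I plugged it back in, it started blinking amber. It has most likely failed, yet throughout all of this the secondary power supply never took over.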
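
A failing supply like this can sometimes be spotted from the OS through the BMC before pulling any cables. The sketch below assumes ipmitool is installed and can reach the BMC, and that its sensor listing uses the pipe-separated "name | id | status | entity | reading" layout; not_ok_sensors is a hypothetical helper name:

```shell
#!/bin/sh
# Print sensor lines whose status column is anything other than "ok".
# Assumes the pipe-separated layout "name | id | status | entity | reading";
# not_ok_sensors is a hypothetical helper name.
not_ok_sensors() {
    awk -F'|' '
        NF >= 3 {
            status = $3
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            if (status != "ok") print
        }'
}

# Query the BMC only when ipmitool is available (needs root and an IPMI driver).
if command -v ipmitool >/dev/null 2>&1; then
    ipmitool sdr type "Power Supply" 2>/dev/null | not_ok_sensors
fi
```

The BMC's event log (ipmitool sel elist) may also record power supply failures or redundancy loss around the time the nodes dropped to 400 MHz.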
