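I just discovered thermald as a way to keep my machine from overheating, and I would like some basic advice on how to modify the XML configuration file. Below is the /etc/thermald/thermal-conf.xml on my system. From a few examples I found online, it seems to be set to start preventing overheating at 55 C (if I am reading <Temperature>55000</Temperature> correctly), yet my cores reach 94 C even with the fan running.
I am using a machine with an Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz.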
<?xml version="1.0"?>
<!--
use "man thermal-conf.xml" for details
-->
<!-- BEGIN -->
<ThermalConfiguration>
<Platform>
<Name>Generic X86 Laptop Device</Name>
<ProductName>EXAMPLE_SYSTEM</ProductName>
<Preference>QUIET</Preference>
<ThermalSensors>
<ThermalSensor>
<Type>TSKN</Type>
<AsyncCapable>1</AsyncCapable>
</ThermalSensor>
</ThermalSensors>
<ThermalZones>
<ThermalZone>
<Type>SKIN</Type>
<TripPoints>
<TripPoint>
<SensorType>TSKN</SensorType>
<Temperature>55000</Temperature>
<type>passive</type>
<ControlType>SEQUENTIAL</ControlType>
<CoolingDevice>
<index>1</index>
<type>rapl_controller</type>
<influence> 100 </influence>
<SamplingPeriod> 16 </SamplingPeriod>
</CoolingDevice>
<CoolingDevice>
<index>2</index>
<type>intel_powerclamp</type>
<influence> 100 </influence>
<SamplingPeriod> 12 </SamplingPeriod>
</CoolingDevice>
</TripPoint>
</TripPoints>
</ThermalZone>
</ThermalZones>
</Platform>
<!-- Thermal configuration example only -->
<Platform>
<Name>Example Platform Name</Name>
<!--UUID is optional, if present this will be matched -->
<!-- Both product name and UUID can contain
wild card "*", which matches any platform
-->
<UUID>Example UUID</UUID>
<ProductName>Example Product Name</ProductName>
<Preference>QUIET</Preference>
<ThermalSensors>
<ThermalSensor>
<!-- New Sensor with a type and path -->
<Type>example_sensor_1</Type>
<Path>/some_path</Path>
<AsyncCapable>0</AsyncCapable>
</ThermalSensor>
<ThermalSensor>
<!-- Already present in thermal sysfs,
enable this or add/change config
For example, here we are indicating that
sensor can do async events to avoid polling
-->
<Type>example_thermal_sysfs_sensor</Type>
<!-- If async capable, then we don't need to poll -->
<AsyncCapable>1</AsyncCapable>
</ThermalSensor>
<ThermalSensor>
<!-- Example of a virtual sensor. This sensor
depends on other real sensor or
virtual sensor.
E.g. here the temp will be
temp of example_sensor_1 * 0.5 + 10
-->
<Type>example_virtual_sensor</Type>
<Virtual>1</Virtual>
<SensorLink>
<SensorType>example_sensor_1</SensorType>
<Multiplier> 0.5 </Multiplier>
<Offset> 10 </Offset>
</SensorLink>
</ThermalSensor>
</ThermalSensors>
<ThermalZones>
<ThermalZone>
<Type>Example Zone type</Type>
<TripPoints>
<TripPoint>
<SensorType>example_sensor_1</SensorType>
<!-- Temperature at which to take action -->
<Temperature> 75000 </Temperature>
<!-- max/passive/active
If a MAX type is specified, then
daemon will use PID control
to aggressively throttle to avoid
reaching this temp.
-->
<type>max</type>
<!-- SEQUENTIAL | PARALLEL
When a trip point temp is violated, then
number of cooling device can be activated.
If control type is SEQUENTIAL then
It will exhaust first cooling device before trying
next.
-->
<ControlType>SEQUENTIAL</ControlType>
<CoolingDevice>
<index>1</index>
<type>example_cooling_device</type>
<!-- Influence will be used order cooling devices.
First cooling device will be used, which has
highest influence.
-->
<influence> 100 </influence>
<!-- Delay in using this cdev, this takes some time
to actually cool a zone
-->
<SamplingPeriod> 12 </SamplingPeriod>
</CoolingDevice>
</TripPoint>
</TripPoints>
</ThermalZone>
</ThermalZones>
<CoolingDevices>
<CoolingDevice>
<!--
Cooling device can be specified
by a type and optionally a sysfs path
If the type already present in thermal sysfs
no need of a path.
Compensation can use min/max and step size
to increasing cool the system.
Debounce period can be used to force
a waiting period for action
-->
<Type>example_cooling_device</Type>
<MinState>0</MinState>
<IncDecStep>10</IncDecStep>
<ReadBack> 0 </ReadBack>
<MaxState>50</MaxState>
<DebouncePeriod>5000</DebouncePeriod>
<!--
If there are no PID parameter
compensation increase step wise and exponentially
if single step is not able to change trend.
Alternatively a PID parameters can be specified
then next step will use PID calculation using
provided PID constants.
-->
<PidControl>
<kp>0.001</kp>
<kd>0.0001</kd>
<ki>0.0001</ki>
</PidControl>
</CoolingDevice>
</CoolingDevices>
</Platform>
</ThermalConfiguration>
<!-- END -->
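A note on units: the kernel's sysfs thermal files report temperatures in millidegrees Celsius, and as far as I can tell <Temperature> in thermal-conf.xml uses the same unit, so 55000 would mean 55 C. The sketch below lists each kernel thermal zone with its type and current reading, which makes it easier to compare against a trip point. The function names are made up for illustration, and the conversion truncates to one decimal rather than rounding.

```shell
#!/bin/sh
# Convert a non-negative millidegree-Celsius integer (e.g. 55000) to
# degrees Celsius with one decimal place, truncating (e.g. 55.0).
millideg_to_c() {
    echo "$(( $1 / 1000 )).$(( ($1 % 1000) / 100 ))"
}

# Print "zone  type  temperature" for every readable kernel thermal zone.
# Zones whose temp file cannot be read are skipped.
list_zones() {
    for zone in /sys/class/thermal/thermal_zone*; do
        [ -r "$zone/type" ] && [ -r "$zone/temp" ] || continue
        temp=$(cat "$zone/temp" 2>/dev/null) || continue
        printf '%s\t%s\t%s C\n' "${zone##*/}" "$(cat "$zone/type")" "$(millideg_to_c "$temp")"
    done
}

list_zones
```

On a machine like the one in this question, the x86_pkg_temp row is the CPU package temperature.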
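Following @heynnema's suggestion, I deleted the configuration file, stopped thermald, and ran sudo thermald --no-daemon --loglevel=info. Here is the output, but how does this help me build a new, more efficient configuration file?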
$ sudo thermald --no-daemon --loglevel=info
[1649408071][INFO]RAPL domain count 1
[1649408071][INFO]RAPL domain count 1
[1649408071][MSG]22 CPUID levels; family:model:stepping 0x6:a5:2 (6:165:2)
[1649408071][INFO]Running on a vanilla kernel
[1649408071][MSG]Polling mode is enabled: 4
[1649408071][INFO]sensor_update: type TSKN
[1649408071][INFO]sensor_update: type acpitz
[1649408071][INFO]sensor_update: type x86_pkg_temp
[1649408071][INFO]sensor_update: type pch_cometlake
[1649408071][INFO]sensor_update: type NGFF
[1649408071][INFO]sensor_update: type TMEM
[1649408071][INFO]sensor_update: type B0D4
[1649408071][INFO]sensor_update: type TVGA
[1649408071][INFO]thd_read_default_thermal_sensors loaded 8 sensors
[1649408071][INFO]dts /sys/devices/platform/coretemp.0/name doesn't exist
[1649408071][WARN]sensor id 11 : No temp sysfs for reading raw temp
[1649408071][WARN]sensor id 11 : No temp sysfs for reading raw temp
[1649408071][WARN]sensor id 11 : No temp sysfs for reading raw temp
[1649408071][INFO]INT3400 Base path is
[1649408071][INFO]failed to open /dev/acpi_thermal_rel
[1649408071][INFO]failed to open /dev/acpi_thermal_rel
[1649408071][INFO]TRT/ART read failed
[1649408071][INFO]Using config file /etc/thermald/thermal-conf.xml
I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
[1649408071][WARN]error: could not parse file /etc/thermald/thermal-conf.xml
[1649408071][INFO]sensor index:2 TSKN /sys/class/thermal/thermal_zone2/ Async:0
[1649408071][INFO]sensor index:0 acpitz /sys/class/thermal/thermal_zone0/ Async:0
[1649408071][INFO]sensor index:7 x86_pkg_temp /sys/class/thermal/thermal_zone7/ Async:1
[1649408071][INFO]sensor index:5 pch_cometlake /sys/class/thermal/thermal_zone5/ Async:0
[1649408071][INFO]sensor index:3 NGFF /sys/class/thermal/thermal_zone3/ Async:0
[1649408071][INFO]sensor index:1 TMEM /sys/class/thermal/thermal_zone1/ Async:0
[1649408071][INFO]sensor index:6 B0D4 /sys/class/thermal/thermal_zone6/ Async:0
[1649408071][INFO]sensor index:4 TVGA /sys/class/thermal/thermal_zone4/ Async:0
[1649408071][INFO]sensor index:8 hwmon /sys/class/hwmon/hwmon5/temp1_input Async:0
[1649408071][INFO]sensor index:9 hwmon /sys/class/hwmon/hwmon5/temp2_input Async:0
[1649408071][INFO]sensor index:10 hwmon /sys/class/hwmon/hwmon5/temp3_input Async:0
[1649408071][INFO]thd_read_default_cooling devices loaded 14 cdevs
[1649408071][INFO]ppcc limits max:47000000 min:10000000 min_win:28000000 step:1000000
[1649408071][INFO]set_pid_param 14 [-1000.100,10]
[1649408071][INFO]Use Default pstate drv settings
[1649408071][INFO]sysfs create failed
[1649408071][INFO]INT3400 Base path is
[1649408071][INFO]failed to open /dev/acpi_thermal_rel
[1649408071][INFO]failed to open /dev/acpi_thermal_rel
[1649408071][INFO]TRT/ART read failed
[1649408071][INFO]Using config file /etc/thermald/thermal-conf.xml
I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
[1649408071][WARN]error: could not parse file /etc/thermald/thermal-conf.xml
[1649408071][INFO]name = package-0
[1649408071][INFO]name = dram
[1649408071][INFO]sysfs read failed /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/constraint_0_max_power_uw
[1649408071][INFO]:powercap RAPL invalid max power limit range
[1649408071][INFO]Calculate dynamically phy_max
[1649408071][INFO]set_pid_param 18 [-0.4.0,0]
[1649408071][INFO]13: ath10k_thermal, C:0 MN: 0 MX:100 ST:1 pt:/sys/class/thermal/ rd_bk 1
[1649408071][INFO]1: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]11: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]8: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]6: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]4: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]2: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]12: intel_powerclamp, C:-1 MN: 0 MX:50 ST:5 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]0: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]10: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]9: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]7: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]5: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]3: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0
[1649408071][INFO]14: rapl_controller, C:47000000 MN: 47000000 MX:10000000 Inc ST:-2000000 Dec ST:-1000000 pt:/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/ rd_bk 1
[1649408071][INFO]15: intel_pstate, C:0 MN: 0 MX:10 ST:1 pt:/sys/devices/system/cpu/intel_pstate/ rd_bk 1
[1649408071][INFO]16: rapl_controller_dram, C:100000000 MN: 100000000 MX:0 ST:-500000 pt:/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/ rd_bk 1
[1649408071][INFO]17: LCD, C:0 MN: 0 MX:120000 ST:12000 pt:/sys/class/backlight/intel_backlight/ rd_bk 1
[1649408071][INFO]18: amdgpu, C:0 MN: 0 MX:0 ST:0 pt: rd_bk 1
[1649408071][INFO]thd_read_default_thermal_zones loaded 7 zones
[1649408071][INFO]INT3400 Base path is
[1649408071][INFO]zone cpu will be created
[1649408071][INFO]dts zone /sys/devices/platform/coretemp.0/name doesn't exist
[1649408071][INFO]/sys/class/hwmon/hwmon6/name->dell_smm
[1649408071][INFO]/sys/class/hwmon/hwmon4/name->pch_cometlake
[1649408071][INFO]/sys/class/hwmon/hwmon2/name->BAT0
[1649408071][INFO]/sys/class/hwmon/hwmon0/name->AC
[1649408071][INFO]/sys/class/hwmon/hwmon7/name->ath10k_hwmon
[1649408071][INFO]/sys/class/hwmon/hwmon5/name->coretemp
[1649408071][INFO]Buggy max temp: to close to critical 90000
[1649408071][INFO]Core temp DTS :critical 100000, max 90000, psv 95000
[1649408071][INFO]node type: Element, name: CoolingDevice value: rapl_controller
[1649408071][INFO]node type: Element, name: CoolingDevice value: intel_pstate
[1649408071][INFO]node type: Element, name: CoolingDevice value: intel_powerclamp
[1649408071][INFO]node type: Element, name: CoolingDevice value: cpufreq
[1649408071][INFO]node type: Element, name: CoolingDevice value: Processor
[1649408071][INFO]CDEVS order specified in thermal-cpu-cdev-order.xml
[1649408071][INFO]/sys/class/hwmon/hwmon3/name->nouveau
[1649408071][INFO]/sys/class/hwmon/hwmon1/name->acpitz
[1649408071][INFO]INT3400 Base path is
[1649408071][INFO]failed to open /dev/acpi_thermal_rel
[1649408071][INFO]failed to open /dev/acpi_thermal_rel
[1649408071][INFO]TRT/ART read failed
[1649408071][INFO]Using config file /etc/thermald/thermal-conf.xml
I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
[1649408071][WARN]error: could not parse file /etc/thermald/thermal-conf.xml
[1649408071][INFO]
ZONE DUMP BEGIN
[1649408071][INFO]
[1649408071][INFO]Zone 8: cpu, Active:1 Bind:0 Sensor_cnt:1
[1649408071][INFO]..sensors..
[1649408071][INFO]sensor index:7 x86_pkg_temp /sys/class/thermal/thermal_zone7/ Async:1
[1649408071][INFO]..trips..
[1649408071][INFO]index 0: type:passive temp:95000 hyst:0 zone id:8 sensor id:65535 control_type:1 cdev size:4
[1649408071][INFO]cdev[0] rapl_controller, Sampling period: 0
[1649408071][INFO] target_state:not defined
[1649408071][INFO]cdev[1] intel_pstate, Sampling period: 0
[1649408071][INFO] target_state:not defined
[1649408071][INFO]cdev[2] intel_powerclamp, Sampling period: 0
[1649408071][INFO] target_state:not defined
[1649408071][INFO]cdev[3] Processor, Sampling period: 0
[1649408071][INFO] target_state:not defined
[1649408071][INFO]index 1: type:polling temp:85500 hyst:0 zone id:8 sensor id:7 control_type:0 cdev size:0
[1649408071][INFO]
[1649408071][INFO]
ZONE DUMP END
[1649408071][INFO]Current user preference is 0
[1649408071][INFO]thd_engine_thread begin
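One way to make use of output like this: the "sensor index:" lines name the sensors thermald detected, and the numbered "N: name, C:..." lines name the cooling devices, and those names are what a configuration's <SensorType> and cooling device <type> refer to. The sketch below pulls both lists out of a saved log. It assumes the line format shown above, and the function names and the log file name are hypothetical.

```shell
#!/bin/sh
# Print "index name path" for each sensor line found on stdin,
# e.g. "7 x86_pkg_temp /sys/class/thermal/thermal_zone7/".
sensors_from_log() {
    sed -n 's/^.*sensor index:\([0-9][0-9]*\) \([^ ]*\) \([^ ]*\) Async:.*$/\1 \2 \3/p' |
        sort -n -u
}

# Print "index name" for each cooling-device line found on stdin,
# e.g. "12 intel_powerclamp".
cdevs_from_log() {
    sed -n 's/^.*[^0-9]\([0-9][0-9]*\): \([A-Za-z_][A-Za-z0-9_]*\), C:.*$/\1 \2/p' |
        sort -n -u
}

# Example usage (thermald.log is an assumed file holding the output above):
#   sensors_from_log < thermald.log
#   cdevs_from_log   < thermald.log
```

In the log above this would list x86_pkg_temp as sensor 7 and rapl_controller, intel_pstate, intel_powerclamp and Processor among the cooling devices, matching what the zone dump shows for the cpu zone.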
After editing, this is my configuration file, but the core temperature still rises to 90 C:
~$ cat /etc/thermald/thermal-conf.xml
<?xml version="1.0"?>
<ThermalConfiguration>
  <Platform>
    <Name>Generic X86 Laptop Device</Name>
    <ProductName>*</ProductName>
    <Preference>QUIET</Preference>
    <ThermalZones>
      <ThermalZone>
        <Type>cpu</Type>
        <TripPoints>
          <TripPoint>
            <SensorType>x86_pkg_temp</SensorType>
            <Temperature>55000</Temperature>
            <type>passive</type>
            <ControlType>PARALLEL</ControlType>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
</ThermalConfiguration>
Additional information:
~$ ls -al /etc/thermald
total 32
drwxr-xr-x 2 root root 4096 Apr 8 16:32 .
drwxr-xr-x 159 root root 12288 Apr 5 09:03 ..
-rw-r--r-- 1 root root 4605 Jan 15 2019 backup
-rw-rw-r-- 1 username username 816 Apr 8 16:32 thermal-conf.xml
-rw-r--r-- 1 root root 508 Jan 15 2019 thermal-cpu-cdev-order.xml
This also seems relevant (thermald is not active?):
$ sudo systemctl status thermald
● thermald.service - Thermal Daemon Service
Loaded: loaded (/lib/systemd/system/thermald.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Fri 2022-04-08 10:54:28 CEST; 1 weeks 0 days ago
Main PID: 1328 (code=exited, status=0/SUCCESS)
Apr 07 11:51:51 Precision-3551 thermald[1328]: error: could not parse file /etc/thermald/thermal-conf.xml
Apr 07 11:51:51 Precision-3551 thermald[1328]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Apr 07 11:51:51 Precision-3551 thermald[1328]: error: could not parse file /etc/thermald/thermal-conf.xml
Apr 07 11:51:51 Precision-3551 thermald[1328]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Apr 07 11:51:51 Precision-3551 thermald[1328]: error: could not parse file /etc/thermald/thermal-conf.xml
Apr 08 10:54:26 Precision-3551 systemd[1]: Stopping Thermal Daemon Service...
Apr 08 10:54:26 Precision-3551 thermald[1328]: Terminating ...
Apr 08 10:54:27 Precision-3551 thermald[1328]: terminating on user request ..
Apr 08 10:54:28 Precision-3551 systemd[1]: thermald.service: Succeeded.
Apr 08 10:54:28 Precision-3551 systemd[1]: Stopped Thermal Daemon Service.
I have now reactivated it with sudo service thermald restart, and now:
$ sudo systemctl status thermald
● thermald.service - Thermal Daemon Service
Loaded: loaded (/lib/systemd/system/thermald.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2022-04-15 22:26:23 CEST; 2s ago
Main PID: 609438 (thermald)
Tasks: 2 (limit: 18622)
Memory: 1.3M
CGroup: /system.slice/thermald.service
└─609438 /usr/sbin/thermald --systemd --dbus-enable --adaptive
Apr 15 22:26:23 Precision-3551 systemd[1]: Starting Thermal Daemon Service...
Apr 15 22:26:23 Precision-3551 systemd[1]: Started Thermal Daemon Service.
Apr 15 22:26:23 Precision-3551 thermald[609438]: 22 CPUID levels; family:model:stepping 0x6:a5:2 (6:165:2)
Apr 15 22:26:23 Precision-3551 thermald[609438]: 22 CPUID levels; family:model:stepping 0x6:a5:2 (6:165:2)
Apr 15 22:26:23 Precision-3551 thermald[609438]: Polling mode is enabled: 4
Apr 15 22:26:23 Precision-3551 thermald[609438]: sensor id 11 : No temp sysfs for reading raw temp
Apr 15 22:26:23 Precision-3551 thermald[609438]: sensor id 11 : No temp sysfs for reading raw temp
Apr 15 22:26:23 Precision-3551 thermald[609438]: sensor id 11 : No temp sysfs for reading raw temp
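Answer 1
From the comments:
A course on how to configure thermald could take a while. Start by checking man thermald and man thermal-conf.xml. The thermal-conf.xml file you are using is a generic file that serves only as an example. First delete it entirely, then restart thermald. If it cannot find an .xml file, it will try to run with its default configuration. See how that works. Otherwise, stop thermald and run it manually with sudo thermald --no-daemon --loglevel=info to let thermald tell you what it finds on its own, then use that to write your own .xml file.
Here is my thermal-conf.xml file...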
<?xml version="1.0"?>
<ThermalConfiguration>
  <Platform>
    <Name>Dell Inspiron-7700-AIO</Name>
    <ProductName>*</ProductName>
    <Preference>QUIET</Preference>
    <ThermalZones>
      <ThermalZone>
        <Type>cpu</Type>
        <TripPoints>
          <TripPoint>
            <SensorType>x86_pkg_temp</SensorType>
            <Temperature>65000</Temperature>
            <type>passive</type>
            <ControlType>PARALLEL</ControlType>
            <CoolingDevice>
              <index>0</index>
              <type>Fan</type>
              <influence>30</influence>
              <SamplingPeriod>10</SamplingPeriod>
            </CoolingDevice>
            <CoolingDevice>
              <index>5</index>
              <type>Processor</type>
              <influence>80</influence>
              <SamplingPeriod>5</SamplingPeriod>
            </CoolingDevice>
            <CoolingDevice>
              <index>13</index>
              <type>intel_powerclamp</type>
              <influence>100</influence>
              <SamplingPeriod>5</SamplingPeriod>
            </CoolingDevice>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
</ThermalConfiguration>
Update #1:
A minimal thermal-conf.xml file...
Just edit the <Name>, <SensorType>, and <Temperature> values. Then restart thermald as a daemon, or run it manually to watch what happens.
<?xml version="1.0"?>
<ThermalConfiguration>
  <Platform>
    <Name>Generic</Name>
    <ProductName>*</ProductName>
    <Preference>QUIET</Preference>
    <ThermalZones>
      <ThermalZone>
        <Type>cpu</Type>
        <TripPoints>
          <TripPoint>
            <SensorType>x86_pkg_temp</SensorType>
            <Temperature>55000</Temperature>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
</ThermalConfiguration>
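If you want to try several trip temperatures, a small helper can fill in the three values for you. This is only a sketch of the minimal layout above: make_thermal_conf is a made-up name, the temperature is assumed to be in millidegrees Celsius, and the name and sensor type are inserted as-is without XML escaping. It prints to stdout so nothing under /etc is touched until you decide to copy the result there.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal thermal-conf.xml to stdout.
#   $1 = platform name, $2 = sensor type, $3 = trip temperature (millidegrees C)
make_thermal_conf() {
    name=$1
    sensor=$2
    temp_mc=$3
    # Reject anything that is not a plain non-negative integer.
    case $temp_mc in
        ''|*[!0-9]*)
            echo "temperature must be an integer in millidegrees C" >&2
            return 1
            ;;
    esac
    cat <<EOF
<?xml version="1.0"?>
<ThermalConfiguration>
  <Platform>
    <Name>$name</Name>
    <ProductName>*</ProductName>
    <Preference>QUIET</Preference>
    <ThermalZones>
      <ThermalZone>
        <Type>cpu</Type>
        <TripPoints>
          <TripPoint>
            <SensorType>$sensor</SensorType>
            <Temperature>$temp_mc</Temperature>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
</ThermalConfiguration>
EOF
}

# Example usage (the output file name is an assumption):
#   make_thermal_conf Generic x86_pkg_temp 75000 > thermal-conf.xml.new
```

Review the generated file before copying it to /etc/thermald/thermal-conf.xml and restarting thermald.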
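To stress-test the CPU and watch how the temperature behaves, first install Vitals (https://extensions.gnome.org/extension/1460/vitals/) and set it to display the CPU package temperature and fan speed. Then run yes in a terminal and watch the CPU temperature change. You can also install the stress application to do the same thing as yes, but with more control.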
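If you would rather not install anything, the yes approach can be scripted. The sketch below starts a chosen number of yes processes (each one keeps roughly one core busy), prints the raw x86_pkg_temp reading once per second, and stops the load after the given time. The function names are made up, and it assumes your package sensor appears as x86_pkg_temp under /sys/class/thermal, as it does in the log in the question. If that sensor is not found, it still generates load but prints nothing.

```shell
#!/bin/sh
# Print the path of the temp file of the x86_pkg_temp zone, if there is one.
pkg_temp_file() {
    for zone in /sys/class/thermal/thermal_zone*; do
        if [ -r "$zone/type" ] && [ "$(cat "$zone/type")" = "x86_pkg_temp" ]; then
            echo "$zone/temp"
            return 0
        fi
    done
    return 1
}

# burn WORKERS SECONDS: run WORKERS copies of `yes` for SECONDS seconds,
# printing the package temperature (millidegrees C) once per second.
burn() {
    workers=$1
    seconds=$2
    pids=""
    i=0
    while [ "$i" -lt "$workers" ]; do
        yes > /dev/null &
        pids="$pids $!"
        i=$((i + 1))
    done
    temp_file=$(pkg_temp_file) || temp_file=""
    t=0
    while [ "$t" -lt "$seconds" ]; do
        if [ -n "$temp_file" ]; then
            echo "t=${t}s temp=$(cat "$temp_file") mC"
        fi
        sleep 1
        t=$((t + 1))
    done
    # Stop the load generators and reap them.
    kill $pids 2>/dev/null || true
    wait $pids 2>/dev/null || true
    return 0
}

# Example usage: load every core for one minute.
#   burn "$(nproc)" 60
```

Run it once with thermald stopped and once with it running to see whether your trip point actually changes the temperature curve.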