卸载模块时出现“OOPS”内核消息

卸载模块时出现“OOPS”内核消息

有时,当我尝试使用delete_module系统调用卸载模块时,我会收到来自内核的以下日志:

static inline int delete_module(const char *name, int flags)
{
    return syscall(__NR_delete_module, name, flags);
}

从回溯中,内核试图通过名称找到 kernfs_node 并将其删除。但在本例中名称为 NULL。这种随机问题的根本原因可能是什么?

[[ 135.142426] st_asm330lhh_spi spi2.0: Removing device 0
[[ 135.143021] st_asm330lhh_spi spi2.0: Removing device 1
[[ 135.145053] Unable to handle kernel paging request at virtual address 11ffe6ff
[[ 135.145072] pgd = d75a4000
[[ 135.145082] [11ffe6ff] *pgd=00000000
[[ 135.145097] Internal error: Oops: 5 [#1] PREEMPT ARM
[[ 135.145114] Modules linked in: st_asm330lhh(-) gpio50 [last unloaded: st_asm330lhh]
[[ 135.145139] CPU: 0 PID: 2361 Comm: lifecyclemgr Not tainted 3.18.48 #4
[[ 135.145152] task: cc93b900 ti: cc96c000 task.ti: cc96c000
[[ 135.145170] PC is at strlen+0x4/0x24
[[ 135.145187] LR is at kernfs_name_hash+0x10/0x6c
[[ 135.145201] pc : [<c01f4698>] lr : [<c012e5f0>] psr: 60010013
[[ 135.145201] sp : cc96def8 ip : 00000000 fp : b1bfe48c
[[ 135.145215] r10: 00000000 r9 : cc96c000 r8 : 00000800
[[ 135.145227] r7 : bf0060d0 r6 : 11ffe6ff r5 : 00000000 r4 : 11ffe6ff
[[ 135.145239] r3 : 11ffe6ff r2 : 00000000 r1 : 00000000 r0 : 11ffe6ff
[[ 135.145252] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[[ 135.145265] Control: 10c53c7d Table: 975a4059 DAC: 00000051
[[ 135.145277] Process lifecyclemgr (pid: 2361, stack limit = 0xcc96c208)
[[ 135.145289] Stack: (0xcc96def8 to 0xcc96e000)
[[ 135.145303] dee0: cc952b50 00000000
[[ 135.145321] df00: 11ffe6ff c012efc8 cc952870 11ffe6ff 00000000 11ffe6ff cc952870 00000000
[[ 135.145339] df20: bf0060d0 c012fd84 bf006088 00000000 cc887c00 c006d678 00000000 bf006088
[[ 135.145357] df40: c0d21008 60010013 00000800 c006d7fc 615f7473 33336d73 68686c30 00000000
[[ 135.145375] df60: 00000000 00000000 cc93b900 00000000 cc93b900 00000000 c0dcd3b8 cc96dfb0
[[ 135.145393] df80: b1bfe494 c0033d14 0096c000 29427fd7 4fe52e18 b1bfe4ac 4fe48a54 00000081
[[ 135.145411] dfa0: c000e624 c000e460 4fe52e18 b1bfe4ac 4fe52e18 00000800 00000000 00000800
[[ 135.145430] dfc0: 4fe52e18 b1bfe4ac 4fe48a54 00000081 002179bc 001c9790 0bebc200 b1bfe48c
[[ 135.145448] dfe0: b1bfe480 b1bfe470 4fe48914 4fafce30 60010010 4fe52e18 e594102c e2840024
[[ 135.145476] [<c01f4698>] (strlen) from [<c012e5f0>] (kernfs_name_hash+0x10/0x6c)
[[ 135.145500] [<c012e5f0>] (kernfs_name_hash) from [<c012efc8>] (kernfs_find_ns+0x70/0xd8)
[[ 135.145524] [<c012efc8>] (kernfs_find_ns) from [<c012fd84>] (kernfs_remove_by_name_ns+0x48/0x78)
[[ 135.145548] [<c012fd84>] (kernfs_remove_by_name_ns) from [<c006d678>] (free_module+0x184/0x1c4)
[[ 135.145569] [<c006d678>] (free_module) from [<c006d7fc>] (SyS_delete_module+0x144/0x1dc)
[[ 135.145591] [<c006d7fc>] (SyS_delete_module) from [<c000e460>] (ret_fast_syscall+0x0/0x44)
[[ 135.145609] Code: 1afffff9 e12fff1e c077e998 e1a03000 (e5d32000)
[[ 135.145622] ---[ end trace 7e807feb82cc2ca5 ]--- 

答案1

欢迎来到 Unix 和 Linux SE!

当模块被卸载时,它需要释放为其操作分配的所有系统资源。

也许您要卸载的模块存在错误,导致模块卸载过程kernfs_remove_by_name_ns()在卸载模块时使用 NULL 而不是有效参数进行调用?

快速浏览一下回溯中引用的函数表明问题可能是第二个参数为kernfs_remove_by_name_ns()NULL。

看着的源代码free_module(),按名称引用模块资源的第一件事将基于name模块中的成员struct module- 如果在删除模块时由于某种原因该成员已经为 NULL,那么将会出现 Oops。

相关内容