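I developed a module that acts as a block device emulator. When I write to the block device, I get the following message in dmesg and the module crashes. I can't find any hint about what is going wrong.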
[82013.054224] CPU: 9 PID: 15452 Comm: my_blk/0 Tainted: G B I E 3.19.0+ #1
[82013.054226] Hardware name: Dell Inc. PowerEdge R730xd/0599V5, BIOS 1.0.4 08/28/2014
[82013.054229] ffffffff81aa8fb8 ffff881fe1613778 ffffffff817a7f98 0000000000000000
[82013.054234] 0000000000000009 ffff881fe16137a8 ffffffff813c45b5 ffff880030243600
[82013.054239] ffff881fe0a4c798 ffff883feb3ced00 ffff881fe00c3900 ffff881fe16137b8
[82013.054244] Call Trace:
[82013.054251] [<ffffffff817a7f98>] dump_stack+0x4f/0x7b
[82013.054257] [<ffffffff813c45b5>] check_preemption_disabled+0xf5/0x110
[82013.054262] [<ffffffff813c4607>] debug_smp_processor_id+0x17/0x20
[82013.054276] [<ffffffffc03599dd>] megasas_build_io_fusion+0x54d/0x5a0 [megaraid_sas]
[82013.054287] [<ffffffffc0359af1>] megasas_build_and_issue_cmd_fusion+0x71/0x110 [megaraid_sas]
[82013.054296] [<ffffffffc034cf35>] megasas_queue_command+0x145/0x1b0 [megaraid_sas]
[82013.054301] [<ffffffff8154ae03>] scsi_dispatch_cmd+0x103/0x370
[82013.054306] [<ffffffff8154dcbf>] scsi_request_fn+0x4af/0x6c0
[82013.054311] [<ffffffff81374177>] __blk_run_queue+0x37/0x50
[82013.054315] [<ffffffff81374dd1>] queue_unplugged+0x41/0xf0
[82013.054320] [<ffffffff8137a042>] blk_flush_plug_list+0x1d2/0x210
[82013.054325] [<ffffffff8137a098>] blk_finish_plug+0x18/0x50
[82013.054331] [<ffffffff8127e54b>] ext4_writepages+0x55b/0xd10
[82013.054336] [<ffffffff812144ad>] ? __mnt_drop_write+0x2d/0x50
[82013.054342] [<ffffffff8109d624>] ? finish_task_switch+0x64/0x110
[82013.054348] [<ffffffff81187ea0>] do_writepages+0x20/0x40
[82013.054352] [<ffffffff8117c1a9>] __filemap_fdatawrite_range+0x59/0x60
[82013.054356] [<ffffffff8117c1e7>] filemap_write_and_wait_range+0x37/0x80
[82013.054360] [<ffffffff8127376a>] ext4_sync_file+0x12a/0x390
///// calling some functions in my_blk
[82013.054397] [<ffffffff81097b19>] kthread+0xc9/0xe0
[82013.054402] [<ffffffff81097a50>] ? flush_kthread_worker+0x90/0x90
[82013.054407] [<ffffffff817af7bc>] ret_from_fork+0x7c/0xb0
[82013.054412] [<ffffffff81097a50>] ? flush_kthread_worker+0x90/0x90
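Answer 1
The first line reads: CPU: 9 PID: 15452 Comm: my_blk/0 Tainted: G B
From here: https://www.novell.com/support/kb/doc.php?id=3582750
The taint flag B means: a process has been found in a bad page state, which indicates corruption of the virtual memory subsystem, possibly caused by malfunctioning RAM or cache memory.
So one possible cause is that your writes to the block device address the wrong region of virtual memory and somehow corrupt the VM subsystem.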
Answer 2
This problem occurs when I set CONFIG_PREEMPT=y in the kernel config. To fix it on Linux 3.19.0, I had to apply a patch that changes smp_processor_id() to raw_smp_processor_id() in drivers/scsi/megaraid/megaraid_sas_io_fusion.c.