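I have several virtual machines (VMs) in Azure, and I keep running into soft lockups.
This error brings the entire VM down and I can no longer reach it over SSH. Nothing works.
The first support engineer told me it was a lack of memory!? They clearly don't know what Linux is.
Another, better-prepared team told me it is a kernel bug and advised me to update the Linux kernel every X months. But if it is a kernel bug, wouldn't it be easier to point me to a patch?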
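Google gave me the following possibilities:
- Firmware bug
- Memory error
- Kernel bug
Every time I hit this problem, I have to create a new VM, attach the old disk, and copy the files over manually, which takes 1-2 days.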
Ubuntu Server 22_04-lts, Standard D4s v3 (4 vcpus, 16 GiB memory)
This time the log does not show a soft lockup:
[ OK ] Started Deferred execution scheduler.
[ OK ] Finished Permit User Sessions.
[ OK ] Finished Initialize hardware monitoring sensors.
Starting Hold until boot process finishes up...
Starting Terminate Plymouth Boot Screen...
[ OK ] Finished Hold until boot process finishes up.
[ OK ] Started Serial Getty on ttyS0.
Starting Set console scheme...
[ 8.891999] bash[881]: Starting vCon service...
[ OK ] Started LSB: disk temperature monitoring daemon.
[ OK ] Started OpenBSD Secure Shell server.
[ OK ] Finished Terminate Plymouth Boot Screen.
[ OK ] Finished Set console scheme.
[ OK ] Created slice system-getty.slice.
[ OK ] Started Getty on tty1.
[ OK ] Reached target Login Prompts.
[ OK ] Started System Logging Service.
[ OK ] Started LSB: automatic crash report generation.
[ OK ] Started chrony, an NTP client/server.
[ OK ] Reached target System Time Synchronized.
[ OK ] Started Daily apt download activities.
[ OK ] Started Daily apt upgrade and clean activities.
[ OK ] Started Periodic ext4 Onli…ata Check for All Filesystems.
[ OK ] Started Discard unused blocks once a week.
[ OK ] Started Refresh fwupd metadata regularly.
[ OK ] Started Daily rotation of log files.
[ OK ] Started Daily man-db regeneration.
[ OK ] Started Message of the Day.
[ OK ] Reached target Timers.
[ OK ] Started LSB: chkconfig 1234 13 13 for phase1.
[ OK ] Started LSB: chkconfig 1234 25 25 for phase2.
[ OK ] Finished Remove Stale Onli…ext4 Metadata Check Snapshots.
[ OK ] Started LSB: Record successful boot for GRUB.
Starting GRUB failed boot detection...
[ OK ] Finished Ubuntu FAN network setup.
[ OK ] Started Login Service.
[ OK ] Started Unattended Upgrades Shutdown.
[ OK ] Started OMI CIM Server.
[ OK ] Finished GRUB failed boot detection.
Starting Authorization Manager...
2022-09-16T19:17:05.183110Z INFO Daemon Azure Linux Agent Version:2.2.46
2022-09-16T19:17:05.185968Z INFO Daemon OS: ubuntu 20.04
2022-09-16T19:17:05.187845Z INFO Daemon Python: 3.8.10
2022-09-16T19:17:05.189891Z INFO Daemon CGroups Status: The cgroup filesystem is ready to use
2022-09-16T19:17:05.210040Z INFO Daemon Run daemon
[ OK ] Started Authorization Manager.
2022-09-16T19:17:05.222692Z INFO Daemon cloud-init is enabled: True
2022-09-16T19:17:05.225082Z INFO Daemon Using cloud-init for provisioning
2022-09-16T19:17:05.227800Z INFO Daemon Clean protocol and wireserver endpoint
[ OK ] Started Accounts Service.
2022-09-16T19:17:05.258653Z INFO Daemon Provisioning already completed, skipping.
2022-09-16T19:17:05.261467Z INFO Daemon RDMA capabilities are not enabled, skipping
2022-09-16T19:17:05.273774Z INFO Daemon Determined Agent WALinuxAgent-2.8.0.11 to be the latest agent
[ OK ] Started containerd container runtime.
Starting Docker Application Container Engine...
2022-09-16T19:17:05.778261Z INFO ExtHandler ExtHandler The agent will now check for updates and then will process extensions. Output to /dev/console will be suspended during those operations.
[ 10.415535] bash[1177]: Unit svagent.service could not be found.
[ OK ] Started Dispatcher daemon for systemd-networkd.
[ 10.497898] bash[1196]: Unit svagent.service could not be found.
[ OK ] Created slice Slice for Azure VM Extensions.
[ OK ] Created slice Slice for Az…kWatcherAgentLinux-1.4.2294.2.
[ OK ] Started /var/lib/waagent/M….4.2294.2/./install.sh enable.
[ OK ] Finished Configuration file for vCon service.
Starting Configuration file for vxagent service...
[ 13.709311] bash[1867]: VX Agent daemon is not running!
[ 13.711711] bash[1867]: appservice daemon is not running!
[ 13.713689] bash[1867]: Starting VX Agent daemon...
[ 13.715469] bash[1867]: Filter driver kernel module is not loaded. Attempting to load it, please wait...
Ubuntu 20.04.1 LTS coi-docker-dev ttyS0
coi-docker-dev login: [ 14.800506] bash[1867]: Filter driver kernel module is not loaded...
[ 14.805457] bash[1867]: Filter driver is not loaded. Cannot create /dev/involflt !
[ 14.808985] bash[1867]: Running the command:
[ 14.810746] bash[1867]: /usr/local/ASR/Vx/bin/appservice
[ 14.812818] bash[1867]: Running the command : /usr/local/ASR/Vx/bin/svagents
[ 15.045870] bash[1867]: VX Agent daemon is not running!
[ 15.047153] bash[1867]: appservice daemon is running...
[ 15.068089] bash[2084]: Starting UA Respawn daemon...
[ 18.905665] kernel BUG at include/linux/fs.h:3103!
[ 18.908255] invalid opcode: 0000 [#1] SMP PTI
[ 18.910179] CPU: 2 PID: 2512 Comm: whoami Not tainted 5.13.0-1028-azure #33~20.04.1-Ubuntu
[ 18.913732] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 18.917854] RIP: 0010:__fput+0x247/0x250
[ 18.919502] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 87 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
[ 18.927374] RSP: 0018:ffffa351c367be28 EFLAGS: 00010246
[ 18.929667] RAX: 0000000000000000 RBX: 00000000480a801d RCX: 0000007700000000
[ 18.932752] RDX: 0000000000000076 RSI: ffff95bdbaadc700 RDI: 0000000000000000
[ 18.935908] RBP: ffffa351c367be50 R08: 0000000000000077 R09: 0000000000000064
[ 18.938985] R10: ffffa351c367be28 R11: ffff95bdbaadc710 R12: ffff95bdbaadc700
[ 18.942124] R13: ffff95bd82dce650 R14: ffff95bd808c6de0 R15: ffff95bd82f5e900
[ 18.945239] FS: 0000000000000000(0000) GS:ffff95c12fd00000(0000) knlGS:0000000000000000
[ 18.948872] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.951361] CR2: 00007fffc030d259 CR3: 0000000151be2001 CR4: 00000000003706e0
[ 18.954341] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.957444] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 18.960573] Call Trace:
[ 18.961622] <TASK>
[ 18.962552] ____fput+0xe/0x10
[ 18.963945] task_work_run+0x6a/0xa0
[ 18.965541] exit_to_user_mode_prepare+0x280/0x290
[ 18.967554] syscall_exit_to_user_mode+0x17/0x40
[ 18.969620] do_syscall_64+0x6e/0xb0
[ 18.971298] ? do_syscall_64+0x6e/0xb0
[ 18.972934] ? irqentry_exit+0x19/0x30
[ 18.974445] ? exc_page_fault+0x83/0x160
[ 18.976000] ? asm_exc_page_fault+0x8/0x30
[ 18.977668] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 18.981001] RIP: 0033:0x7fc24ed93100
[ 18.982498] Code: Unable to access opcode bytes at RIP 0x7fc24ed930d6.
[ 18.985167] RSP: 002b:00007fffc030d080 EFLAGS: 00000200 ORIG_RAX: 000000000000003b
[ 18.988467] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 18.991381] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 18.994265] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 18.997125] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 19.000421] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 19.003522] </TASK>
[ 19.004444] Modules linked in: veth xt_nat xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat br_netfilter bridge stp llc ip6table_filter ip6_tables iptable_filter aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_owner iptable_security xt_tcpudp bpfilter joydev kvm_intel hid_generic kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel pata_acpi crypto_simd cryptd hid_hyperv serio_raw hv_balloon hid hyperv_keyboard hyperv_fb hv_utils hv_netvsc sch_fq_codel drm i2c_core ip_tables x_tables autofs4
[ 19.028870] ---[ end trace 4f761aeb2bf24c9a ]---
[ 19.031397] RIP: 0010:__fput+0x247/0x250
[ 19.033646] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 87 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
[ 19.042383] RSP: 0018:ffffa351c367be28 EFLAGS: 00010246
[ 19.045107] RAX: 0000000000000000 RBX: 00000000480a801d RCX: 0000007700000000
[ 19.048706] RDX: 0000000000000076 RSI: ffff95bdbaadc700 RDI: 0000000000000000
[ 19.052440] RBP: ffffa351c367be50 R08: 0000000000000077 R09: 0000000000000064
[ 19.055924] R10: ffffa351c367be28 R11: ffff95bdbaadc710 R12: ffff95bdbaadc700
[ 19.059398] R13: ffff95bd82dce650 R14: ffff95bd808c6de0 R15: ffff95bd82f5e900
[ 19.063378] FS: 0000000000000000(0000) GS:ffff95c12fd00000(0000) knlGS:0000000000000000
[ 19.068056] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.071642] CR2: 00007fc24ed930d6 CR3: 0000000151be2001 CR4: 00000000003706e0
[ 19.075474] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 19.079042] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 19.115281] eth0: renamed from veth68fae92
[ 19.148826] IPv6: ADDRCONF(NETDEV_CHANGE): veth3f7f483: link becomes ready
[ 19.153829] br-42b41f9abe8a: port 4(veth3f7f483) entered blocking state
[ 19.157551] br-42b41f9abe8a: port 4(veth3f7f483) entered forwarding state
[ 19.245270] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 19.249994] #PF: supervisor read access in kernel mode
[ 19.253898] #PF: error_code(0x0000) - not-present page
[ 19.257259] PGD 0 P4D 0
[ 19.258932] Oops: 0000 [#2] SMP PTI
[ 19.261111] CPU: 3 PID: 2539 Comm: stat Tainted: G D 5.13.0-1028-azure #33~20.04.1-Ubuntu
[ 19.266196] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 19.271298] RIP: 0010:__fput+0x140/0x250
[ 19.274212] Code: e7 48 c7 c6 00 ef 70 88 e8 1d 25 e1 ff 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8b 7c 24 18 4d 8d 5c 24 10 4c 8b 4f 30 <41> 0f b7 01 66 25 00 f0 66 3d 00 40 74 67 48 3b 7f 18 74 70 b9 01
[ 19.284539] RSP: 0018:ffffa351c36dbe48 EFLAGS: 00010246
[ 19.288324] RAX: 00000000000a0003 RBX: 00000000000a0003 RCX: 0000000000000000
[ 19.292655] RDX: 0000000000000002 RSI: ffff95bdbaadc700 RDI: ffff95bdcb4fb3c0
[ 19.296957] RBP: ffffa351c36dbe70 R08: ffff95bd86bfcb40 R09: 0000000000000000
[ 19.301324] R10: 0000000000000008 R11: ffff95bdb95d3910 R12: ffff95bdb95d3900
[ 19.307743] R13: ffff95bd8164d770 R14: ffff95bd808c67a0 R15: ffff95bdcb4fb3c0
[ 19.311599] FS: 0000000000000000(0000) GS:ffff95c12fd80000(0000) knlGS:0000000000000000
[ 19.316090] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.319520] CR2: 0000000000000000 CR3: 000000013c7b4004 CR4: 00000000003706e0
[ 19.323911] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 19.328065] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 19.332313] Call Trace:
[ 19.334557] <TASK>
[ 19.336777] ____fput+0xe/0x10
[ 19.339281] task_work_run+0x6a/0xa0
[ 19.341956] exit_to_user_mode_prepare+0x280/0x290
[ 19.345035] syscall_exit_to_user_mode+0x17/0x40
[ 19.347817] do_syscall_64+0x6e/0xb0
[ 19.350447] ? irqentry_exit+0x19/0x30
[ 19.353034] ? exc_page_fault+0x83/0x160
[ 19.355516] ? asm_exc_page_fault+0x8/0x30
[ 19.358111] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 19.360806] RIP: 0033:0x7f7533f6a100
[ 19.362860] Code: Unable to access opcode bytes at RIP 0x7f7533f6a0d6.
[ 19.366205] RSP: 002b:00007ffe7a1e7840 EFLAGS: 00000200 ORIG_RAX: 000000000000003b
[ 19.369937] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 19.373596] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 19.377091] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 19.380467] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 19.384394] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 19.387872] </TASK>
[ 19.389569] Modules linked in: veth xt_nat xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat br_netfilter bridge stp llc ip6table_filter ip6_tables iptable_filter aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_owner iptable_security xt_tcpudp bpfilter joydev kvm_intel hid_generic kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel pata_acpi crypto_simd cryptd hid_hyperv serio_raw hv_balloon hid hyperv_keyboard hyperv_fb hv_utils hv_netvsc sch_fq_codel drm i2c_core ip_tables x_tables autofs4
[ 19.417088] CR2: 0000000000000000
[ 19.419267] ---[ end trace 4f761aeb2bf24c9b ]---
[ 19.421779] RIP: 0010:__fput+0x247/0x250
[ 19.424259] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 87 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
[ 19.433940] RSP: 0018:ffffa351c367be28 EFLAGS: 00010246
[ 19.436833] RAX: 0000000000000000 RBX: 00000000480a801d RCX: 0000007700000000
[ 19.440416] RDX: 0000000000000076 RSI: ffff95bdbaadc700 RDI: 0000000000000000
[ 19.443932] RBP: ffffa351c367be50 R08: 0000000000000077 R09: 0000000000000064
[ 19.447911] R10: ffffa351c367be28 R11: ffff95bdbaadc710 R12: ffff95bdbaadc700
[ 19.451484] R13: ffff95bd82dce650 R14: ffff95bd808c6de0 R15: ffff95bd82f5e900
[ 19.455605] FS: 0000000000000000(0000) GS:ffff95c12fd80000(0000) knlGS:0000000000000000
[ 19.459706] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.463209] CR2: 00007f7533f6a0d6 CR3: 000000013c7b4004 CR4: 00000000003706e0
[ 19.467250] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 19.470990] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 51.593567] hv_balloon: Max. dynamic memory size: 16384 MB
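
To compare crashes across several VMs, it may help to pull only the crash markers and the kernel release out of a saved serial console log like the one above. The following is a minimal Python sketch, not a definitive tool: the function name `summarize_crashes`, the marker list, and the regular expressions are assumptions based on the lines in this log, and the marker list is not exhaustive.

```python
import re
import sys

# Kernel/console timestamps of the form "[   18.905665] message".
TIMESTAMP = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)$")

# Substrings treated as crash markers (an assumed, non-exhaustive list).
MARKERS = (
    "kernel BUG at",
    "BUG: kernel NULL pointer dereference",
    "BUG: soft lockup",
    "Oops:",
)

# Oops header line, e.g.
# "CPU: 2 PID: 2512 Comm: whoami Not tainted 5.13.0-1028-azure #33~..."
CPU_LINE = re.compile(r"Comm: (?P<comm>\S+) .*? (?P<kernel>\d+\.\d+\.\d+-\S+) #")


def summarize_crashes(lines):
    """Return (markers, contexts) found in an iterable of console log lines.

    markers  -> list of (timestamp_seconds, message) in log order
    contexts -> sorted list of unique (command, kernel_release) pairs
    """
    markers = []
    contexts = set()
    for raw in lines:
        match = TIMESTAMP.match(raw.strip())
        if not match:
            continue  # systemd status lines, agent logs, and so on
        msg = match.group("msg")
        if any(marker in msg for marker in MARKERS):
            markers.append((float(match.group("ts")), msg))
        cpu = CPU_LINE.search(msg)
        if cpu:
            contexts.add((cpu.group("comm"), cpu.group("kernel")))
    return markers, sorted(contexts)


if __name__ == "__main__":
    # Reads the console log from standard input.
    found, seen = summarize_crashes(sys.stdin)
    for ts, msg in found:
        print(f"{ts:12.6f}  {msg}")
    for comm, kernel in seen:
        print(f"crashing command: {comm}  kernel: {kernel}")
```

Saved under a hypothetical name such as `summarize_crashes.py`, it could be run as `python3 summarize_crashes.py < console.log`. The kernel release it reports is the one running at crash time, which is what support would need in order to say whether a fixed kernel exists.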