I have some virtual machines in Azure, and I am getting soft lockups

I have some virtual machines (VMs) in Azure, and I am getting soft lockups.

This error brings the whole VM down and I can no longer reach it over SSH. Nothing works.

The first support person told me it was out of memory!? They clearly don't know what Linux is.

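Another, better-prepared team told me this is a kernel bug and recommended that I update the Linux kernel every X months. But if it is a kernel bug, wouldn't it be easier to point me to a patch?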

Google gave me the following possibilities:

  • Firmware bug
  • Memory error
  • Kernel bug

Every time I hit this problem, I have to create a new VM, attach the old disk, and copy the files over manually, which takes 1-2 days.

Ubuntu Server 22_04-lts, Standard D4s v3 (4 vCPUs, 16 GiB memory)

This time the log does not show a soft lockup:

[  OK  ] Started Deferred execution scheduler.
[  OK  ] Finished Permit User Sessions.
[  OK  ] Finished Initialize hardware monitoring sensors.
         Starting Hold until boot process finishes up...
         Starting Terminate Plymouth Boot Screen...
[  OK  ] Finished Hold until boot process finishes up.
[  OK  ] Started Serial Getty on ttyS0.
         Starting Set console scheme...
[    8.891999] bash[881]: Starting vCon service...
[  OK  ] Started LSB: disk temperature monitoring daemon.
[  OK  ] Started OpenBSD Secure Shell server.
[  OK  ] Finished Terminate Plymouth Boot Screen.
[  OK  ] Finished Set console scheme.
[  OK  ] Created slice system-getty.slice.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started System Logging Service.
[  OK  ] Started LSB: automatic crash report generation.
[  OK  ] Started chrony, an NTP client/server.
[  OK  ] Reached target System Time Synchronized.
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Started Periodic ext4 Onli…ata Check for All Filesystems.
[  OK  ] Started Discard unused blocks once a week.
[  OK  ] Started Refresh fwupd metadata regularly.
[  OK  ] Started Daily rotation of log files.
[  OK  ] Started Daily man-db regeneration.
[  OK  ] Started Message of the Day.
[  OK  ] Reached target Timers.
[  OK  ] Started LSB: chkconfig 1234 13 13 for phase1.
[  OK  ] Started LSB: chkconfig 1234 25 25 for phase2.
[  OK  ] Finished Remove Stale Onli…ext4 Metadata Check Snapshots.
[  OK  ] Started LSB: Record successful boot for GRUB.
         Starting GRUB failed boot detection...
[  OK  ] Finished Ubuntu FAN network setup.
[  OK  ] Started Login Service.
[  OK  ] Started Unattended Upgrades Shutdown.
[  OK  ] Started OMI CIM Server.
[  OK  ] Finished GRUB failed boot detection.
         Starting Authorization Manager...
2022-09-16T19:17:05.183110Z INFO Daemon Azure Linux Agent Version:2.2.46
2022-09-16T19:17:05.185968Z INFO Daemon OS: ubuntu 20.04
2022-09-16T19:17:05.187845Z INFO Daemon Python: 3.8.10
2022-09-16T19:17:05.189891Z INFO Daemon CGroups Status: The cgroup filesystem is ready to use
2022-09-16T19:17:05.210040Z INFO Daemon Run daemon
[  OK  ] Started Authorization Manager.
2022-09-16T19:17:05.222692Z INFO Daemon cloud-init is enabled: True
2022-09-16T19:17:05.225082Z INFO Daemon Using cloud-init for provisioning
2022-09-16T19:17:05.227800Z INFO Daemon Clean protocol and wireserver endpoint
[  OK  ] Started Accounts Service.
2022-09-16T19:17:05.258653Z INFO Daemon Provisioning already completed, skipping.
2022-09-16T19:17:05.261467Z INFO Daemon RDMA capabilities are not enabled, skipping
2022-09-16T19:17:05.273774Z INFO Daemon Determined Agent WALinuxAgent-2.8.0.11 to be the latest agent
[  OK  ] Started containerd container runtime.
         Starting Docker Application Container Engine...
2022-09-16T19:17:05.778261Z INFO ExtHandler ExtHandler The agent will now check for updates and then will process extensions. Output to /dev/console will be suspended during those operations.
[   10.415535] bash[1177]: Unit svagent.service could not be found.
[  OK  ] Started Dispatcher daemon for systemd-networkd.
[   10.497898] bash[1196]: Unit svagent.service could not be found.
[  OK  ] Created slice Slice for Azure VM Extensions.
[  OK  ] Created slice Slice for Az…kWatcherAgentLinux-1.4.2294.2.
[  OK  ] Started /var/lib/waagent/M….4.2294.2/./install.sh enable.
[  OK  ] Finished Configuration file for vCon service.
         Starting Configuration file for vxagent service...
[   13.709311] bash[1867]: VX Agent daemon is not running!
[   13.711711] bash[1867]: appservice daemon is not running!
[   13.713689] bash[1867]: Starting VX Agent daemon...
[   13.715469] bash[1867]: Filter driver kernel module is not loaded. Attempting to load it, please wait...

Ubuntu 20.04.1 LTS coi-docker-dev ttyS0

coi-docker-dev login: [   14.800506] bash[1867]: Filter driver kernel module is not loaded...
[   14.805457] bash[1867]: Filter driver is not loaded. Cannot create /dev/involflt !
[   14.808985] bash[1867]: Running the command:
[   14.810746] bash[1867]: /usr/local/ASR/Vx/bin/appservice
[   14.812818] bash[1867]: Running the command : /usr/local/ASR/Vx/bin/svagents
[   15.045870] bash[1867]: VX Agent daemon is not running!
[   15.047153] bash[1867]: appservice daemon is running...
[   15.068089] bash[2084]: Starting UA Respawn daemon...
[   18.905665] kernel BUG at include/linux/fs.h:3103!
[   18.908255] invalid opcode: 0000 [#1] SMP PTI
[   18.910179] CPU: 2 PID: 2512 Comm: whoami Not tainted 5.13.0-1028-azure #33~20.04.1-Ubuntu
[   18.913732] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
[   18.917854] RIP: 0010:__fput+0x247/0x250
[   18.919502] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 87 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
[   18.927374] RSP: 0018:ffffa351c367be28 EFLAGS: 00010246
[   18.929667] RAX: 0000000000000000 RBX: 00000000480a801d RCX: 0000007700000000
[   18.932752] RDX: 0000000000000076 RSI: ffff95bdbaadc700 RDI: 0000000000000000
[   18.935908] RBP: ffffa351c367be50 R08: 0000000000000077 R09: 0000000000000064
[   18.938985] R10: ffffa351c367be28 R11: ffff95bdbaadc710 R12: ffff95bdbaadc700
[   18.942124] R13: ffff95bd82dce650 R14: ffff95bd808c6de0 R15: ffff95bd82f5e900
[   18.945239] FS:  0000000000000000(0000) GS:ffff95c12fd00000(0000) knlGS:0000000000000000
[   18.948872] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   18.951361] CR2: 00007fffc030d259 CR3: 0000000151be2001 CR4: 00000000003706e0
[   18.954341] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   18.957444] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   18.960573] Call Trace:
[   18.961622]  <TASK>
[   18.962552]  ____fput+0xe/0x10
[   18.963945]  task_work_run+0x6a/0xa0
[   18.965541]  exit_to_user_mode_prepare+0x280/0x290
[   18.967554]  syscall_exit_to_user_mode+0x17/0x40
[   18.969620]  do_syscall_64+0x6e/0xb0
[   18.971298]  ? do_syscall_64+0x6e/0xb0
[   18.972934]  ? irqentry_exit+0x19/0x30
[   18.974445]  ? exc_page_fault+0x83/0x160
[   18.976000]  ? asm_exc_page_fault+0x8/0x30
[   18.977668]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   18.981001] RIP: 0033:0x7fc24ed93100
[   18.982498] Code: Unable to access opcode bytes at RIP 0x7fc24ed930d6.
[   18.985167] RSP: 002b:00007fffc030d080 EFLAGS: 00000200 ORIG_RAX: 000000000000003b
[   18.988467] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   18.991381] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   18.994265] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   18.997125] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   19.000421] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   19.003522]  </TASK>
[   19.004444] Modules linked in: veth xt_nat xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat br_netfilter bridge stp llc ip6table_filter ip6_tables iptable_filter aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_owner iptable_security xt_tcpudp bpfilter joydev kvm_intel hid_generic kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel pata_acpi crypto_simd cryptd hid_hyperv serio_raw hv_balloon hid hyperv_keyboard hyperv_fb hv_utils hv_netvsc sch_fq_codel drm i2c_core ip_tables x_tables autofs4
[   19.028870] ---[ end trace 4f761aeb2bf24c9a ]---
[   19.031397] RIP: 0010:__fput+0x247/0x250
[   19.033646] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 87 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
[   19.042383] RSP: 0018:ffffa351c367be28 EFLAGS: 00010246
[   19.045107] RAX: 0000000000000000 RBX: 00000000480a801d RCX: 0000007700000000
[   19.048706] RDX: 0000000000000076 RSI: ffff95bdbaadc700 RDI: 0000000000000000
[   19.052440] RBP: ffffa351c367be50 R08: 0000000000000077 R09: 0000000000000064
[   19.055924] R10: ffffa351c367be28 R11: ffff95bdbaadc710 R12: ffff95bdbaadc700
[   19.059398] R13: ffff95bd82dce650 R14: ffff95bd808c6de0 R15: ffff95bd82f5e900
[   19.063378] FS:  0000000000000000(0000) GS:ffff95c12fd00000(0000) knlGS:0000000000000000
[   19.068056] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   19.071642] CR2: 00007fc24ed930d6 CR3: 0000000151be2001 CR4: 00000000003706e0
[   19.075474] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   19.079042] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   19.115281] eth0: renamed from veth68fae92
[   19.148826] IPv6: ADDRCONF(NETDEV_CHANGE): veth3f7f483: link becomes ready
[   19.153829] br-42b41f9abe8a: port 4(veth3f7f483) entered blocking state
[   19.157551] br-42b41f9abe8a: port 4(veth3f7f483) entered forwarding state
[   19.245270] BUG: kernel NULL pointer dereference, address: 0000000000000000
[   19.249994] #PF: supervisor read access in kernel mode
[   19.253898] #PF: error_code(0x0000) - not-present page
[   19.257259] PGD 0 P4D 0 
[   19.258932] Oops: 0000 [#2] SMP PTI
[   19.261111] CPU: 3 PID: 2539 Comm: stat Tainted: G      D           5.13.0-1028-azure #33~20.04.1-Ubuntu
[   19.266196] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
[   19.271298] RIP: 0010:__fput+0x140/0x250
[   19.274212] Code: e7 48 c7 c6 00 ef 70 88 e8 1d 25 e1 ff 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8b 7c 24 18 4d 8d 5c 24 10 4c 8b 4f 30 <41> 0f b7 01 66 25 00 f0 66 3d 00 40 74 67 48 3b 7f 18 74 70 b9 01
[   19.284539] RSP: 0018:ffffa351c36dbe48 EFLAGS: 00010246
[   19.288324] RAX: 00000000000a0003 RBX: 00000000000a0003 RCX: 0000000000000000
[   19.292655] RDX: 0000000000000002 RSI: ffff95bdbaadc700 RDI: ffff95bdcb4fb3c0
[   19.296957] RBP: ffffa351c36dbe70 R08: ffff95bd86bfcb40 R09: 0000000000000000
[   19.301324] R10: 0000000000000008 R11: ffff95bdb95d3910 R12: ffff95bdb95d3900
[   19.307743] R13: ffff95bd8164d770 R14: ffff95bd808c67a0 R15: ffff95bdcb4fb3c0
[   19.311599] FS:  0000000000000000(0000) GS:ffff95c12fd80000(0000) knlGS:0000000000000000
[   19.316090] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   19.319520] CR2: 0000000000000000 CR3: 000000013c7b4004 CR4: 00000000003706e0
[   19.323911] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   19.328065] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   19.332313] Call Trace:
[   19.334557]  <TASK>
[   19.336777]  ____fput+0xe/0x10
[   19.339281]  task_work_run+0x6a/0xa0
[   19.341956]  exit_to_user_mode_prepare+0x280/0x290
[   19.345035]  syscall_exit_to_user_mode+0x17/0x40
[   19.347817]  do_syscall_64+0x6e/0xb0
[   19.350447]  ? irqentry_exit+0x19/0x30
[   19.353034]  ? exc_page_fault+0x83/0x160
[   19.355516]  ? asm_exc_page_fault+0x8/0x30
[   19.358111]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   19.360806] RIP: 0033:0x7f7533f6a100
[   19.362860] Code: Unable to access opcode bytes at RIP 0x7f7533f6a0d6.
[   19.366205] RSP: 002b:00007ffe7a1e7840 EFLAGS: 00000200 ORIG_RAX: 000000000000003b
[   19.369937] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   19.373596] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   19.377091] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   19.380467] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   19.384394] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   19.387872]  </TASK>
[   19.389569] Modules linked in: veth xt_nat xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat br_netfilter bridge stp llc ip6table_filter ip6_tables iptable_filter aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_owner iptable_security xt_tcpudp bpfilter joydev kvm_intel hid_generic kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel pata_acpi crypto_simd cryptd hid_hyperv serio_raw hv_balloon hid hyperv_keyboard hyperv_fb hv_utils hv_netvsc sch_fq_codel drm i2c_core ip_tables x_tables autofs4
[   19.417088] CR2: 0000000000000000
[   19.419267] ---[ end trace 4f761aeb2bf24c9b ]---
[   19.421779] RIP: 0010:__fput+0x247/0x250
[   19.424259] Code: 00 48 85 ff 0f 84 8b fe ff ff f6 c7 40 0f 85 82 fe ff ff e8 ab 38 00 00 e9 78 fe ff ff 4c 89 f7 e8 2e 87 02 00 e9 b5 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 31 db 48
[   19.433940] RSP: 0018:ffffa351c367be28 EFLAGS: 00010246
[   19.436833] RAX: 0000000000000000 RBX: 00000000480a801d RCX: 0000007700000000
[   19.440416] RDX: 0000000000000076 RSI: ffff95bdbaadc700 RDI: 0000000000000000
[   19.443932] RBP: ffffa351c367be50 R08: 0000000000000077 R09: 0000000000000064
[   19.447911] R10: ffffa351c367be28 R11: ffff95bdbaadc710 R12: ffff95bdbaadc700
[   19.451484] R13: ffff95bd82dce650 R14: ffff95bd808c6de0 R15: ffff95bd82f5e900
[   19.455605] FS:  0000000000000000(0000) GS:ffff95c12fd80000(0000) knlGS:0000000000000000
[   19.459706] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   19.463209] CR2: 00007f7533f6a0d6 CR3: 000000013c7b4004 CR4: 00000000003706e0
[   19.467250] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   19.470990] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   51.593567] hv_balloon: Max. dynamic memory size: 16384 MB
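
In case it helps anyone triaging a similar console log, here is a minimal Python sketch that pulls out each kernel fault header together with the process and kernel function it names. The function name `find_kernel_faults` and the dictionary keys are my own, hypothetical choices. The regular expressions assume the `[ seconds ]` prefix and the `CPU: ... PID: ... Comm: ...` and `RIP: 0010:...` line shapes seen in the log above, so other kernels may need different patterns. The sample is a handful of lines copied from that log.

```python
import re

# Kernel messages in the console log carry a "[   18.905665] " style prefix.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# The line that names the faulting process, e.g. "CPU: 2 PID: 2512 Comm: whoami".
TASK_RE = re.compile(r"CPU: \d+ PID: (\d+) Comm: (\S+)")
# Kernel-mode instruction pointer, e.g. "RIP: 0010:__fput+0x247/0x250".
RIP_RE = re.compile(r"RIP: 0010:(\S+)")


def find_kernel_faults(log_text):
    """Return one dict per 'kernel BUG at' / 'BUG:' header found in log_text."""
    faults = []
    current = None
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # systemd "[  OK  ]" lines and agent output are skipped
        timestamp, message = float(match.group(1)), match.group(2)
        if message.startswith("kernel BUG at") or message.startswith("BUG:"):
            current = {"time": timestamp, "what": message,
                       "pid": None, "comm": None, "rip": None}
            faults.append(current)
            continue
        if current is None:
            continue
        task = TASK_RE.search(message)
        if task and current["pid"] is None:
            current["pid"] = int(task.group(1))
            current["comm"] = task.group(2)
        rip = RIP_RE.search(message)
        if rip and current["rip"] is None:
            # Keep only the first RIP after the header; the register dump
            # repeated after "end trace" is ignored.
            current["rip"] = rip.group(1)
    return faults


# A few lines copied from the console log above.
SAMPLE = """\
[   18.905665] kernel BUG at include/linux/fs.h:3103!
[   18.908255] invalid opcode: 0000 [#1] SMP PTI
[   18.910179] CPU: 2 PID: 2512 Comm: whoami Not tainted 5.13.0-1028-azure #33~20.04.1-Ubuntu
[   18.917854] RIP: 0010:__fput+0x247/0x250
[   19.245270] BUG: kernel NULL pointer dereference, address: 0000000000000000
[   19.261111] CPU: 3 PID: 2539 Comm: stat Tainted: G      D           5.13.0-1028-azure #33~20.04.1-Ubuntu
[   19.271298] RIP: 0010:__fput+0x140/0x250
"""

for fault in find_kernel_faults(SAMPLE):
    print(fault["time"], fault["comm"], fault["pid"], fault["rip"], "|", fault["what"])
```

Run against the full log, this should list two faults, both inside `__fput`, one triggered by `whoami` and one by `stat`. That at least shows the crash is a kernel BUG and a NULL pointer dereference rather than a soft lockup or an out-of-memory kill.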

More information: https://www.suse.com/support/kb/doc/?id=000018705
