AWS EC2 Ubuntu crashes


I am using AWS EC2 with Ubuntu 22. After running sudo apt-get update and sudo apt-get upgrade, I restarted the machine with sudo reboot.

Since then, my server crashes every 2-3 hours. The only fix is to restart it from the console.aws.amazon.com website.

The log says:

[   30.811643] cloud-init[293]: Cloud-init v. 23.4.4-0ubuntu0~22.04.1 running 'init-local' at Fri, 08 Mar 2024 03:15:31 +0000. Up 30.76 seconds.
[   31.932494] ================================================================================
[   31.940631] UBSAN: array-index-out-of-bounds in /build/linux-aws-6.5-4tw9h1/linux-aws-6.5-6.5.0/drivers/net/xen-netfront.c:1323:3
[   31.949710] index 1 is out of range for type 'xen_netif_rx_sring_entry [1]'
[   31.965154] ================================================================================
[   32.126461] ================================================================================
[   32.134793] UBSAN: array-index-out-of-bounds in /build/linux-aws-6.5-4tw9h1/linux-aws-6.5-6.5.0/drivers/net/xen-netfront.c:502:7
[   32.145800] index 1 is out of range for type 'xen_netif_tx_sring_entry [1]'
[   32.152066] ================================================================================
[   32.159001] ================================================================================
[   32.167055] UBSAN: array-index-out-of-bounds in /build/linux-aws-6.5-4tw9h1/linux-aws-6.5-6.5.0/drivers/net/xen-netfront.c:430:4
[   32.176208] index 1 is out of range for type 'xen_netif_tx_sring_entry [1]'
[   32.184067] ================================================================================
[  OK  ] Finished Initial cloud-init job (pre-networking).
[  OK  ] Reached target Preparation for Network.
         Starting Network Configuration...
[  OK  ] Started Network Configuration.
         Starting Wait for Network to be Configured...
         Starting Network Name Resolution...
[  OK  ] Started Network Name Resolution.
[  OK  ] Reached target Network.
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Finished Wait for Network to be Configured.
         Starting Initial cloud-ini… (metadata service crawler)...
[   36.329724] cloud-init[323]: Cloud-init v. 23.4.4-0ubuntu0~22.04.1 running 'init' at Fri, 08 Mar 2024 03:15:37 +0000. Up 36.30 seconds.
[   36.348170] cloud-init[323]: ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
[   36.368656] cloud-init[323]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[   36.384720] cloud-init[323]: ci-info: | Device |  Up  |           Address           |      Mask     | Scope  |     Hw-Address    |
[   36.406316] cloud-init[323]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[   36.428641] cloud-init[323]: ci-info: |  eth0  | True |         172.31.6.227        | 255.255.240.0 | global | 0a:4c:be:e0:20:fe |
[   36.444572] cloud-init[323]: ci-info: |  eth0  | True | fe80::84c:beff:fee0:20fe/64 |       .       |  link  | 0a:4c:be:e0:20:fe |
[   36.464092] cloud-init[323]: ci-info: |   lo   | True |          127.0.0.1          |   255.0.0.0   |  host  |         .         |
[   36.481933] cloud-init[323]: ci-info: |   lo   | True |           ::1/128           |       .       |  host  |         .         |
[   36.490025] cloud-init[323]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[   36.500428] cloud-init[323]: ci-info: +++++++++++++++++++++++++++++Route IPv4 info++++++++++++++++++++++++++++++
[   36.512666] cloud-init[323]: ci-info: +-------+-------------+------------+-----------------+-----------+-------+
[   36.528722] cloud-init[323]: ci-info: | Route | Destination |  Gateway   |     Genmask     | Interface | Flags |
[   36.544510] cloud-init[323]: ci-info: +-------+-------------+------------+-----------------+-----------+-------+
[   36.558908] cloud-init[323]: ci-info: |   0   |   0.0.0.0   | 172.31.0.1 |     0.0.0.0     |    eth0   |   UG  |
[   36.567355] cloud-init[323]: ci-info: |   1   |  172.31.0.0 |  0.0.0.0   |  255.255.240.0  |    eth0   |   U   |
[   36.574037] cloud-init[323]: ci-info: |   2   |  172.31.0.1 |  0.0.0.0   | 255.255.255.255 |    eth0   |   UH  |
[   36.581876] cloud-init[323]: ci-info: |   3   |  172.31.0.2 |  0.0.0.0   | 255.255.255.255 |    eth0   |   UH  |
[   36.590398] cloud-init[323]: ci-info: +-------+-------------+------------+-----------------+-----------+-------+
[   36.598906] cloud-init[323]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
[   36.607608] cloud-init[323]: ci-info: +-------+-------------+---------+-----------+-------+
[   36.617366] cloud-init[323]: ci-info: | Route | Destination | Gateway | Interface | Flags |
[   36.625431] cloud-init[323]: ci-info: +-------+-------------+---------+-----------+-------+
[   36.634057] cloud-init[323]: ci-info: |   1   |  fe80::/64  |    ::   |    eth0   |   U   |
[   36.645677] cloud-init[323]: ci-info: |   3   |    local    |    ::   |    eth0   |   U   |
[   36.655523] cloud-init[323]: ci-info: |   4   |  multicast  |    ::   |    eth0   |   U   |
[   36.664180] cloud-init[323]: ci-info: +-------+-------------+---------+-----------+-------+
[  OK  ] Finished Initial cloud-ini…ob (metadata service crawler).
[  OK  ] Reached target Cloud-config availability.
[  OK  ] Reached target Network is Online.
[  OK  ] Reached target System Initialization.
[  OK  ] Started ACPI Events Check.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Started Ubuntu Advantage Timer for running repeated jobs.
[  OK  ] Started Download data for …ailed at package install time.
[  OK  ] Reached target Path Units.
[  OK  ] Listening on ACPID Listen Socket.
[  OK  ] Listening on cloud-init hotplug hook socket.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on Open-iSCSI iscsid Socket.
[  OK  ] Listening on Socket unix for snap application lxd.daemon.
[  OK  ] Listening on Socket unix f…p application lxd.user-daemon.
         Starting Socket activation for snappy daemon...
[  OK  ] Listening on UUID daemon activation socket.
[  OK  ] Reached target Preparation for Remote File Systems.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Finished Availability of block devices.
[  OK  ] Listening on Socket activation for snappy daemon.
[  OK  ] Reached target Socket Units.
[  OK  ] Reached target Basic System.
[  OK  ] Started ACPI event daemon.
         Starting The Apache HTTP Server...
         Starting LSB: automatic crash report generation...
         Starting chrony, an NTP client/server...
[  OK  ] Started Regular background program processing daemon.
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Save initial kernel messages after boot.
         Starting Remove Stale Onli…t4 Metadata Check Snapshots...
         Starting EC2 Instance Connect Host Key Harvesting...
         Starting Record successful boot for GRUB...
[  OK  ] Started irqbalance daemon.
         Starting MySQL Community Server...
         Starting Dispatcher daemon for systemd-networkd...
         Starting System Logging Service...
[  OK  ] Started Service for snap a…on-ssm-agent.amazon-ssm-agent.
         Starting Service for snap application lxd.activate...
[  OK  ] Started Userspace listener for prompt events.
[  OK  ] Reached target Preparation for Logins.
         Starting Snap Daemon...
         Starting User Login Management...
         Starting Permit User Sessions...
[  OK  ] Started System Logging Service.
         Starting Hostname Service...
[  OK  ] Finished Permit User Sessions.
         Starting Hold until boot process finishes up...
         Starting Terminate Plymouth Boot Screen...
[  OK  ] Finished Hold until boot process finishes up.
[  OK  ] Finished Terminate Plymouth Boot Screen.
[  OK  ] Started Serial Getty on ttyS0.
         Starting Set console scheme...
[  OK  ] Finished Record successful boot for GRUB.
         Starting GRUB failed boot detection...
[  OK  ] Started LSB: automatic crash report generation.
[  OK  ] Finished Set console scheme.
[  OK  ] Created slice Slice /system/getty.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Hostname Service.
         Starting Authorization Manager...
[  OK  ] Finished GRUB failed boot detection.
[  OK  ] Started User Login Management.
[  OK  ] Started Unattended Upgrades Shutdown.
[  OK  ] Started chrony, an NTP client/server.
[  OK  ] Reached target System Time Synchronized.
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Started Run certbot twice daily.
[  OK  ] Started Daily dpkg database backup timer.
[  OK  ] Started Periodic ext4 Onli…ata Check for All Filesystems.
[  OK  ] Started Discard unused blocks once a week.
[  OK  ] Started Daily rotation of log files.
[  OK  ] Started Daily man-db regeneration.
[  OK  ] Started Message of the Day.
[  OK  ] Started Clean PHP session files every 30 mins.
[  OK  ] Started Check to see wheth…w version of Ubuntu available.
[  OK  ] Reached target Timer Units.
         Starting Clean php session files...
[  OK  ] Started Authorization Manager.
[   42.292707] ================================================================================
[   42.298815] UBSAN: array-index-out-of-bounds in /build/linux-aws-6.5-4tw9h1/linux-aws-6.5-6.5.0/drivers/net/xen-netfront.c:824:4
[   42.309057] index 136 is out of range for type 'xen_netif_tx_sring_entry [1]'
[   42.318190] ================================================================================
[  OK  ] Finished EC2 Instance Connect Host Key Harvesting.
         Starting OpenBSD Secure Shell server...



Ubuntu 22.04.4 LTS ip-172-31-6-227 ttyS0

ip-172-31-6-227 login: [   57.597024] cloud-init[724]: Cloud-init v. 23.4.4-0ubuntu0~22.04.1 running 'modules:config' at Fri, 08 Mar 2024 03:15:58 +0000. Up 57.54 seconds.
[   74.431229] cloud-init[826]: Cloud-init v. 23.4.4-0ubuntu0~22.04.1 running 'modules:final' at Fri, 08 Mar 2024 03:16:15 +0000. Up 74.37 seconds.
[   74.580364] cloud-init[826]: Cloud-init v. 23.4.4-0ubuntu0~22.04.1 finished at Fri, 08 Mar 2024 03:16:15 +0000. Datasource DataSourceEc2Local.  Up 74.57 seconds

Edit: March 9

After adding CloudWatch, I received the following notification:

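Ensure that the instance role has the appropriate SSM permissions, that the security group allows outbound traffic, and that the instance can communicate with SSM through a VPC endpoint or the internet gateway of the instance's VPC.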

I added the policy as suggested, but a few hours later I received another notification that said:

Log group APPLICATION-ApplicationInsights-Will-Application-Insight has no log path set for component arn:aws:ec2:ap-south-1:50882112742:instance/i-0f46de45f500d9d6.

I still don't know what is causing this.

Answer 1

I suspect this is a bug in the Linux Xen driver. It was fixed a few months ago, but the fix has not yet landed in Ubuntu. There is a bug report on Launchpad tracking this issue.

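I am running Ubuntu 22.04 (jammy, LTS) and I see very similar UBSAN errors in my logs, although fortunately they do not crash my server. As far as I can tell from the code, the latest Ubuntu (23.10/mantic) is affected as well.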
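If you want to check whether your own console log shows the same reports, a rough Python sketch like the one below may help. It is only an illustration: the function name `summarize_ubsan` and the regular expression are my own assumptions, based on the "UBSAN: ... in path:line:column" lines quoted in the question, and are not part of any AWS or Ubuntu tool. It collapses whitespace first, because copied console output is often hard-wrapped.

```python
import re
import sys
from collections import Counter

# Hypothetical pattern, modeled on the report lines quoted in the question:
#   UBSAN: <kind> in <source path>:<line>:<column>
UBSAN_RE = re.compile(
    r"UBSAN: (?P<kind>[\w-]+) in (?P<path>\S+):(?P<line>\d+):(?P<col>\d+)"
)


def summarize_ubsan(log_text):
    """Count UBSAN reports per (kind, source file name, line number)."""
    # Copied console logs are frequently wrapped mid-entry, so join
    # everything into one whitespace-normalized string before matching.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in UBSAN_RE.finditer(flat):
        # Keep only the file name; the build directory prefix varies per kernel build.
        filename = match.group("path").rsplit("/", 1)[-1]
        counts[(match.group("kind"), filename, int(match.group("line")))] += 1
    return counts


if __name__ == "__main__":
    # Usage: python ubsan_summary.py console.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for (kind, filename, line), n in sorted(summarize_ubsan(handle.read()).items()):
            print(f"{n}x {kind} at {filename}:{line}")
```

If every report points at xen-netfront.c, you are most likely looking at the same driver issue described above. Note that this only counts the warnings; it does not tell you whether they are what actually brings the instance down.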
