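We are running SQL Server 2019 (Linux) CU4 on a server with the following specifications:
- OS: CentOS 8
- Memory: 32 GB
- Disk: 1 TB SSD
- CPU: 8-core i7
- SQL: Microsoft SQL Server 2019 (RTM-CU4) (KB4548597) - 15.0.4033.1 (X64) Standard Edition (64-bit) on Linux (CentOS Linux 8 (Core))
SQL Server becomes unresponsive, apparently after heavy load, and after it crashes the SQLDump file contains the following: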
2020-04-09 02:38:11.48 spid12s AppDomain 3 (mssqlsystemresource.dbo[runtime].2) is marked for unload due to memory pressure.
2020-04-09 02:38:11.48 spid12s AppDomain 3 (mssqlsystemresource.dbo[runtime].2) unloaded.
2020-04-09 10:09:43.60 spid31s AppDomain 2 (master.sys[runtime].1) is marked for unload due to memory pressure.
2020-04-09 10:09:43.61 spid31s AppDomain 2 (master.sys[runtime].1) unloaded.
2020-04-10 11:52:15.26 Backup Database backed up. Database: readyalert, creation date(time): 2020/02/11(08:19:22), pages dumped: 190987, first LSN: 67110:336:1, last LSN: 67110:360:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'/var/opt/mssql/backup/readyalert_prod.BAK'}). This is an informational message only. No user action is required.
2020-04-10 11:52:15.28 Backup BACKUP DATABASE successfully processed 190746 pages in 4.842 seconds (307.765 MB/sec).
2020-04-12 00:00:02.54 spid59 [8]. Feature Status: PVS: 0. CTR: 0. ConcurrentPFSUpdate: 1.
2020-04-12 00:00:12.34 spid59 DBCC CHECKDB (readyalert) executed by NT AUTHORITY\NETWORK SERVICE found 0 errors and repaired 0 errors. Elapsed time: 0 hours 0 minutes 9 seconds. Internal database snapshot has split point LSN = 00010627:00010e78:0001 and first LSN = 00010627:00010e68:0001.
2020-04-15 11:12:58.53 spid59 AppDomain 4 (mssqlsystemresource.dbo[runtime].3) created.
2020-04-15 11:16:28.21 spid88 AppDomain 5 (master.sys[runtime].4) created.
2020-04-15 14:39:02.71 Server Using 'dbghelp.dll' version '4.0.5'
2020-04-15 14:39:02.76 Server ***Unable to get thread context for spid 0
2020-04-15 14:39:02.76 Server * *******************************************************************************
2020-04-15 14:39:02.76 Server *
2020-04-15 14:39:02.76 Server * BEGIN STACK DUMP:
2020-04-15 14:39:02.76 Server * 04/15/20 14:39:02 spid 384
2020-04-15 14:39:02.76 Server *
2020-04-15 14:39:02.76 Server * Non-yielding Scheduler
2020-04-15 14:39:02.76 Server *
2020-04-15 14:39:02.76 Server * *******************************************************************************
2020-04-15 14:39:02.77 Server Stack Signature for the dump is 0x0000000000000338
We tried updating to the latest CU4, but the problem persists. Any help would be appreciated.
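Answer 1
For us, the solution was to disable database synchronization between nodes. Microsoft may have started using transactions to synchronize data, and these could lock against transactions on the primary server, since they may run in separate threads. That would produce a race condition and, eventually, the non-yielding scheduler.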
Latest: 15.0.4043.16-4
2020-07-06 13:04:01.87 Server Using 'dbghelp.dll' version '4.0.5'
2020-07-06 13:04:01.96 Server ***Unable to get thread context for spid 0
2020-07-06 13:04:01.96 Server * *******************************************************************************
2020-07-06 13:04:01.96 Server *
2020-07-06 13:04:01.96 Server * BEGIN STACK DUMP:
2020-07-06 13:04:01.97 Server * 07/06/20 13:04:01 spid 400
2020-07-06 13:04:01.97 Server *
2020-07-06 13:04:01.97 Server * Non-yielding Scheduler
2020-07-06 13:04:01.97 Server *
2020-07-06 13:04:01.97 Server * *******************************************************************************
2020-07-06 13:04:01.98 Server Stack Signature for the dump is 0x0000000000000246
2020-07-06 13:04:55.94 Server DumpCallbackHk::CreateCabFile: at 2206 g_CabDdfFile file is null, normally because no collectors added any files, exiting
2020-07-06 13:04:55.94 Server DumpCallbackHk::CollectFiles: at 594 failed at CreateCabFile with error: 0x80004005
2020-07-06 13:04:57.21 Server External dump process return code 0x20000001.
External dump process returned no errors.
2020-07-06 13:04:57.21 Server Process 0:0:0 (0x4298) Worker 0x0000001D4CE46160 appears to be non-yielding on Scheduler 0. Thread creation time: 13238513480709. Approx Thread CPU Used: kernel 130 ms, user 7180 ms. Process Utilization 16%. System Idle 0%. Interval: 70006 ms.
...
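The "appears to be non-yielding" line above carries enough numbers for a rough sanity check. The following is a minimal sketch, not an official tool: it parses that one line and computes how much CPU the stuck worker used relative to the reported interval. The function name `parse_non_yielding` and the derived field `thread_cpu_pct` are hypothetical names introduced here, and the regular expression assumes the exact wording shown in this log.

```python
import re

# Sample line copied verbatim from the error log above.
LINE = (
    "2020-07-06 13:04:57.21 Server Process 0:0:0 (0x4298) Worker 0x0000001D4CE46160 "
    "appears to be non-yielding on Scheduler 0. Thread creation time: 13238513480709. "
    "Approx Thread CPU Used: kernel 130 ms, user 7180 ms. Process Utilization 16%. "
    "System Idle 0%. Interval: 70006 ms."
)

# Assumption: the message keeps this wording; other builds may phrase it differently.
PATTERN = re.compile(
    r"non-yielding on Scheduler (?P<scheduler>\d+)\..*?"
    r"kernel (?P<kernel>\d+) ms, user (?P<user>\d+) ms\..*?"
    r"Process Utilization (?P<process_util>\d+)%\..*?"
    r"System Idle (?P<system_idle>\d+)%\..*?"
    r"Interval: (?P<interval>\d+) ms"
)


def parse_non_yielding(line):
    """Return the numeric fields of a non-yielding message, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = {name: int(value) for name, value in match.groupdict().items()}
    # Hypothetical derived metric: thread CPU time as a share of the interval.
    cpu_ms = fields["kernel"] + fields["user"]
    fields["thread_cpu_pct"] = 100.0 * cpu_ms / fields["interval"]
    return fields


if __name__ == "__main__":
    info = parse_non_yielding(LINE)
    print(info)
```

For this log the worker used roughly 7.3 s of CPU over a 70 s interval. A common reading, offered here only as a hypothesis, is that a low ratio points to a thread that is mostly waiting or blocked rather than spinning, which would be consistent with the locking explanation in the answer but does not prove it.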