我知道 SQL Server 不会释放内存,除非操作系统需要。那么,监控可用字节(空闲内存)并不是监控服务的最佳方式。还有什么其他变量可以让我真正衡量 SQL Server 的行为?也许是每秒页面数或每秒页面错误数?我使用 nagios 来监控服务,有时会因为执行了大型查询而发出警报。
答案1
您可以使用 Perfmon 计数器(尤其是 Total Server Memory)获取 SQL 的正确内存使用情况。使用 Nagios 时,您可以使用NS客户端++并使用 nrpe 轮询这些计数器。以下是一些 SQL 计数器供您参考(带有愚蠢警告和严重级别):
define service {
service_description SQL Buffer Cache Hit Ratio
use generic-service
hostgroup_name database_servers
check_command check_nrpe_counter!BufferCacheHitRatio!\\SQLServer:Buffer Manager\\Buffer cache hit ratio!100!200
}
define service {
service_description SQL Full Scans Per Sec
use generic-service
hostgroup_name database_servers
check_command check_nrpe_counter!scans!\\SQLServer:Access Methods\\Full Scans/sec!100!200
}
define service {
service_description SQL Latch Wait Time in MS
use generic-service
hostgroup_name database_servers
check_command check_nrpe_counter!ms!\\SQLServer:Latches\\Total Latch Wait Time (ms)!100!200
}
define service {
service_description SQL Batch Req Per Sec
use generic-service
hostgroup_name database_servers
check_command check_nrpe_counter!req!\\SQLServer:SQL Statistics\\Batch Requests/sec!100!200
}
define service {
service_description SQL Re-Compilations Per Sec
use generic-service
hostgroup_name database_servers
check_command check_nrpe_counter!req!\\SQLServer:SQL Statistics\\SQL Re-Compilations/sec!100!200
}
define service {
service_description SQL Total Memory in KB
use generic-service
hostgroup_name database_servers
check_command check_nrpe_counter!mem!\\SQLServer:Memory Manager\\Total Server Memory (KB)!100!200
}
define service {
service_description SQL Memory Grants Pending
use generic-service
hostgroup_name database_servers
check_command check_nrpe_counter!PenMemGrant!\\SQLServer:Memory Manager\\Memory Grants Pending!100!200
}
#define service {
# service_description SQL Mirror Log Remaining For Undo
# use generic-service
# check_command check_nrpe_counter!kb!\\SQLServer:Database Mirroring(StackOverflow)\\Log Remaining for Undo KB!100!200
#}
#define service {
# service_description SQL Mirror Log Send Queue
# use generic-service
# check_command check_nrpe_counter!kb!\\SQLServer:Database Mirroring(StackOverflow)\\Log Remaining for Undo KB!100!200
#}
#define service {
# service_description SQL Mirror Redo Queue
# use generic-service
# check_command check_nrpe_counter!kb!\\SQLServer:Database Mirroring(StackOverflow)\\Log Remaining for Undo KB!100!200
#}
define service {
service_description SQL Number of Connected Users
use generic-service
hostgroup_name database_servers
check_command check_nrpe_counter!UserCon!\\SQLServer:General Statistics\\User Connections!100!200
}
命令定义示例:
define command {
command_name check_nrpe_counter
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c CheckCounter -a "Counter:$ARG1$=$ARG2$" ShowAll MaxWarn=$ARG3$ MaxCrit=$ARG4$
}
最后,您可能会对 Brent Ozar 推荐的 SQL Perfmon 计数器感兴趣:http://www.brentozar.com/archive/2006/12/dba-101-using-perfmon-for-sql-performance-tuning/