Sensu remediation: not triggering on the client

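I am new to Sensu remediation. I am trying to restart a process using a custom script.

Scenario: if my HTTP link goes down, I want to run a script manually and start it.

I tried remediation in Sensu, which lets a script run automatically when a monitored check fails, and I monitor with a custom check script. The problem I am facing is that all the checks and connections are fine, but when my link goes down, Sensu remediation does not trigger on the client. I have posted the logs and configs below; please tell me where I am going wrong.

Here is the Sensu server log: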

{"timestamp":"2016-05-16T09:44:52.768622+0000","level":"info","message":"processing event","event":{"id":"9a9f66c2-e70e-45fb-87fb-c9e9085c8e05","client":{"name":"zubron","address":"10.0.0.110","subscriptions":["zubron"],"version":"0.20.3","timestamp":1463391880},"check":{"command":"/etc/sensu/plugins/check_http -H 10.0.0.110 -p 7077","interval":60,"occurrences":2,"handlers":["remediator"],"subscribers":["zubron"],"standalone":false,"remediation":{"remediate-zubron":{"occurrences":[1,3],"severities":[2]},"trigger_on":["zubron"]},"name":"check-zubron-port","issued":1463391892,"executed":1463391892,"duration":0.002,"output":"connect to address 10.0.0.110 and port 7077: Connection refused\nHTTP CRITICAL - Unable to open TCP socket\n","status":2,"history":["0","0","0","2","2","2","2","2","2","2","2","2","2","2","2","2","2","2","2","2","2"],"total_state_change":4},"occurrences":18,"action":"create","timestamp":1463391892}}
{"timestamp":"2016-05-16T09:44:52.864908+0000","level":"info","message":"handler output","handler":{"command":"/etc/sensu/handlers/sensu.rb","type":"pipe","severities":["critical"],"name":"remediator"},"output":["/etc/sensu/handlers/sensu.rb:108:in `[]': can't convert String into Integer (TypeError)\n","\tfrom /etc/sensu/handlers/sensu.rb:108:in `block in parse_remediations'\n","\tfrom /etc/sensu/handlers/sensu.rb:106:in `each'\n","\tfrom /etc/sensu/handlers/sensu.rb:106:in `parse_remediations'\n","\tfrom /etc/sensu/handlers/sensu.rb:90:in `handle'\n","\tfrom /var/lib/gems/1.9.1/gems/sensu-plugin-1.2.0/lib/sensu-handler.rb:55:in `block in <class:Handler>'\n","REMEDIATION: Evaluating remediation: zubron {\"remediate-zubron\"=>{\"occurrences\"=>[1, 3], \"severities\"=>[2]}, \"trigger_on\"=>[\"zubron\"]} #=18 sev=2\n"]}

Here is my check file on the Sensu server:

{
    "checks": {
        "check-zubron-port": {
            "command": "/etc/sensu/plugins/check_http -H 10.0.0.110 -p 7077",
            "interval": 60,
            "occurrences": 2,
            "handlers": [
                "remediator"
            ],
            "subscribers": [
                "zubron"
            ],
            "standalone": false,
            "remediation": {
                "remediate-zubron": {
                    "occurrences": [
                        1,
                        3
                    ],
                    "severities": [
                        2
                    ]
                },
                "trigger_on": [
                    "zubron"
                ]
            }
        }
    }
}

Here is my remediation file:

{
    "remediate-zubron": {
        "command": "sudo /bin/bash ~/zubron/home/moofwd-zubron-server/bin/start-moofwd.sh",
        "handlers": [],
        "subscribers": [
            "zubron"
        ],
        "standalone": false,
        "publish": false
    }
}

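For the rest, I used the sensu.rb from here (link).

Am I missing something?

Failing that, is there another monitoring system that can run a script or command when something goes wrong?

I have already tried nagios nectar and monit.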

Answer 1

I see an error in your remediation file: it is missing the "checks" key.

It should look like this:

{
    "checks": {
        "remediate-zubron": {
            "command": "sudo /bin/bash ~/zubron/home/moofwd-zubron-server/bin/start-moofwd.sh",
            "handlers": [],
            "subscribers": [
                "zubron"
            ],
            "standalone": false,
            "publish": false
        }
    }
}
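To make the difference between the two files concrete, here is a minimal Ruby sketch that lists the checks a config snippet defines under a top-level "checks" key. The helper name `check_names` is hypothetical, the command is shortened to a placeholder, and the sketch assumes (as this answer does) that a check definition only counts when it is nested under "checks"; it does not reproduce Sensu's own config loader.

```ruby
require 'json'

# Hypothetical helper (not part of Sensu): returns the names of the checks
# defined under the top-level "checks" key, or an empty array if that key
# is missing or is not a JSON object.
def check_names(config_json)
  config = JSON.parse(config_json)
  checks = config['checks']
  checks.is_a?(Hash) ? checks.keys : []
end

# Shape of the remediation file as posted in the question: no "checks" wrapper.
# The command is a placeholder, not the real start script.
broken = '{"remediate-zubron": {"command": "true", "subscribers": ["zubron"], "publish": false}}'

# The same definition wrapped in "checks", as suggested above.
fixed = '{"checks": {"remediate-zubron": {"command": "true", "subscribers": ["zubron"], "publish": false}}}'

puts check_names(broken).inspect  # prints []
puts check_names(fixed).inspect   # prints ["remediate-zubron"]
```

Running a check like this against each file in the config directory is a quick way to spot a definition that sits at the wrong nesting level.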
  1. Another possible problem: your client config should subscribe to its own name, i.e. a client named "university" should list "university" in its subscriptions.

    {
        "client": {
            "name": "university",
            "address": "IP ADDRESS",
            "subscriptions": [
              "linux", "web-server", "system", "university"
            ]
        }
    }
    
  2. Another problem may be the api_request(:POST) call in remediator.rb or sensu.rb; the method should be exactly as given below. Some older code has '/checks/request' instead of '/request', which causes the breakage.

    def trigger_remediation(check, subscribers)
        api_request(:POST, '/request') do |req|
            req.body = JSON.dump('check' => check, 'subscribers' => subscribers)
        end
    end
    
  3. In some cases you need to fix both case 2 and case 3.
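The points above can be tied together with a small sketch of the request that the handler is expected to send. This is only an illustration: `build_remediation_request` is a hypothetical helper that builds the request with Ruby's standard Net::HTTP without sending it, the Content-Type header is an assumption, and the handler's own `api_request` method is not reproduced here. The check name and subscription are taken from the question.

```ruby
require 'json'
require 'net/http'

# Hypothetical helper: builds (but does not send) a POST to '/request'
# carrying the same JSON body as trigger_remediation above.
def build_remediation_request(check, subscribers)
  req = Net::HTTP::Post.new('/request')
  req['Content-Type'] = 'application/json' # assumed header, not from the source
  req.body = JSON.dump('check' => check, 'subscribers' => subscribers)
  req
end

# The remediation check is requested for the "zubron" subscription, so the
# client must list "zubron" in its subscriptions to receive it.
req = build_remediation_request('remediate-zubron', ['zubron'])

puts "#{req.method} #{req.path}"
puts req.body
```

Printing the path and body this way makes it easy to compare against what your copy of the handler sends, in particular whether it targets '/request' or the older '/checks/request'.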

Here is a reference link that should give you a clearer picture of the issue:

https://github.com/sensu/sensu-community-plugins/issues/1162
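Separately, the stack trace in the question's server log points at parse_remediations, which raises a TypeError while iterating. One possible reading, which is an assumption and not something confirmed in this thread, is that the handler treats every entry of the "remediation" hash as a hash of conditions, and that "trigger_on" (an array) nested inside "remediation" is what breaks. The sketch below only reproduces that kind of error in plain Ruby; `entries_that_raise` is a hypothetical helper, not handler code.

```ruby
# Hypothetical helper: returns the names of entries whose value cannot be
# indexed with a string key the way a conditions Hash can.
def entries_that_raise(remediation)
  remediation.each_with_object([]) do |(name, conditions), failing|
    begin
      # Assumption: the handler looks up condition keys like this.
      conditions['severities']
    rescue TypeError
      # Indexing an Array with a String raises TypeError.
      failing << name
    end
  end
end

# The "remediation" hash as it appears in the logged event.
remediation = {
  'remediate-zubron' => { 'occurrences' => [1, 3], 'severities' => [2] },
  'trigger_on' => ['zubron']
}

puts entries_that_raise(remediation).inspect  # prints ["trigger_on"]
```

If that reading is right, "trigger_on" would need to sit beside "remediation" in the check definition rather than inside it, but that is untested here.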
