munin-run mysql_插件返回数据但 munin 服务器出现“错误退出”错误

munin-run mysql_插件返回数据但 munin 服务器出现“错误退出”错误

我有一个 Munin 服务器,上面有几个 Munin 节点,所有节点都使用 Ansible 配置,因此配置基本相同。系统是 Debian Jessie。

其中两台服务器的 mysql_* 图表存在但为空,而其他三台服务器则有完整的图表。

munin-run --debug mysql_commands在其中一个故障服务器中运行返回正确的输出:

# munin-run --debug mysql_commands
# Processing plugin configuration from /etc/munin/plugin-conf.d/ansible.conf
# Processing plugin configuration from /etc/munin/plugin-conf.d/munin-node
# Setting /rgid/ruid/ to /119/0/
# Setting /egid/euid/ to /119 119/0/
# Setting up environment
# Environment mysqluser = debian-sys-maint
# Environment mysqlconnection = DBI:mysql:mysql;mysql_read_default_file=/etc/mysql/debian.cnf
# Environment mysqlopts = --defaults-file=/etc/mysql/debian.cnf
# About to run '/etc/munin/plugins/mysql_commands'
Com_delete.value 4546376
Com_insert.value 2804559
Com_insert_select.value 341479
Com_load.value 0
Com_replace.value 0
Com_replace_select.value 0
Com_select.value 236967004
Com_update.value 7069348
Com_update_multi.value 0

看来 Munin 节点运行正常。

但是当我munin-cron在 munin 服务器中执行时,插件bad exit在获取时返回:

# sudo -u munin munin-cron --host server.com --debug
...
2018/01/09 08:35:01 [DEBUG] for my mysql_commands (irqstats mysql_innodb_tnx diskstats apache_accesses smart_sda df ntp_offset vmstat fw_packets mysql_select_types netstat users mysql_qcache mysql_table_locks swap mysql_slow mysql_myisam_indexes if_err_eth0 apt hddtemp_smartctl ntp_kernel_err ntp_kernel_pll_freq mysql_innodb_rows mysql_innodb_bpool forks ntp_kernel_pll_off cpu load postfix_mailvolume postfix_mailqueue mysql_qcache_mem open_inodes processes http_loadtime open_files uptime mysql_commands mysql_innodb_semaphores apt_all mysql_innodb_io smart_sdb memory mysql_connections mysql_files_tables mysql_network_traffic df_inode mysql_sorts if_eth0 entropy proc_pri mysql_innodb_bpool_act apache_processes threads mysql_innodb_log mysql_innodb_insert_buf mysql_tmp_tables mysql_innodb_io_pend interrupts apache_volume)
2018/01/09 08:35:01 [DEBUG] Fetching service configuration for 'mysql_commands'
2018/01/09 08:35:01 [DEBUG] Writing to socket: "config mysql_commands
".
2018/01/09 08:35:02 [DEBUG] Reading from socket: "graph_vlabel Commands per ${graph_period}\ngraph_total Questions\ngraph_args --base 1000\ngraph_title Command Counters\ngraph_category mysql2\nCom_delete.type DERIVE\nCom_delete.label Delete\nCom_delete.draw STACK\nCom_delete.min 0\nCom_insert.draw STACK\nCom_insert.min 0\nCom_insert.type DERIVE\nCom_insert.label Insert\nCom_insert_select.type DERIVE\nCom_insert_select.label Insert select\nCom_insert_select.draw STACK\nCom_insert_select.min 0\nCom_load.label Load Data\nCom_load.type DERIVE\nCom_load.min 0\nCom_load.draw STACK\nCom_replace.type DERIVE\nCom_replace.label Replace\nCom_replace.draw STACK\nCom_replace.min 0\nCom_replace_select.draw STACK\nCom_replace_select.min 0\nCom_replace_select.type DERIVE\nCom_replace_select.label Replace select\nCom_select.type DERIVE\nCom_select.label Select\nCom_select.draw STACK\nCom_select.min 0\nCom_update.type DERIVE\nCom_update.label Update\nCom_update.draw STACK\nCom_update.min 0\nCom_update_multi.label Update multi\nCom_update_multi.type DERIVE\nCom_update_multi.min 0\nCom_update_multi.draw STACK".
2018/01/09 08:35:02 [DEBUG] config: 0.151266 sec for 'mysql_commands' on server.com/1.2.3.4/4949
2018/01/09 08:35:02 [DEBUG] Now parsing config output from plugin mysql_commands on server.com
2018/01/09 08:35:02 [DEBUG] update_rate 0 for mysql_commands on server.com/1.2.3.4:4949
2018/01/09 08:35:02 [DEBUG] No service data for mysql_commands, fetching it
2018/01/09 08:35:02 [DEBUG] Writing to socket: "fetch mysql_commands
".
2018/01/09 08:35:02 [DEBUG] data: 0.175058 sec for 'mysql_commands' on server.com/1.2.3.4/4949
2018/01/09 08:35:02 [DEBUG] Now parsing fetch output from plugin mysql_commands on server.com/1.2.3.4:4949
2018/01/09 08:35:02 [FETCH from mysql_commands] # Bad exit
...

两个服务器中的所有 mysql_* 插件都出现了这种情况,但在其他三个服务器上它们工作正常。其他 Munin 插件在所有服务器上工作正常,因此一般配置似乎没问题。

mysql_* 插件的配置:

[mysql*]
user root
env.mysqlopts --defaults-file=/etc/mysql/debian.cnf
env.mysqluser debian-sys-maint
env.mysqlconnection DBI:mysql:mysql;mysql_read_default_file=/etc/mysql/debian.cnf

执行/var/log/munin/munin-node.log时,不会向 Munin 节点日志 () 和 MySQL添加任何新条目。sudo -u munin munin-cron

Munin节点服务报告正常运行:

# systemctl status munin-node
● munin-node.service - Munin Node
   Loaded: loaded (/lib/systemd/system/munin-node.service; enabled)
   Active: active (running) since mié 2018-01-17 14:20:10 CET; 19s ago
     Docs: man:munin-node(1)
           http://munin.readthedocs.org/en/stable-2.0/reference/munin-node.html
  Process: 4515 ExecStart=/usr/sbin/munin-node $DAEMON_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 4541 (munin-node)
   CGroup: /system.slice/munin-node.service
           └─4541 /usr/bin/perl -wT /usr/sbin/munin-node

ene 17 14:20:10 example.com systemd[1]: Started Munin Node.
  • Munin 版本(节点和服务器):Debian 软件包 2.0.25-1+deb8u3。

  • 数据库:mariadb-server Debian 软件包 10.0.26-0+deb8u1。

这些节点可能存在什么问题?或者我该如何调试该问题?

答案1

我真丢脸。

两个 munin-node 配置了相同的 IP,因此一个服务器从未被联系过,而另一个服务器被要求提供两次统计数据。这个双重请求导致 mysql_ plugin 表现不佳。

这个问题解释了我发现的但在这个问题之后排队等待解决的其他小问题(例如,奇怪的磁盘使用情况)。

从 master 调试插件的方式很有趣:

$ nc example.com 4949
# munin node at example.com
fetch mysql_commands
Com_delete.value 5481602
Com_insert.value 3468782
Com_insert_select.value 437696
Com_load.value 0
Com_replace.value 0
Com_replace_select.value 0
Com_select.value 295884041
Com_update.value 8498783
Com_update_multi.value 6
.

调试 Munin 插件

相关内容