Suppose I want to run some tests over ssh.
Example:
ssh 18.23.24.2 2>/dev/null "smartctl -a /dev/sdb -q silent"
echo $?
1
In this case we get exit code 1.
How can I tell whether the problem lies with ssh or with the smartctl command?
Answer 1
From the man page:
EXIT STATUS
ssh exits with the exit status of the remote command or with 255 if an error occurred.
If the ssh command itself hits an error, it returns 255; otherwise it returns the exit status of the remote command.
Example:
$ ssh [email protected]
ssh: Could not resolve hostname not.exists: Name or service not known
$ echo $?
255
In your case, 1 is the exit status of the smartctl command, not of ssh.
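The rule from the man page can be sketched as a small shell helper. The function name `classify_ssh_status` is hypothetical, and the sketch assumes the only information available is the status ssh returned; it cannot tell an ssh failure from a remote command that itself exits 255.

```shell
#!/bin/sh
# Hypothetical helper: interpret the exit status of `ssh host command`.
# Per the ssh man page, 255 signals an ssh error; any other value is the
# exit status of the remote command.
classify_ssh_status() {
    case "$1" in
        0)   echo "remote command succeeded" ;;
        255) echo "ssh error (or remote command exited 255)" ;;
        *)   echo "remote command exited with status $1" ;;
    esac
}

# Usage with the command from the question:
#   ssh 18.23.24.2 2>/dev/null "smartctl -a /dev/sdb -q silent"
#   classify_ssh_status "$?"
```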
Edit:
smartctl exit status:
EXIT STATUS
The exit statuses of smartctl are defined by a bitmask. If all is well with the disk, the exit status (return value) of smartctl is 0 (all bits turned off). If a problem occurs, or an error,
potential error, or fault is detected, then a non-zero status is returned. In this case, the eight different bits in the exit status have the following meanings for ATA disks; some of these values
may also be returned for SCSI disks.
Bit 0: Command line did not parse.
Bit 1: Device open failed, device did not return an IDENTIFY DEVICE structure, or device is in a low-power mode (see '-n' option above).
Bit 2: Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure (see '-b' option above).
Bit 3: SMART status check returned "DISK FAILING".
Bit 4: We found prefail Attributes <= threshold.
Bit 5: SMART status check returned "DISK OK" but we found that some (usage or prefail) Attributes have been <= threshold at some time in the past.
Bit 6: The device error log contains records of errors.
Bit 7: The device self-test log contains records of errors. [ATA only] Failed self-tests outdated by a newer successful extended self-test are ignored.
An exit status of 1 from smartctl means that bit 0 is set, since 1 = 2^0, so the command line did not parse.
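Because the status is a bitmask, more than one bit can be set at once. A minimal sketch of decoding it in the shell follows; the function name `decode_smartctl_status` is hypothetical and the bit descriptions are abbreviated from the man page excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: print one line for every bit set in a smartctl
# exit status. Descriptions are shortened from the man page text.
decode_smartctl_status() {
    status=$1
    bit=0
    for meaning in \
        "command line did not parse" \
        "device open failed" \
        "SMART or ATA command failed" \
        "SMART status: DISK FAILING" \
        "prefail attributes <= threshold" \
        "attributes <= threshold in the past" \
        "error log contains errors" \
        "self-test log contains errors"
    do
        # Shift the status right by the bit position and test the low bit.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
        bit=$((bit + 1))
    done
}

decode_smartctl_status 1   # prints "bit 0: command line did not parse"
```

A status of 72 (64 + 8), for instance, would report bits 3 and 6.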
Answer 2
Determining where the value came from is difficult, because many processes use the same values.
% ssh 2>/dev/null localhost 'exit 255' ; echo $?
255
% ssh 2>/dev/null nopelocalhost 'exit 0' ; echo $?
255
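With heuristics you can guess which code came from which program, based on commonly used exit codes; this will mostly be correct, unless the programs involved overlap or something unexpected happens. Standard error may or may not be available, and the exit code may vary depending on how the program exits: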
% ssh localhost ./segfault ; echo $?
255
% ./segfault ; echo $?
zsh: bus error ./segfault
138
%
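So guessing where $? came from is not very reliable. A better option may be to design a protocol that conveys more information than the exit status word offers; examples include Nagios or Ansible, which communicate more than just $? when determining how a remote command ran. This could be as simple as a line of text that depends on how smartctl ran (or segfaulted, or ...), or it could be more complicated, such as a JSON structure with standard output, standard error, the exit status word, and other such metadata. So instead of running smartctl directly, you would invoke a wrapper program that runs smartctl and parses its output, and collect that output on the other end of the ssh connection; if the output is not available, then ssh or the wrapper program had a problem.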
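The "line of text" variant of such a protocol might look like the sketch below. Everything here is an assumption made for illustration: the marker name `WRAPPED_EXIT` and the functions `run_wrapped` and `parse_wrapped` are made up, and a local `sh -c` call stands in for the ssh connection so the sketch runs without a remote host.

```shell
#!/bin/sh
# Hypothetical remote-side wrapper: run the given command, then print a
# marker line (on a line of its own) carrying the command's exit status.
run_wrapped() {
    "$@"
    rc=$?
    printf '\nWRAPPED_EXIT=%s\n' "$rc"
}

# Hypothetical local-side parser: print the status from the last marker
# line in the collected output, or return 1 if no marker is present,
# which suggests that ssh or the wrapper itself had a problem.
parse_wrapped() {
    marker=$(printf '%s\n' "$1" | grep '^WRAPPED_EXIT=[0-9][0-9]*$' | tail -n 1)
    [ -n "$marker" ] || return 1
    echo "${marker#WRAPPED_EXIT=}"
}

# Local stand-in for the remote call.
output=$(run_wrapped sh -c 'echo "some report"; exit 4')
if status=$(parse_wrapped "$output"); then
    echo "remote command exited with status $status"
else
    echo "no status marker: ssh or the wrapper failed"
fi
```

Over ssh, the same marker could be produced without installing anything on the remote side, for example by appending `; printf '\nWRAPPED_EXIT=%s\n' "$?"` to the remote command string. A remote command that happens to print a matching line would confuse this parser, which is one reason a structured format such as JSON may be preferable.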