How do I correctly extract all command synopses from the man pages in /usr/share/man/man1?

I'm trying to extract all the command synopses from the man pages in /usr/share/man/man1:

#!/usr/bin/env bash
## synopses - extract all synopses in /usr/share/man/man1

cd /usr/share/man/man1
for i in *.gz; do
    echo "$i:" | sed -E "s/.1.gz|.gz//g"
    man "./$i" | sed -n '/^SYNOPSIS/,/^[A-Z][A-Z][A-Z]/p' | sed -e '1d; $d' | tr -s [:space:]
done

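...which gives some measure of success: I get complete output for commands from A to z. But I also get many errors on stderr, with both for i in ./*.gz; do man "$i" and for i in *.gz; do man "./$i", when I redirect the output to a file (synopses > file) [1]: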

<standard input>:27: expected `;' after scale-indicator (got `o')
<standard input>:29: expected `;' after scale-indicator (got `o')
<standard input>:283: name expected (got `\{'): treated as missing
<standard input>:674: warning: macro `as',' not defined (possibly missing space after `as')
<standard input>:174: name expected (got `\{'): treated as missing
<standard input>:161: warning [p 1, 5.5i]: can't break line
<standard input>:594: warning [p 5, 3.8i, div `an-div', 0.0i]: can't break line
<standard input>:569: warning [p 6, 0.0i]: can't break line
<standard input>:147: warning [p 1, 1.8i]: can't break line
<standard input>:205: warning [p 2, 0.2i]: can't break line
<standard input>:525: warning [p 5, 4.5i]: can't break line
<standard input>:157: warning [p 1, 4.8i]: can't break line
<standard input>:351: warning [p 3, 1.8i, div `an-div', 0.0i]: can't break line
<standard input>:147: a space character is not allowed in an escape name
man: can't open man1/zshmisc.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshexpn.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshparam.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshoptions.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshbuiltins.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshzle.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshcompwid.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshcompsys.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshcompctl.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshmodules.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshcalsys.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshtcpsys.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshzftpsys.1: No such file or directory
man: -:423: warning: failed .so request
man: can't open man1/zshcontrib.1: No such file or directory
man: -:423: warning: failed .so request
<standard input>:423: can't open `man1/zshmisc.1': No such file or directory
<standard input>:424: can't open `man1/zshexpn.1': No such file or directory
<standard input>:425: can't open `man1/zshparam.1': No such file or directory
<standard input>:426: can't open `man1/zshoptions.1': No such file or directory
<standard input>:427: can't open `man1/zshbuiltins.1': No such file or directory
<standard input>:428: can't open `man1/zshzle.1': No such file or directory
<standard input>:429: can't open `man1/zshcompwid.1': No such file or directory
<standard input>:430: can't open `man1/zshcompsys.1': No such file or directory
<standard input>:431: can't open `man1/zshcompctl.1': No such file or directory
<standard input>:432: can't open `man1/zshmodules.1': No such file or directory
<standard input>:433: can't open `man1/zshcalsys.1': No such file or directory
<standard input>:434: can't open `man1/zshtcpsys.1': No such file or directory
<standard input>:435: can't open `man1/zshzftpsys.1': No such file or directory
<standard input>:436: can't open `man1/zshcontrib.1': No such file or directory

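What are these <standard input> errors (something to do with escapes?), and why does man end up not finding some files? How can I make this more robust and efficient?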


1. The errors on stderr seem to be identical no matter which implementation or solution I use on the same data. That is striking.

Answer 1

You can't just run man foo.gz. It looks like you can run man foo.1.gz, but using -l seems cleaner. From man man:

   -l, --local-file
          Activate `local' mode.  Format and display  local  manual  files
          instead  of  searching  through  the system's manual collection.
          Each manual page argument will be interpreted as an nroff source
          file in the correct format.  No cat file is produced.  If '-' is
          listed as one of the arguments, input will be taken from  stdin.
          When  this  option  is  not used, and man fails to find the page
          required, before displaying the error message,  it  attempts  to
          act as if this option was supplied, using the name as a filename
          and looking for an exact match.

So your script should look like this:

#!/usr/bin/env bash
## synopses - extract all synopses in /usr/share/man/man1

## No need to cd into the directory; a glob with the full path works.
for i in /usr/share/man/man1/*.gz; do
    ## Print the name of the command (the file name without .1.gz).
    basename "${i//.1.gz}"
    man -l "$i" |
       awk '/^SYNOPSIS/{a=1; getline}
            (/^[a-zA-Z0-9_]/ && a==1){a=0}
            (a==1 && /./){print}' | tr -s '[:space:]'

done

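The awk command I gave works better than your approach (test it on man ajc, for example) and also handles multi-line synopses. Most of the errors you saw are irrelevant; the others were caused by the way you handled the file names. Let me know if this works better.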
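To try the section filter without touching the real man pages, it can be isolated in a function and fed canned text. The following is a minimal sketch, not the exact code from the answer: the function names extract_synopsis and sample_page are hypothetical, and the sample page is made up. It assumes the formatted page has an unindented SYNOPSIS heading followed by indented lines.

```shell
#!/usr/bin/env bash
# extract_synopsis (hypothetical name): read formatted man-page text on
# stdin and print the non-empty lines of the SYNOPSIS section, with
# leading blanks removed.
extract_synopsis() {
    awk '
        /^SYNOPSIS/           { in_syn = 1; next }   # start after the heading
        in_syn && /^[^ \t]/   { in_syn = 0 }         # next unindented heading ends it
        in_syn && NF          { sub(/^[ \t]+/, ""); print }
    '
}

# sample_page (hypothetical): a made-up page used only for demonstration.
sample_page() {
    printf '%s\n' \
        'NAME' \
        '       foo - frobnicate files' \
        '' \
        'SYNOPSIS' \
        '       foo [-a] [-b]' \
        '           file ...' \
        '' \
        'DESCRIPTION' \
        '       Does things.'
}

sample_page | extract_synopsis
```

Inside the loop the same function could take the place of the inline awk, for example man -l "$i" 2>/dev/null | extract_synopsis.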

Answer 2

As for the errors you're getting, they are addressed here:

man man

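MANWIDTH - If $MANWIDTH is set, its value is used as the line length for which manual pages should be formatted. If it is not set, manual pages will be formatted with a line length appropriate to the current terminal (using an ioctl(2) if available, the value of $COLUMNS, or falling back to 80 characters if neither is available). Cat pages will only be saved when the default formatting can be used, that is, when the terminal line length is between 66 and 80 characters.

MAN_KEEP_FORMATTING - Normally, when output is not being directed to a terminal (such as to a file or a pipe), formatting characters are discarded to make it easier to read the result without special tools. However, if $MAN_KEEP_FORMATTING is set to any non-empty value, these formatting characters are retained. This may be useful for wrappers around man that can interpret formatting characters.

MAN_KEEP_STDERR - Normally, when output is being directed to a terminal (usually to a pager), any error output from the command used to produce formatted versions of manual pages is discarded to avoid interfering with the pager's display. Programs such as groff often produce relatively minor error messages about typographical problems such as poor alignment, which are unsightly and generally confusing when displayed along with the manual page. However, some users want to see them anyway, so if $MAN_KEEP_STDERR is set to any non-empty value, error output will be displayed as usual.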
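Since the formatter's warnings go to stderr when the output is a file or a pipe, one way to get reproducible, quiet output is to fix the width and discard stderr. This is a minimal sketch under those assumptions: render_page is a hypothetical helper name, 80 is an arbitrary width, and -l is the local-file option quoted from man man above.

```shell
#!/usr/bin/env bash
# render_page (hypothetical name): format one local man-page file at a
# fixed width and discard the formatter's warnings on stderr.
render_page() {
    MANWIDTH=80 man -l "$1" 2>/dev/null
}

# Render every file named on the command line; with no arguments this
# loop does nothing.
for page in "$@"; do
    render_page "$page"
done
```

Discarding stderr also hides genuine failures, so while debugging it may be more useful to redirect it to a log file instead of /dev/null.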

Now, on how to do the other thing:

I think this does what you want:

for f in /usr/share/man/man1/*gz ; do
    man -P "sed -ne '1,/^[Nn]/d;/^ /{H;b}
    /^[Ss]..[Yy]..[Nn]/{g;:n
    N;/\n\(\n\)[^ ].*/!bn;s//\1/
    s/.\x08//g;s/\(\n\)  */\1/g;
    w /dev/stderr' -ne '};/./q'" -l "$f"
done 2>~/file

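It specifies sed as the PAGER and outputs only the lines following NAME and those following SYNOPSIS, until it encounters any other line that begins with a non-<space>. It prints nothing if the first line after NAME that does not begin with a <space> does not start with [Ss][Yy][Nn]. In every case, it stops reading the file entirely at the second line after NAME that does not begin with a <space>. It strips leading <space>s and all \b (backspace) sequences from the output.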
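The backspace cleanup is the s/.\x08//g step in the script. As a minimal sketch of just that step, the following removes every "character plus backspace" pair, which is how bold and underline are encoded in formatted pages. The name strip_overstrike is hypothetical, and the backspace is produced with printf rather than \x08 on the assumption that not every sed accepts \x escapes.

```shell
#!/usr/bin/env bash
# strip_overstrike (hypothetical name): delete each "char + backspace"
# pair, turning overstruck text such as N\bN into plain N.
strip_overstrike() {
    sed "s/.$(printf '\b')//g"
}

# Bold "NAME" as a formatter would emit it: each letter, backspace, letter.
printf 'N\bNA\bAM\bME\bE\n' | strip_overstrike
```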

I just ran it in a for loop, and it took only about a minute to go through my whole man library.

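man adjusts its output depending on whether it is writing to a terminal or to a pipe/file. So if you send its output to a pipe or file, it drops the pager entirely, which I did not expect. But I tricked it by using sed's w (write) command to write to stderr (>&2) and redirecting that, so man is none the wiser.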

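It should be noted, though, that @terdon's is probably the better way. You can customize this one more easily, because you get a sed per file, and the formatting is a little better because it doesn't try to fit the terminal width; but apparently man doesn't write those \b sequences to a |pipe anyway.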
