Search for a pattern and append the line to another file

Answer 1

You can build on the code you already have. Store the line in an array and match on its fifth element:

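# Requires bash (arrays) and a grep that supports --label, such as GNU grep.
while read -r line; do
    [ -z "$line" ] && continue        # skip empty lines
    read -r -a patlist <<< "$line"    # split into words without glob expansion
    pat=${patlist[4]}                 # fifth field (index 4) is the ID
    # -H prints the input name before each match; --label sets that name to the allKO.txt line
    grep -H --label="$line" -- "$pat" < KEGG.annotations
done < allKO.txt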

This returns:

Metabolism Carbohydrate metabolism Glycolisis K07448:>aai:AARI_33320  mrr; restriction system protein Mrr; K07448 restriction system protein
Metabolism Protein metabolism protesome K02217:>aai:AARI_26600  ferritin-like protein; K02217 ferritin [EC:1.16.3.1]
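
To try this without touching the real data, here is a minimal sketch that runs a variant of the loop above on two throwaway files. The file contents are made up to mirror the example output, the directory comes from mktemp, and the output name combined.txt is only an illustration. It assumes bash and a grep with --label (GNU grep has it), and it splits the line with read -a rather than an unquoted array assignment so that glob characters in the data are not expanded.

```shell
#!/usr/bin/env bash
# Made-up sample data modeled on the example output; not the real KEGG files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

printf '%s\n' \
    'Metabolism Carbohydrate metabolism Glycolisis K07448' \
    '' \
    'Metabolism Protein metabolism protesome K02217' > allKO.txt

printf '%s\n' \
    '>aai:AARI_33320  mrr; restriction system protein Mrr; K07448 restriction system protein' \
    '>aai:AARI_26600  ferritin-like protein; K02217 ferritin [EC:1.16.3.1]' > KEGG.annotations

while read -r line; do
    [ -z "$line" ] && continue          # the empty second line is skipped
    read -r -a patlist <<< "$line"      # split the line into words
    pat=${patlist[4]}                   # fifth word is the ID
    # Each match is prefixed with the allKO.txt line and a colon.
    grep -H --label="$line" -- "$pat" < KEGG.annotations
done < allKO.txt > combined.txt         # hypothetical output file name

cat combined.txt
```

Each output line is the allKO.txt line, a colon, and then the matching annotation line.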

Answer 2

This seems to do what you want:

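while read -r w1 w2 w3 w4 ID              # -r keeps backslashes literal
do
    printf "%s " "$w1 $w2 $w3 $w4 $ID"    # the allKO.txt line, trailing space, no newline
    if ! grep -- "$ID" KEGG.annotations   # matching lines complete the output line
    then
        echo                              # no match: end the line ourselves
    fi
done < allKO.txt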

This writes the output to the screen. To capture it in a file, add an output redirection (>), for example > test1, to the end of the last line.

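Based on your example, the key/ID field (the "pattern") is the fifth of five fields in allKO.txt, so we read w1 w2 w3 w4 ID. You said it is a tab-delimited file; I assume that none of the fields contain spaces.
Write (printf) the line (i.e., the fields) from allKO.txt, with a space at the end but no terminating newline.
Search (grep) the KEGG.annotations file for the ID (the fifth field of the line from allKO.txt). Matches are written as complete lines (including the newline).
If grep fails, write a newline, because printf didn't.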
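
The steps above depend on no field containing a space. If allKO.txt really is tab-delimited and some fields do contain spaces, one possible variant is to split on tabs only. The following is a sketch under that assumption, not part of the original answer: the sample data and the out.txt name are made up, and -F (treat the ID as a fixed string rather than a regular expression) is an addition.

```shell
#!/usr/bin/env bash
# Made-up sample data: five tab-separated fields, two of which contain spaces.
tmpdir=$(mktemp -d)
printf 'Metabolism\tProtein metabolism\tFolding sorting\tproteasome\tK02217\n' > "$tmpdir/allKO.txt"
printf '>aai:AARI_26600  ferritin-like protein; K02217 ferritin [EC:1.16.3.1]\n' > "$tmpdir/KEGG.annotations"

# IFS is set to a tab for this read only, so spaces inside a field are preserved.
while IFS=$'\t' read -r w1 w2 w3 w4 ID
do
    printf '%s ' "$w1 $w2 $w3 $w4 $ID"
    # -F searches for the ID as a literal string.
    if ! grep -F -- "$ID" "$tmpdir/KEGG.annotations"
    then
        echo
    fi
done < "$tmpdir/allKO.txt" > "$tmpdir/out.txt"

cat "$tmpdir/out.txt"
```

One caveat: a tab counts as IFS whitespace in bash, so consecutive tabs are treated as a single separator and empty fields would shift the ID out of the fifth position.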

As a result, lines whose ID does not occur in KEGG.annotations are simply written to the output unchanged:

Metabolism Protein metabolism proteasome K02217  >aai:AARI_26600 ferritin-like protein; K02217 ferritin [EC:1.16.3.1]
This ID doesn’t exist: K99999

and IDs that occur more than once produce additional output lines (without repeating the data from allKO.txt):

Metabolism Protein metabolism proteasome K02217  >aai:AARI_26600 ferritin-like protein; K02217 ferritin [EC:1.16.3.1]
This is a hypothetical additional line from KEGG.annotations that mentions “K02217”.
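
Both cases can be reproduced with a small experiment. This sketch uses made-up sample files (one ID that matches twice, one that matches nothing) and redirects the loop's output to test1 as suggested above; the file contents are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Made-up sample files modeled on the examples above.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

printf '%s\n' \
    'Metabolism Protein metabolism proteasome K02217' \
    'Metabolism Unknown category placeholder K99999' > allKO.txt

printf '%s\n' \
    '>aai:AARI_26600 ferritin-like protein; K02217 ferritin [EC:1.16.3.1]' \
    'A hypothetical additional line that mentions K02217.' > KEGG.annotations

while read -r w1 w2 w3 w4 ID
do
    printf '%s ' "$w1 $w2 $w3 $w4 $ID"
    if ! grep -- "$ID" KEGG.annotations
    then
        echo        # K99999 has no match, so only the allKO.txt line is written
    fi
done < allKO.txt > test1

cat test1
```

test1 ends up with three lines: the K02217 line joined to its first match, the second match on a line of its own, and the K99999 line followed only by the trailing space from printf.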
