How to remove a specific pattern from a string using awk or perl?

If I have multiple entries like the one below, how can I remove only the [gene=xyzI] pattern?

>lcl|NZ_CP018664.1_gene_628 [gene=mscL] [locus_tag=AUO97_RS03160] [location=complement(694895..695326)]

I want my output to be:

>lcl|NZ_CP018664.1_gene_628 [locus_tag=AUO97_RS03160] [location=complement(694895..695326)]

Answer 1

For a simple substitution, sed is enough:

sed -E 's/\[gene=[a-z]{3}[A-Z]\] *//' file

Output:

>lcl|NZ_CP018664.1_gene_628 [locus_tag=AUO97_RS03160] [location=complement(694895..695326)]

To modify the file in place, add the -i option: sed -i ....

Answer 2

With GNU awk (gensub is a gawk extension), removing the second whitespace-separated field:

$ echo '>lcl|NZ_CP018664.1_gene_628 [gene=mscL] [locus_tag=AUO97_RS03160] [location=complement(694895..695326)]'  | awk '{$0=gensub(/\s*\S+/,"",2)}1'
>lcl|NZ_CP018664.1_gene_628 [locus_tag=AUO97_RS03160] [location=complement(694895..695326)]
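
The gensub call above drops the second whitespace-separated field whatever it contains, so it relies on [gene=...] always being in that position. A pattern-based sketch using the standard sub() function, which should not need GNU awk, could look like the following. The second entry and the sequence lines in the sample are made up for illustration, and the /^>/ guard assumes that only header lines start with ">":

```shell
# Hypothetical two-entry sample: the second header and both sequence
# lines are invented here to show how untagged lines are handled.
sample='>lcl|NZ_CP018664.1_gene_628 [gene=mscL] [locus_tag=AUO97_RS03160] [location=complement(694895..695326)]
ATGAGC
>example_entry_2 [locus_tag=EXAMPLE_0002]
ATGTTT'

# On header lines only, delete "[gene=...]" plus any trailing spaces.
# Lines without a gene tag pass through unchanged; the final 1 prints every line.
cleaned=$(printf '%s\n' "$sample" | awk '/^>/ { sub(/\[gene=[^]]*\] */, "") } 1')
printf '%s\n' "$cleaned"
```

Because the match is on the tag itself rather than on a field number, this keeps working if the tag appears later in the header or is missing altogether.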

It can also be done with cut, since the tag is the second space-separated field:

$ echo '>lcl|NZ_CP018664.1_gene_628 [gene=mscL] [locus_tag=AUO97_RS03160] [location=complement(694895..695326)]'  | cut -d' ' -f1,3-
>lcl|NZ_CP018664.1_gene_628 [locus_tag=AUO97_RS03160] [location=complement(694895..695326)]
