awk: problem splitting on spaces

I can't get awk to split a line only at the first space.

$ grep ">" Supplemental_dataset_07_NbE_CDS.fasta | awk 'BEGIN { FS = "\t" } {print $1}' | head
>NbD053290.1 Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2 GDSL esterase/lipase At2g38180-like  (XP_016505556.1)
>NbD053288.1 SUMO-conjugating enzyme SCE1  (XP_019223445.1)
>NbD053287.1 bifunctional epoxide hydrolase 2-like  (XP_016470817.1)
>NbD053286.1 uncharacterized protein LOC109221334 isoform X1  (XP_019241352.1)
>NbD053285.2 uncharacterized protein LOC107817905  (XP_016499316.1)
>NbD053284.3 cell division cycle protein 123 homolog  (XP_019248046.1)
>NbD053283.1 Partial, probable rhamnogalacturonate lyase B  (XP_009789094.1)
>NbD053282.1 aluminum-activated malate transporter 2-like  (XP_009760052.1)
>NbD053281.1 Partial, uncharacterized protein LOC107803999  (XP_016483291.1)

Unfortunately, the following command drops part of the description:

grep ">" Supplemental_dataset_07_NbE_CDS.fasta | awk 'BEGIN { FS = " " } {print $1","$2}' | head

>NbD053290.1,Partial,
>NbD053289.1,GDSL
>NbD053288.1,SUMO-conjugating
>NbD053287.1,bifunctional
>NbD053286.1,uncharacterized
>NbD053285.1,uncharacterized
>NbD053284.1,cell
>NbD053283.1,Partial,
>NbD053282.1,aluminum-activated
>NbD053281.1,Partial,

How can I modify the command above to produce this output:

>NbD053290.1,Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2,GDSL esterase/lipase At2g38180-like  (XP_016505556.1)

Thanks in advance.

Answer 1

This replaces your entire grep | awk | head pipeline:

awk '/>/{sub(/ /,","); print; if (++c == 10) exit}' Supplemental_dataset_07_NbE_CDS.fasta
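This works because sub() replaces only the first match in the record, so only the first space becomes a comma and the rest of the description is left untouched. A minimal sketch to check that behaviour without the FASTA file, feeding two header lines from the question through printf (the function name first_space_to_comma and the ATGGCT sequence line are made up for illustration):

```shell
# Hypothetical helper: on lines containing ">", turn only the first
# space into a comma and print the line; all other lines are skipped.
first_space_to_comma() {
    awk '/>/ { sub(/ /, ","); print }'
}

printf '%s\n' \
    '>NbD053290.1 Partial, glutelin type-B 2-like  (XP_016462855.1)' \
    'ATGGCT' \
    '>NbD053289.2 GDSL esterase/lipase At2g38180-like  (XP_016505556.1)' |
    first_space_to_comma
```

The sequence line is dropped because it contains no ">", which is what the original grep ">" step did.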

Answer 2

$ awk -F" " '{ sub(" ",","); print; }' input
>NbD053290.1,Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2,GDSL esterase/lipase At2g38180-like  (XP_016505556.1)
>NbD053288.1,SUMO-conjugating enzyme SCE1  (XP_019223445.1)
>NbD053287.1,bifunctional epoxide hydrolase 2-like  (XP_016470817.1)
>NbD053286.1,uncharacterized protein LOC109221334 isoform X1  (XP_019241352.1)
>NbD053285.2,uncharacterized protein LOC107817905  (XP_016499316.1)
>NbD053284.3,cell division cycle protein 123 homolog  (XP_019248046.1)
>NbD053283.1,Partial, probable rhamnogalacturonate lyase B  (XP_009789094.1)
>NbD053282.1,aluminum-activated malate transporter 2-like  (XP_009760052.1)
>NbD053281.1,Partial, uncharacterized protein LOC107803999  (XP_016483291.1)
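For context, the attempt in the question lost text because with FS = " " awk splits on runs of whitespace and print $1","$2 emits only the first two fields. If you would rather not modify the record with sub(), one possible alternative is to cut the line at the first space with index() and substr(). This is only a sketch: the function name split_first_space is hypothetical, and it assumes header lines start with ">".

```shell
# Sketch: split each header at the first space with index()/substr(),
# leaving $0 itself unmodified.
split_first_space() {
    awk '/^>/ {
        i = index($0, " ")            # position of the first space, 0 if none
        if (i == 0) { print; next }   # no description: print the header as-is
        print substr($0, 1, i - 1) "," substr($0, i + 1)
    }'
}

printf '%s\n' '>NbD053288.1 SUMO-conjugating enzyme SCE1  (XP_019223445.1)' |
    split_first_space
```

Everything after the first space, including the double space before the accession in parentheses, is passed through unchanged.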

Answer 3

Using Raku (formerly known as Perl 6)

$ raku -pe 's/\s/,/;'
>NbD053290.1,Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2,GDSL esterase/lipase At2g38180-like  (XP_016505556.1)

Or

$ raku -pe 's:5th/<.ws>/,/;' glutelin.txt
>NbD053290.1,Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2,GDSL esterase/lipase At2g38180-like  (XP_016505556.1)

https://raku.org/
