I want to remove only the digits and the "_" that follow the ">" symbol, for example:
>1_CR-B_CR56_t
MTKIIKFVYFMTIFISPNHHCPVYNCTHPKQPWCKLVRLQLLFHGSLIGLCDCI
>2_R-B_R46_t
MVEVTKLVNVMLIFLTLSPLVYDCQAYECELPFKPDCLMVEYSPQFVALRCGCV
>3000_N-N274_M
MVEVTKLVNVMLIFLTLFVYTDSDCQAYACELPFKPDCLMVEYAPQFFRLACGCV
Expected output:
>CR-B_CR56_t
MTKIIKFVYFMTIFISPNHHCPVYNCTHPKQPWCKLVRLQLLFHGSLIGLCDCI
>R-B_R46_t
MVEVTKLVNVMLIFLTLSPLVYDCQAYECELPFKPDCLMVEYSPQFVALRCGCV
>N-N274_M
MVEVTKLVNVMLIFLTLFVYTDSDCQAYACELPFKPDCLMVEYAPQFFRLACGCV
I tried sed "s/>[0-9][_]//g", but it removes the ">" as well.
Answer 1
With sed, your command needs only a small change:
sed 's/^>[0-9]\+[_]/>/g'
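Here s is sed's substitute command: it searches for the pattern on the left and replaces each match with the string on the right. Instead of replacing the match with nothing, replace it with >, the character you want to keep.
^ anchors the match to the start of the line.
\+ matches one or more digits, so multi-digit prefixes such as 3000 are handled too. Note that \+ is a GNU sed extension.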
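To make the difference concrete, here is a minimal sketch that runs the original attempt and a fixed version side by side on two of the headers from the question. The function names original_attempt and strip_prefix are hypothetical, and the fix is written as [0-9][0-9]* instead of [0-9]\+ on the assumption that a portable form is preferred over the GNU-only one.

```shell
# The question's attempt: ">" + exactly one digit + "_" is replaced by nothing.
original_attempt() {
    sed 's/>[0-9][_]//g' "$@"
}

# Fixed version (hypothetical helper name): one or more digits, and ">" is kept.
# [0-9][0-9]* is the portable spelling of the GNU-only [0-9]\+.
strip_prefix() {
    sed 's/^>[0-9][0-9]*_/>/' "$@"
}

printf '%s\n' '>1_CR-B_CR56_t' '>3000_N-N274_M' | original_attempt
# prints:
#   CR-B_CR56_t        (the ">" is lost)
#   >3000_N-N274_M     (four digits, so the one-digit pattern never matches)

printf '%s\n' '>1_CR-B_CR56_t' '>3000_N-N274_M' | strip_prefix
# prints:
#   >CR-B_CR56_t
#   >N-N274_M
```

Both functions pass their arguments through to sed, so strip_prefix file works on a file as well as on standard input.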
Answer 2
awk '{sub(/^>._|^>...._/,">")}1' file
>CR-B_CR56_t
MTKIIKFVYFMTIFISPNHHCPVYNCTHPKQPWCKLVRLQLLFHGSLIGLCDCI
>R-B_R46_t
MVEVTKLVNVMLIFLTLSPLVYDCQAYECELPFKPDCLMVEYSPQFVALRCGCV
>N-N274_M
MVEVTKLVNVMLIFLTLFVYTDSDCQAYACELPFKPDCLMVEYAPQFFRLACGCV
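The pattern in the awk command above only covers prefixes that are exactly one or exactly four characters wide, which happens to fit the sample data. A more general sketch, assuming the prefix is always one or more digits followed by "_", could look like this (strip_prefix_awk is a hypothetical helper name):

```shell
# Hypothetical helper: strip a leading run of digits plus "_" after ">".
# sub() rewrites the current line in place; the trailing 1 prints every line.
strip_prefix_awk() {
    awk '{ sub(/^>[0-9]+_/, ">") } 1' "$@"
}

printf '%s\n' '>1_CR-B_CR56_t' '>42_R-B_R46_t' '>3000_N-N274_M' | strip_prefix_awk
# prints:
#   >CR-B_CR56_t
#   >R-B_R46_t
#   >N-N274_M
```

Restricting the match to [0-9]+ also avoids accidentally removing non-numeric text that the "." wildcards would accept.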
Answer 3
Command: sed 's/^>[0-9]\{1,9\}_/>/g' filename
Output:
>CR-B_CR56_t
MTKIIKFVYFMTIFISPNHHCPVYNCTHPKQPWCKLVRLQLLFHGSLIGLCDCI
>R-B_R46_t
MVEVTKLVNVMLIFLTLSPLVYDCQAYECELPFKPDCLMVEYSPQFVALRCGCV
>N-N274_M
MVEVTKLVNVMLIFLTLFVYTDSDCQAYACELPFKPDCLMVEYAPQFFRLACGCV
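The interval \{1,9\} caps the prefix at nine digits. As a small sketch of what that means in practice, using a made-up header with a ten-digit prefix (not from the question's data), the bounded form leaves the line alone while the open-ended \{1,\} form strips it:

```shell
# Bounded interval: ten digits exceed the 1..9 limit, so nothing is replaced.
printf '>1234567890_X\n' | sed 's/^>[0-9]\{1,9\}_/>/'
# prints: >1234567890_X

# Open-ended interval: one or more digits, no upper limit.
printf '>1234567890_X\n' | sed 's/^>[0-9]\{1,\}_/>/'
# prints: >X
```

If the numeric prefixes can grow beyond nine digits, the open-ended form is probably the safer choice.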