I have a file like this:
>gene1*ENSG24
CTTGGGGGGCTGGGGGCCAGGTGAAAGGGAAATGGAGGGCAGCACCCGCG
AGCCCTCATTGCCTATAGTGGTTTCCATGGCGATCATGTAAGAGTCAATG
TCGTCATTGGCAAAGTCGTCCGGGTGGGGTGTGCTGTAGGCAGAATCGGA
GTATCAGGGAGGGGACTGGGGGAGCAGAGGCAGGGCCCCACCTTGGAGGG
CTCGAAGGGAGCTCTGGGGCCCCCGACCACTGGAGA
>gene2*ENSG87
CCATTTTGAAACCCTTAATAAAAACTTGCTGGTCTGAGACTCAGCAGGCA
GCACAGACTTACTGATATGTACTGTCACCTCCAGCGGCCCAGCTGTAAAA
TTCCTCTCTTTGTAGTGTCTCTCTTTATTTCTCAGCTGGCTGACACTTAT
GGAAAATGGAAAGAACCTATGTTGAAATATTGGGGGCAGGTTCCATCAAT
AGTTCTTACATGG
I want output in the following format:
>gene1
CTTGGGGGGCTGGGGGCCAGGTGAAAGGGAAATGGAGGGCAGCACCCGCG
AGCCCTCATTGCCTATAGTGGTTTCCATGGCGATCATGTAAGAGTCAATG
TCGTCATTGGCAAAGTCGTCCGGGTGGGGTGTGCTGTAGGCAGAATCGGA
GTATCAGGGAGGGGACTGGGGGAGCAGAGGCAGGGCCCCACCTTGGAGGG
CTCGAAGGGAGCTCTGGGGCCCCCGACCACTGGAGA
>gene2
CCATTTTGAAACCCTTAATAAAAACTTGCTGGTCTGAGACTCAGCAGGCA
GCACAGACTTACTGATATGTACTGTCACCTCCAGCGGCCCAGCTGTAAAA
TTCCTCTCTTTGTAGTGTCTCTCTTTATTTCTCAGCTGGCTGACACTTAT
GGAAAATGGAAAGAACCTATGTTGAAATATTGGGGGCAGGTTCCATCAAT
AGTTCTTACATGG
I want to remove the *ENSG part (including the number after it). How can I do this?
Answer 1
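This should be simple enough with sed:
sed 's/.ENSG[0-9]*$//'
Here the . matches the single character before ENSG (the *), and [0-9]*$ matches the digits that follow, up to the end of the line.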
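A slightly stricter variant, offered as a sketch rather than the answer's own method: it escapes the asterisk so only a literal * is matched, and it restricts the substitution to header lines starting with >. The function name strip_ensg is hypothetical, and the input is assumed to use Unix line endings.

```shell
# strip_ensg (hypothetical helper name) reads FASTA on stdin and writes to stdout.
strip_ensg() {
    # /^>/ limits the substitution to header lines;
    # \* is a literal asterisk, [0-9]*$ the trailing digits up to end of line.
    sed '/^>/ s/\*ENSG[0-9]*$//'
}

# Small demonstration on two shortened records.
printf '>gene1*ENSG24\nCTTGG\n>gene2*ENSG87\nCCATT\n' | strip_ensg
```

To process a real file, redirect it through the function, for example strip_ensg < input.fa > output.fa, where both file names are placeholders.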