sed operation not working, or am I doing something wrong?


Input text:

chrX_143483005-chr6_103649292,chrX_143483110-chr6_103649131 chrX_143483110-chr_6103649147   chrX_143483004-chr6_103649293,chrX_143483110-chr6_103649291,chrX_143483110-chr6_103649053
chrX_143483110-chr_6103649147   chrX_143483005-chr6_103649292,chrX_143483110-chr6_103649131 0
0   chrX_143483005-chr6_103649292,chrX_143483110-chr6_103649131 chrX_143482988-chr6_103649147,chrX_143483004-chr6_103649293,chrX_143483110-chr6_103649291,chrX_143483110-chr6_103649053
chrX_143483005-chr6_103649292,chrX_143483110-chr6_103649131 0   chrX_143483110-chr_6103649147
0   chrX_143483005-chr6_103649292,chrX_143483110-chr6_103649131 chrX_143482988-chr6_103649147,chrX_143483004-chr6_103649293,chrX_143483110-chr6_103649291,chrX_143483110-chr6_103649053

Expected output:

chrX_143483005-chr6_103649292   chrX_143483110-chr_6103649147   chrX_143483004-chr6_103649293
chrX_143483110-chr_6103649147   chrX_143483005-chr6_103649292   0
0   chrX_143483005-chr6_103649292   chrX_143482988-chr6_103649147
chrX_143483005-chr6_103649292   0   chrX_143483110-chr_6103649147
0 chrX_143483005-chr6_103649292 chrX_143482988-chr6_103649147

What I tried:

## No. of Columns in each line.
awk '{print NF}' tt.txt
3
3
3
3
3
## operation to delete the co-ordinates affiliated with comma.
sed -e 's/\,chr[A-Z0-9]\_[0-9]-chr[A-Z0-9]\_[0-9]*.//g' tt.txt

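Basically, I want to delete the coordinates that follow a "," and keep only the left-hand (first) coordinate in each column.

Notes: 1. The number of columns must stay the same as in the input. 2. The comma-separated coordinates are not tied to a fixed column; they can appear in any column. 3. The chromosome can be any of 1-19, X, and Y.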

Answer 1

Simple enough:

$ sed -E 's/,[^ ]+//g' in
chrX_143483005-chr6_103649292 chrX_143483110-chr_6103649147   chrX_143483004-chr6_103649293
chrX_143483110-chr_6103649147   chrX_143483005-chr6_103649292 0
0   chrX_143483005-chr6_103649292 chrX_143482988-chr6_103649147
chrX_143483005-chr6_103649292 0   chrX_143483110-chr_6103649147
0   chrX_143483005-chr6_103649292 chrX_143482988-chr6_103649147

The (extended) regular expression /,[^ ]+/ matches a comma followed by any run of one or more non-space characters.
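As a rough check of what the pattern matches, the sketch below applies it to one line modelled on the first input row. It also tries a stricter pattern in the spirit of the question's attempt. That stricter pattern is an assumption about what the attempt intended, not something from the original post. In the attempt, [0-9] has no quantifier, so it matches exactly one digit, while the positions are nine digits long, which is why nothing was replaced. The variable names line, generic, and strict are illustrative only.

```shell
# Sample line modelled on the first input row (space-separated columns).
line='chrX_143483005-chr6_103649292,chrX_143483110-chr6_103649131 chrX_143483110-chr_6103649147'

# The answer's pattern: a comma followed by one or more non-space characters.
generic=$(printf '%s\n' "$line" | sed -E 's/,[^ ]+//g')

# Hypothetical stricter pattern (an assumption, not from the original post):
# the + quantifiers let each chromosome name and position span several characters.
strict=$(printf '%s\n' "$line" | sed -E 's/,chr[0-9XY]+_[0-9]+-chr[0-9XY]+_[0-9]+//g')

printf '%s\n' "$generic"
printf '%s\n' "$strict"
```

Both commands should leave only the first coordinate of the comma-separated column. The generic pattern is shorter and does not depend on the exact coordinate format, which is presumably why the answer prefers it.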

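The sed s command replaces every match of its first argument (here, the expression above) with its second argument (here, nothing). The g flag tells s to replace all matches found on a line, not just the first one.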
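To illustrate the effect of the g flag, here is a minimal sketch using a made-up line (sample is a hypothetical stand-in for a row with comma lists in two columns). Without g, only the first comma list on the line is removed.

```shell
# Hypothetical short line: comma lists in the first and third columns.
sample='a,b c d,e,f'

# Without g: only the first match on the line (",b") is removed.
first_only=$(printf '%s\n' "$sample" | sed -E 's/,[^ ]+//')

# With g: every match on the line is removed (",b" and ",e,f").
all_matches=$(printf '%s\n' "$sample" | sed -E 's/,[^ ]+//g')

printf '%s\n' "$first_only"   # a c d,e,f
printf '%s\n' "$all_matches"  # a c d
```

Note that ",e,f" is removed as a single match, because the later comma is itself a non-space character and is consumed by [^ ]+.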
