Input file
jayesh 30,20,50,60 30:20:40,60:55 A AB,KL,CD SM1,SM2
rahul 10,80,50,90 25:55:60,25 SGF AAAA,BCD,RTY SM3,SM4,SM4
pravin 89,78,40,20 25:30:55,96:25 M J SD10,SD12
sarika 10,20,48 29:50:30,25 T K,L SD20,SD39
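I want to remove the commas from the 5th column and print each comma-separated word on its own line, repeating the rest of the row. (Note: each cell of the fifth column contains many commas; I am only showing a few.)
Expected output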
jayesh 30,20,50,60 30:20:40,60:55 A AB SM1,SM2
jayesh 30,20,50,60 30:20:40,60:55 A KL SM1,SM2
jayesh 30,20,50,60 30:20:40,60:55 A CD SM1,SM2
rahul 10,80,50,90 25:55:60,25 SGF AAAA SM3,SM4,SM4
rahul 10,80,50,90 25:55:60,25 SGF BCD SM3,SM4,SM4
rahul 10,80,50,90 25:55:60,25 SGF RTY SM3,SM4,SM4
pravin 89,78,40,20 25:30:55,96:25 M J SD10,SD12
sarika 10,20,48 29:50:30,25 T K SD20,SD39
sarika 10,20,48 29:50:30,25 T L SD20,SD39
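I tried the following with awk, but it did not give the expected result. (To write the code, I took help from this site's post "How to remove comma and print whole line again the word after comma".)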
awk '{
split ($5,w5,",");
for (i in w5)
{ print $1"\t"$2"\t"$3"\t"$4"\t"w5[i]"\t"$6";}}'
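@sundeep, when I run the following command on my input file, the 5th and 6th columns get mixed together. (I am only showing 6 columns here, but my file has more than 6 columns.)
This is the output I get when I open the output file in Excel:
Output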
$ awk '{ split ($5,w5,","); for (i in w5) { print $1"\t"$2"\t"$3"\t"$4"\t"w5[i]"\t"$6 } }' ip.txt
jayesh 30,20,50,60 30:20:40,60:55 A "ABSM1,SM2"
jayesh 30,20,50,60 30:20:40,60:55 A KL SM1,SM2
jayesh 30,20,50,60 30:20:40,60:55 A CD" SM1,SM2
rahul 10,80,50,90 25:55:60,25 SGF AAAASM3,SM4,SM4"
rahul 10,80,50,90 25:55:60,25 SGF BCD SM3,SM4,SM4
rahul 10,80,50,90 25:55:60,25 SGF RTY" SM3,SM4,SM4
pravin 89,78,40,20 25:30:55,96:25 M J SD10,SD12
sarika 10,20,48 29:50:30,25 T KSD20,SD39"
sarika 10,20,48 29:50:30,25 T L" SD20,SD39
Answer 1
awk
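The command used by the OP only has a syntax problem: the stray "; at the end of the print statement. With that removed, it works: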
$ awk '{ split ($5,w5,","); for (i in w5) { print $1"\t"$2"\t"$3"\t"$4"\t"w5[i]"\t"$6 } }' ip.txt
jayesh 30,20,50,60 30:20:40,60:55 A AB SM1,SM2
jayesh 30,20,50,60 30:20:40,60:55 A KL SM1,SM2
jayesh 30,20,50,60 30:20:40,60:55 A CD SM1,SM2
rahul 10,80,50,90 25:55:60,25 SGF AAAA SM3,SM4,SM4
rahul 10,80,50,90 25:55:60,25 SGF BCD SM3,SM4,SM4
rahul 10,80,50,90 25:55:60,25 SGF RTY SM3,SM4,SM4
pravin 89,78,40,20 25:30:55,96:25 M J SD10,SD12
sarika 10,20,48 29:50:30,25 T K SD20,SD39
sarika 10,20,48 29:50:30,25 T L SD20,SD39
Alternatively, the output field separator (OFS) can be set to get a cleaner syntax, thanks to @fedorqui for the suggestion
awk -v OFS='\t' '{ split ($5,w5,","); for (i in w5) { print $1,$2,$3,$4,w5[i],$6 } }' ip.txt
or, assigning to $5 and printing the whole record, which also keeps any columns after the 6th
awk -v OFS='\t' '{ split ($5,w5,","); for (i in w5) { $5 = w5[i]; print } }' ip.txt
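One caveat with the commands above: awk does not guarantee the order in which for (i in w5) visits the array, so on some implementations the pieces of column 5 could come out in a different order than they appear in the input. A minimal sketch that sidesteps this, using the count returned by split() and a numeric loop, could look like the following. The function name split_col5 is only a hypothetical wrapper for illustration; it reads from standard input.

```shell
# split_col5 is a hypothetical helper name, not part of the original answer.
# It splits column 5 on commas and prints one row per item, in input order.
split_col5() {
  awk -v OFS='\t' '{
    n = split($5, w5, ",")        # split returns the number of pieces
    for (i = 1; i <= n; i++) {    # numeric loop visits pieces in order
      $5 = w5[i]                  # assigning a field rebuilds the record with OFS
      print
    }
  }'
}

# Example: the first row of the sample input
printf 'jayesh 30,20,50,60 30:20:40,60:55 A AB,KL,CD SM1,SM2\n' | split_col5
```

For the whole file it would be called as split_col5 < ip.txt. Note that a row whose 5th column is empty produces no output with this sketch, because split() then returns 0.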
A similar solution with perl
$ perl -lane 'print join "\t", @F[0..3],$_,@F[5..$#F] foreach split /,/,$F[4]' ip.txt
jayesh 30,20,50,60 30:20:40,60:55 A AB SM1,SM2
jayesh 30,20,50,60 30:20:40,60:55 A KL SM1,SM2
jayesh 30,20,50,60 30:20:40,60:55 A CD SM1,SM2
rahul 10,80,50,90 25:55:60,25 SGF AAAA SM3,SM4,SM4
rahul 10,80,50,90 25:55:60,25 SGF BCD SM3,SM4,SM4
rahul 10,80,50,90 25:55:60,25 SGF RTY SM3,SM4,SM4
pravin 89,78,40,20 25:30:55,96:25 M J SD10,SD12
sarika 10,20,48 29:50:30,25 T K SD20,SD39
sarika 10,20,48 29:50:30,25 T L SD20,SD39
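Coming back to awk, the same idea can be written so that the column number is a parameter and the input is split strictly on tabs. This is only a sketch under two assumptions that the question does not confirm: that the real file is tab-separated (the mention of Excel suggests it might be), and that rows with an empty target column should be passed through unchanged. The names split_col, col and parts are hypothetical and chosen just for this example.

```shell
# split_col is a hypothetical helper: split_col COLUMN_NUMBER < tab-separated-input
# Assumes tab-separated input; -F with a tab keeps empty fields in place.
split_col() {
  awk -F'\t' -v OFS='\t' -v col="$1" '{
    n = split($col, parts, ",")
    if (n == 0) { print; next }   # empty target column: print the row as-is
    for (i = 1; i <= n; i++) {
      $col = parts[i]             # replace only the chosen column
      print                       # all other columns, however many, are kept
    }
  }'
}

# Example: a 7-column row, splitting column 5
printf 'sarika\t10,20,48\t29:50:30,25\tT\tK,L\tSD20,SD39\textra\n' | split_col 5
```

Splitting on an explicit tab rather than on runs of whitespace means an empty cell stays an empty cell instead of shifting the later columns to the left.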