Truncate trailing commas on each line based on the first field of a delimited file

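I need to remove trailing commas based on record_type (the first field). The input file has 50 delimiters, and I need to reduce them depending on the record type: if the first field is 400, remove the last 10 delimiters; if it is 300, remove 5 delimiters; if it is 210, remove 2 commas. The 400, 300, 210 pattern repeats, and the order must be preserved.

For example: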

400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,,,,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,,,

I need the output to be:

400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,

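I tried awk and sed, but they truncate the entire file.

Answer 1

sed can do what you want. Each expression matches the required start of the line and then removes the required number of commas from the end of that line.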

sed -e '/^400/ s/,\{10\}$//' -e '/^300/ s/,\{5\}$//' -e '/^210/ s/,\{2\}$//' input.txt
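One caveat with the addresses above is that /^400/ also matches a first field such as 4001. A minimal sketch of a slightly stricter variant follows; the function name strip_trailing_commas and the sample lines are made up for illustration, and the only change to the technique is anchoring each address on the comma that ends the first field.

```shell
#!/bin/sh
# Hypothetical helper: remove a fixed number of trailing commas per record type.
# Each address includes the comma after the record type, so 4001 is not treated as 400.
strip_trailing_commas() {
    sed -e '/^400,/ s/,\{10\}$//' \
        -e '/^300,/ s/,\{5\}$//' \
        -e '/^210,/ s/,\{2\}$//'
}

# Made-up sample: build a run of exactly 12 commas and append it to each line.
commas=$(printf '%12s' '' | tr ' ' ',')
printf '%s\n' "400,\"a\"$commas" "300,\"b\"$commas" "210,\"c\"$commas" |
    strip_trailing_commas
```

Since every sample line ends in 12 commas, the three lines should come out with 2, 7, and 10 trailing commas respectively. Lines with any other record type pass through unchanged.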

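Answer 2

An AWK approach. We define a trunk function that uses substr to print the whole line with its last n characters removed. The rest is simple pattern matching on the first field, calling trunk with the appropriate number of characters to remove.

As a one-liner: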

$ awk -F ',' 'function trunk(n){print substr($0,1,length($0)-n)}; $1==400{trunk(10)};$1==300{trunk(5)};$1==210{trunk(2)}' input.txt

As a script, it would look like this:

#!/usr/bin/awk -f

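BEGIN { FS="," }

# Print the current line without its last n characters.
# awk strings are 1-indexed, so the substring starts at position 1.
function trunk(n) {
    print substr($0, 1, length($0) - n)
}

# Lines whose first field is not 400, 300, or 210 are not printed.
$1 == 400 { trunk(10) }
$1 == 300 { trunk(5) }
$1 == 210 { trunk(2) }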

Here it is in action:

$ ./trunk_lines.awk input.txt
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
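Because trunk removes characters by position, it would also cut into real data if a line had fewer trailing commas than expected. The sketch below is one possible guard, not part of the original answer: the wrapper name trunk_checked is hypothetical, trunk is changed to return the string instead of printing it, and passing other record types through unchanged is an assumption.

```shell
#!/bin/sh
# Hypothetical wrapper: like the trunk approach, but only cuts when the
# last n characters of the line are all commas.
trunk_checked() {
    awk -F ',' '
        function trunk(n,    tail) {
            if (length($0) < n)
                return $0                          # line too short: leave it alone
            tail = substr($0, length($0) - n + 1)  # the last n characters
            if (tail ~ /^,+$/)
                return substr($0, 1, length($0) - n)
            return $0                              # tail holds data: leave it alone
        }
        $1 == 400 { print trunk(10); next }
        $1 == 300 { print trunk(5);  next }
        $1 == 210 { print trunk(2);  next }
        { print }   # assumption: other record types pass through unchanged
    ' "$@"
}

# Made-up sample: the second line ends in data, so it is left untouched.
printf '%s\n' '210,"x",,' '210,"x","y"' | trunk_checked
```

The function reads from standard input when called without arguments, or from the files given as arguments, for example trunk_checked input.txt.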

Answer 3

Assuming the trailing fields are empty (or that you want to drop them even if they are not):

awk -F, -vOFS=, '$1=="400"{NF-=10} $1=="300"{NF-=5} $1=="210"{NF-=2} 1' file 

Or, if you want to be clever (which can be both a good and a bad thing):

awk -F, -vOFS=, 'BEGIN{x[400]=10;x[300]=5;x[210]=2} {NF-=x[$1]} 1' file
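The lookup-table form subtracts from NF on every line, so a line with fewer fields than the amount to drop could push NF below zero. The following sketch adds a guard for that case. It assumes an awk that rebuilds $0 when NF is decreased (gawk does); the names drop_trailing_fields and drop are hypothetical and not from the original answer.

```shell
#!/bin/sh
# Hypothetical wrapper around the NF-based approach with two guards:
# only known record types are touched, and NF never drops below 1.
drop_trailing_fields() {
    awk -F, -v OFS=, '
        BEGIN { drop[400] = 10; drop[300] = 5; drop[210] = 2 }
        # Shrink NF only when the record type is known and enough fields exist.
        ($1 in drop) && NF > drop[$1] { NF -= drop[$1] }
        { print }
    ' "$@"
}

# Made-up sample: five fields, the last two are dropped.
printf '%s\n' '210,a,,,' | drop_trailing_fields
```

Using `$1 in drop` rather than reading `drop[$1]` directly also avoids creating an empty array entry for every unknown record type.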
