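I need to remove trailing commas based on the record_type (the first field). The input file has 50 delimiters per line, and I need to reduce them according to the record type. If the first field is 400, remove the last 10 delimiters; if it is 300, remove 5; if it is 210, remove 2 commas. The 400, 300, 210 pattern repeats, and the order of the lines must be preserved.
For example: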
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,,,,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,,,
I need the output to be:
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
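I tried awk and sed, but they end up truncating the whole file.
Answer 1
sed can do what you want. Each expression below matches the required start of the line and then removes the required number of commas from the end.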
sed -e '/^400/ s/,\{10\}$//' -e '/^300/ s/,\{5\}$//' -e '/^210/ s/,\{2\}$//'
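A slightly stricter variant, offered as a sketch only: the addresses above also match any line that merely begins with those digits (4001, for instance), so anchoring on the comma that follows the record type avoids that. The function name strip_trailing is hypothetical and is not part of the original answer.

```shell
# strip_trailing: hypothetical helper; reads records from stdin or from file arguments.
strip_trailing() {
  # Each address requires the record type to be followed by a comma,
  # then the substitution removes a fixed run of commas at end of line.
  sed -e '/^400,/ s/,\{10\}$//' \
      -e '/^300,/ s/,\{5\}$//' \
      -e '/^210,/ s/,\{2\}$//' "$@"
}

# Usage (write to a different file, not back onto the input):
#   strip_trailing input.txt > output.txt
```

Redirecting the output onto the input file itself (`> input.txt`) empties the file before sed reads it, which is one common way to end up with a truncated file.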
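Answer 2
An AWK approach. We define a trunk function that prints a substring of the whole line, from the first character up to length - n characters. The rest is simple pattern matching: check the first field and call trunk to drop the appropriate number of characters. awk strings are indexed from 1, and a start index of 0 is handled differently across implementations (gawk, for example, would drop one extra character), so the start index used here is 1.
As a one-liner:
$ awk -F ',' 'function trunk(n){print substr($0,1,length($0)-n)}; $1==400{trunk(10)};$1==300{trunk(5)};$1==210{trunk(2)} ' input.txt
As a script, it would look like this:
#!/usr/bin/awk -f
BEGIN { FS="," }
# Print the current line without its last n characters.
function trunk(n){
    print substr($0,1,length($0)-n)
}
# Choose how many trailing characters to drop based on the record type.
$1==400{trunk(10)}
$1==300{trunk(5)}
$1==210{trunk(2)}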
And here it is in action:
$ ./trunk_lines.awk input.txt
400,"100.00",,,,"31",,,,"510","410","0102","023",,,,,,,,,,,,,,,,,,,,
300,"110","1",,"2016-04-15",,,"52706","TESTFR1","100.00","1.00",,,"N",,,,
210,"6876262",,"23 Rue du Roule",,,"PARIS","DF","75001","FR",,,,,,,,,,,,,,,,
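Two things to keep in mind: the script prints only lines whose first field is 400, 300, or 210, and it removes characters without checking that they are commas. If that is a concern, a more defensive sketch could look like the following; the wrapper name trunk_lines and the check on the tail are assumptions added here, not part of the original answer.

```shell
# trunk_lines: hypothetical wrapper; reads from stdin or from file arguments.
trunk_lines() {
  awk -F ',' '
    # Return the line without its last n characters, but only when
    # those n characters are all commas; otherwise return it unchanged.
    function trunk(n,    tail) {
      tail = substr($0, length($0) - n + 1)
      if (length(tail) == n && tail ~ /^,+$/)
        return substr($0, 1, length($0) - n)
      return $0
    }
    $1 == 400 { print trunk(10); next }
    $1 == 300 { print trunk(5);  next }
    $1 == 210 { print trunk(2);  next }
    # Any other record type passes through untouched.
    { print }
  ' "$@"
}
```

Because every input line is printed exactly once, the line order of the file is preserved.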
Answer 3
Given that the trailing fields are empty (or if you want to drop them anyway):
awk -F, -vOFS=, '$1=="400"{NF-=10} $1=="300"{NF-=5} $1=="210"{NF-=2} 1' file
Or, if you want to be clever (which can be a good thing or a bad thing):
awk -F, -vOFS=, 'BEGIN{x[400]=10;x[300]=5;x[210]=2} {NF-=x[$1]} 1' file
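Both commands rely on awk rebuilding the record when NF is assigned, which gawk and mawk do; some older awk implementations may not. As a sketch, the lookup-table version can be guarded so that NF is only changed for known record types that have enough fields. The function name drop_fields is hypothetical.

```shell
# drop_fields: hypothetical wrapper; reads from stdin or from file arguments.
drop_fields() {
  awk -F, -v OFS=, '
    BEGIN { x[400] = 10; x[300] = 5; x[210] = 2 }
    # Shrink NF only for known record types with more than x[$1] fields.
    ($1 in x) && NF > x[$1] { NF -= x[$1] }
    # Print every line, modified or not.
    { print }
  ' "$@"
}

# Usage:
#   drop_fields file > file.new
```

The `$1 in x` test avoids creating empty array entries for unknown record types, which a plain `x[$1]` lookup would do.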