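I have been trying to set up a script, run from Git Bash, that processes a CSV file so that every record with a specific value has its last field changed from empty ("") to a value that iterates from 1 to 16. The new value also has some text in front of it. For each matching record in the CSV file, the field should look like REP0001, eventually reach REP0100, and then start over at REP0001.
Here is a sample of the input text: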
"00:30:00","01:00:00","10/14/2014","RETURN","PASADENA","TX","12:30:00","sedan","","","corporate","CO01353"
"01:00:00","01:30:00","10/14/2014","RENT OUT","HOUSTON","TX","00:30:00","sedan","","","personal",""
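The first sample line I do not want to change, but it should still be included in the output. For the second sample line, I want to change the last field from "" to a value that starts at REP0001, iterates up to REP0100, and then starts over at REP0001.
Here is a sample of the desired text: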
"01:00:00","01:30:00","10/14/2014","RENT OUT","HOUSTON","TX","00:30:00","sedan","","","personal","REP0001"
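I did try sed and awk, but I am not a scripting expert. I was only able to put together the part that finds the records with the value I want and inserts the value I want. I do not know how to do the iteration magic: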
awk 'BEGIN{FS=",";OFS=","} $4 ~ /"RENT OUT"/ {$12="\042""REP0001""\042"}1' Rentals.csv > output
Can anyone point me in the right direction? The file itself has about 2000 lines.
Answer 1
I believe this does what you want:
$ awk 'BEGIN{FS=",";OFS=","} $4 ~ /"RENT OUT"/ {NF--;printf $0; x=x%100;x++; printf ",\"REP%04i\"\n",x;next} 1' rentals.csv
"00:30:00","01:00:00","10/14/2014","RETURN","PASADENA","TX","12:30:00","sedan","","","corporate","CO01353"
"01:00:00","01:30:00","10/14/2014","RENT OUT","HOUSTON","TX","00:30:00","sedan","","","personal","REP0001"
The only part that has changed is this command:
$4 ~ /"RENT OUT"/ {NF--;printf $0; x=x%100;x++; printf ",\"REP%04i\"\n",x;next}
Taking the new pieces one at a time:
NF--
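This removes the last (empty) field from the line.
printf $0
This prints the line (now without its last field). Because $0 is used as the format string here, a literal % in the data would be misinterpreted; printf "%s", $0 avoids that.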
x=x%100;x++
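The counter x is taken modulo 100 and then incremented by 1. This way, the counter cycles from 1 to 100 and then back to 1.
printf ",\"REP%04i\"\n",x
This prints our new last field, which contains the counter.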
next
Since we have already printed this line, we tell awk to skip the remaining commands and start over with the next line.
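To see the wrap-around behaviour of the counter on its own, here is a minimal sketch. The function name wrap_demo and the limit variable are hypothetical, introduced only for illustration, and the limit is set to 3 instead of 100 so the cycle is visible after a few lines.

```shell
# Hypothetical helper: shows how "x = x % limit; x++" cycles 1..limit.
# The limit is 3 here (not 100) purely to keep the output short.
wrap_demo() {
  awk -v limit=3 'BEGIN {
    for (n = 1; n <= 5; n++) {
      x = x % limit    # resets x to 0 once it has reached the limit
      x++              # so the next value is always in 1..limit
      printf "REP%04i\n", x
    }
  }'
}

wrap_demo
```

With limit=3 this prints REP0001, REP0002, REP0003, REP0001, REP0002, one per line. An unset awk variable counts as 0 in arithmetic, which is why x needs no explicit initialisation.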
Answer 2
Another (more compact) version, using sprintf:
awk 'BEGIN{FS=OFS=","} $4 ~ /"RENT OUT"/ {$12=sprintf("\"REP%04i\"",++i);i=i%100}1'
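As a rough usage sketch, the command above can be wrapped in a function and run against a small file. The function name number_rentals and the third input row are assumptions made for this example; the first two rows are the samples from the question.

```shell
# Hypothetical wrapper around the compact command; takes the CSV path as $1.
number_rentals() {
  awk 'BEGIN{FS=OFS=","} $4 ~ /"RENT OUT"/ {$12=sprintf("\"REP%04i\"",++i);i=i%100}1' "$1"
}

# Two sample rows from the question plus one made-up extra "RENT OUT" row.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
"00:30:00","01:00:00","10/14/2014","RETURN","PASADENA","TX","12:30:00","sedan","","","corporate","CO01353"
"01:00:00","01:30:00","10/14/2014","RENT OUT","HOUSTON","TX","00:30:00","sedan","","","personal",""
"02:00:00","02:30:00","10/14/2014","RENT OUT","HOUSTON","TX","00:30:00","sedan","","","personal",""
EOF
number_rentals "$tmp"
rm -f "$tmp"
```

The RETURN row passes through unchanged, and the two RENT OUT rows end in "REP0001" and "REP0002". Assigning to $12 makes awk rebuild the line with OFS, so no explicit printf is needed. Note that splitting on a bare comma assumes no quoted field contains a comma itself, which holds for the sample data shown here.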