Replace column values in a CSV file with values from another file


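I have a CSV file with 85 fields. I want to replace the values in column 52 with data from another file. The second file contains only one column and has the same number of records as the first file.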

For example, data.CSV (the first CSV file):

0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111937,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**07822000656**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111938,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**07822000656**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111939,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**07822000656**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111939,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**07822000656**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111939,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**07822000656**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111939,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**07822000656**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,

The second file (containing only one column):

6228205
6225214
6225211
6225206
5206
87777

I want to replace:

  • the column 52 value (07822000656) in the first row of data.csv with 6228205
  • the column 52 value (07822000656) in the second row of data.csv with 6225214
  • the column 52 value (07822000656) in the third row with 6225211

...and so on...

So the output should be:

0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111937,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**6228205**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111938,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**6225214**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111939,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**6225211**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111939,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**6225206**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111939,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**5206**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,
0,126,,2,0,904CEE,0,0,1,0,0,,7638.raw,0,0,20210515,111939,10,0,540,540,0,,,,,,,,,,,0,,,,,,,,,,,,,0,,,07822000655,,,**87777**,0,,,,0B020D,358605075357339 ,234307822000655,11,,01,00,0,,,0,2,1,0,1101,,1,0,23430,,,11,5,,0A03,,,0,

I managed to do it as follows:

awk -F , '{$1, $2, $3, $4...$51}' data.csv >temp1.csv
awk -F , '{$53, $54, $55....$85}' data.csv >temp2.csv
paste -d "," temp1.csv 2nd_file temp2.csv
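Written out without the elided field lists, the same split-and-paste idea could be expressed with cut. This is only a sketch on small made-up files (all the demo_* names are hypothetical), splitting around column 2 of a 3-column file rather than column 52 of 85:

```shell
# Made-up 3-column stand-ins for data.csv and the one-column file
printf '%s\n' 'a,OLD,c' 'd,OLD,f' > demo_data.csv
printf '%s\n' 'NEW1' 'NEW2' > demo_col.csv

# Everything to the left and to the right of the column being replaced
cut -d, -f1  demo_data.csv > demo_left.csv
cut -d, -f3- demo_data.csv > demo_right.csv

# Glue the three pieces back together, line by line
paste -d, demo_left.csv demo_col.csv demo_right.csv > demo_joined.csv
cat demo_joined.csv
```

For the real 85-field file the ranges would presumably be -f1-51 and -f53-.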

However, I am looking for a better way to handle this.

Answer 1

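You can use awk to build a map of the entries in the second file, keyed by line number, and then use it to replace the values in the first file: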

awk -v FS=, -v OFS=, 'FNR==NR{hash[FNR]=$0; next}{$52 = hash[FNR]}1' file2 file1
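To see the mechanics on something smaller, here is the same idea on two throwaway files, replacing column 2 of a 3-column file instead of column 52. The small*.csv names and the repl array name are made up for this sketch:

```shell
# Throwaway sample files (hypothetical names), 3 columns instead of 85
printf '%s\n' 'a,OLD,c' 'd,OLD,f' > small1.csv
printf '%s\n' 'NEW1' 'NEW2' > small2.csv

# While reading the first file named on the command line (FNR == NR),
# remember each line under its line number. While reading the second,
# overwrite field 2 with the remembered value; the bare 1 prints the record.
awk -v FS=, -v OFS=, '
    FNR == NR { repl[FNR] = $0; next }
    { $2 = repl[FNR] }
    1
' small2.csv small1.csv > small_out.csv
cat small_out.csv
```

Note the argument order: the replacement file comes first so that the map is complete before the data file is read.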

Answer 2

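You tagged this question awk, but tools like awk break as soon as the CSV file contains a field such as "embed , in a string", so it is better to use a tool designed for CSV, which even makes the job very simple: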

csvtool pastecol 52 1 data.CSV value.CSV

This replaces column 52 of data.CSV with column 1 of value.CSV.
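The quoted-field problem is easy to reproduce. A small sketch with a made-up one-record file (quoted.csv is a hypothetical name):

```shell
# One record with three CSV fields; the second contains a quoted comma
printf '%s\n' '1,"embed , in a string",3' > quoted.csv

# awk splits on every comma, quoted or not, so it counts 4 fields, not 3
awk -F, '{ print NF }' quoted.csv
```

With a field like that anywhere before column 52, a plain comma split would land on the wrong column.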

Answer 3

Here is how to do it with GoCSV, a tool designed for working with CSV.

# Break up starting-file about column 52
gocsv select --columns 1-51 start.csv > left.csv
gocsv select --columns 53-  start.csv > right.csv

# Combine both sides with replacement column/file in the "middle"
gocsv zip left.csv replacement.csv right.csv > my_final.csv

# Prove it worked
cmp my_final.csv op_final.csv 

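I did have to tidy up op_final.csv before running the comparison, for anyone who wants to try this:

  • remove the ** that the OP put around the target values
  • add a newline at the end, because GoCSV adds a trailing newline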
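Those two clean-up steps might look something like this. It is a sketch that assumes the question's expected output was pasted into a file called op_raw.csv (a made-up name); the one-line content below is only a stand-in for it:

```shell
# Stand-in for the pasted expected output; note there is no trailing newline
printf '%s' 'x,**6228205**,y' > op_raw.csv

# Strip the ** emphasis markers
sed 's/\*\*//g' op_raw.csv > op_final.csv

# Append a final newline only if the last byte is not already one
if [ -n "$(tail -c 1 op_final.csv)" ]; then
    echo >> op_final.csv
fi
```

The check before appending is there because some sed implementations add the missing newline themselves and others do not.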
