How do I split a file by year? My file contains data for 2019 and 2020; a few lines of it are shown below.
hash,block_timestamp,addresses
1b8fb81b9c4db4cf3659d2553e7c1d5a4dac21400e331ea3deecdfa45e2eb7d7,2020-05-08 13:43:38 UTC,32UNEwo4UtXrD8xjAVDGapBcWQ9B7HBQNb
daeac50f989f0d31bcc412ca47e2e082f7d0599d8e577a9a310f7ab4e9d474d2,2020-05-08 13:21:33 UTC,3BMEXPLQqB9rkR5SMhJdA4Xm98ntT5xuw8
56777decb012d60f36f9cd4b9acfe13215f670bbe192f261db21e64f98e212be,2019-05-08 13:39:39 UTC,1AMtkH4riMpxSe7YMbs6h2aaDXVdxnmMFy
f5a1d52f013f1ee49a6cad971a5782c1c9905030d35ac28e23a2113fd1941421,2019-04-10 18:36:01 UTC,1LBBNap7kLswvgYbzmfLeskAfEMToiinkB
I tried:
awk -F',' '{print >((substr($2,1,4)<=2020)?"2019":"2020")}' combined-out.csv
The result is two empty files. How can I fix this?
Answer 1
It looks like you are using <= instead of <:
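BEGIN {
    FS = ","    # fields are comma-separated
}
{
    s1 = substr($2, 1, 4)    # year: first four characters of the timestamp
    if (s1 < 2020) {
        print > "2019"       # rows dated before 2020
    } else {
        print > "2020"       # everything else
    }
}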
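To see what the comparison operator changes, here is a small shell sketch that runs both versions against the four sample rows from the question. The scratch directory and the output names le_2019, le_2020, lt_2019 and lt_2020 are made up for this demonstration, and the `+ 0` that forces a numeric comparison is an addition of mine, not part of the answer above.

```shell
#!/bin/sh
# Scratch directory for the demo (hypothetical location).
demo="${TMPDIR:-/tmp}/split_by_year_demo"
rm -rf "$demo" && mkdir -p "$demo" && cd "$demo" || exit 1

# Sample rows copied from the question.
cat > combined-out.csv <<'EOF'
hash,block_timestamp,addresses
1b8fb81b9c4db4cf3659d2553e7c1d5a4dac21400e331ea3deecdfa45e2eb7d7,2020-05-08 13:43:38 UTC,32UNEwo4UtXrD8xjAVDGapBcWQ9B7HBQNb
daeac50f989f0d31bcc412ca47e2e082f7d0599d8e577a9a310f7ab4e9d474d2,2020-05-08 13:21:33 UTC,3BMEXPLQqB9rkR5SMhJdA4Xm98ntT5xuw8
56777decb012d60f36f9cd4b9acfe13215f670bbe192f261db21e64f98e212be,2019-05-08 13:39:39 UTC,1AMtkH4riMpxSe7YMbs6h2aaDXVdxnmMFy
f5a1d52f013f1ee49a6cad971a5782c1c9905030d35ac28e23a2113fd1941421,2019-04-10 18:36:01 UTC,1LBBNap7kLswvgYbzmfLeskAfEMToiinkB
EOF

# With <=, the year 2020 also passes the test, so all four data rows
# are written to le_2019 and le_2020 is never created.
awk -F, 'NR > 1 { print > ((substr($2, 1, 4) + 0 <= 2020) ? "le_2019" : "le_2020") }' combined-out.csv

# With <, only the 2019 rows pass; the 2020 rows go to the other file.
awk -F, 'NR > 1 { print > ((substr($2, 1, 4) + 0 < 2020) ? "lt_2019" : "lt_2020") }' combined-out.csv

# Line counts per output file.
wc -l le_2019 lt_2019 lt_2020
```

With the sample data, le_2019 should hold all four data rows, while lt_2019 and lt_2020 hold two rows each.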
Answer 2
To leave the header line out of both output files:
$ awk -F, 'NR>1{print > ($2+0)}' file
$ cat 2019
56777decb012d60f36f9cd4b9acfe13215f670bbe192f261db21e64f98e212be,2019-05-08 13:39:39 UTC,1AMtkH4riMpxSe7YMbs6h2aaDXVdxnmMFy
f5a1d52f013f1ee49a6cad971a5782c1c9905030d35ac28e23a2113fd1941421,2019-04-10 18:36:01 UTC,1LBBNap7kLswvgYbzmfLeskAfEMToiinkB
$ cat 2020
1b8fb81b9c4db4cf3659d2553e7c1d5a4dac21400e331ea3deecdfa45e2eb7d7,2020-05-08 13:43:38 UTC,32UNEwo4UtXrD8xjAVDGapBcWQ9B7HBQNb
daeac50f989f0d31bcc412ca47e2e082f7d0599d8e577a9a310f7ab4e9d474d2,2020-05-08 13:21:33 UTC,3BMEXPLQqB9rkR5SMhJdA4Xm98ntT5xuw8
Or, to print the header line in both output files:
$ awk -F, 'NR==1{hdr=$0; next} {out=($2+0); if (!seen[out]++) print hdr > out; print > out}' file
$ cat 2019
hash,block_timestamp,addresses
56777decb012d60f36f9cd4b9acfe13215f670bbe192f261db21e64f98e212be,2019-05-08 13:39:39 UTC,1AMtkH4riMpxSe7YMbs6h2aaDXVdxnmMFy
f5a1d52f013f1ee49a6cad971a5782c1c9905030d35ac28e23a2113fd1941421,2019-04-10 18:36:01 UTC,1LBBNap7kLswvgYbzmfLeskAfEMToiinkB
$ cat 2020
hash,block_timestamp,addresses
1b8fb81b9c4db4cf3659d2553e7c1d5a4dac21400e331ea3deecdfa45e2eb7d7,2020-05-08 13:43:38 UTC,32UNEwo4UtXrD8xjAVDGapBcWQ9B7HBQNb
daeac50f989f0d31bcc412ca47e2e082f7d0599d8e577a9a310f7ab4e9d474d2,2020-05-08 13:21:33 UTC,3BMEXPLQqB9rkR5SMhJdA4Xm98ntT5xuw8
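The header-preserving one-liner is fairly dense, so here is the same logic spelled out as a commented multi-line sketch. The scratch directory and the `.csv` suffix on the output names are assumptions for illustration; the answer itself writes to bare `2019` and `2020`.

```shell
#!/bin/sh
# Scratch directory for the demo (hypothetical location).
demo="${TMPDIR:-/tmp}/split_with_header_demo"
rm -rf "$demo" && mkdir -p "$demo" && cd "$demo" || exit 1

# Sample rows copied from the question.
cat > file <<'EOF'
hash,block_timestamp,addresses
1b8fb81b9c4db4cf3659d2553e7c1d5a4dac21400e331ea3deecdfa45e2eb7d7,2020-05-08 13:43:38 UTC,32UNEwo4UtXrD8xjAVDGapBcWQ9B7HBQNb
daeac50f989f0d31bcc412ca47e2e082f7d0599d8e577a9a310f7ab4e9d474d2,2020-05-08 13:21:33 UTC,3BMEXPLQqB9rkR5SMhJdA4Xm98ntT5xuw8
56777decb012d60f36f9cd4b9acfe13215f670bbe192f261db21e64f98e212be,2019-05-08 13:39:39 UTC,1AMtkH4riMpxSe7YMbs6h2aaDXVdxnmMFy
f5a1d52f013f1ee49a6cad971a5782c1c9905030d35ac28e23a2113fd1941421,2019-04-10 18:36:01 UTC,1LBBNap7kLswvgYbzmfLeskAfEMToiinkB
EOF

awk -F, '
NR == 1 { hdr = $0; next }      # remember the header, do not route it
{
    out = ($2 + 0) ".csv"       # "2020-05-08 ..." + 0 evaluates to 2020
    if (!seen[out]++)           # first row seen for this year:
        print hdr > out         #   start the file with the header
    print > out                 # then write the data row itself
}' file

# Show the first line of each output file.
head -n 1 2019.csv 2020.csv
```

Because the output name is derived from the timestamp itself, the same script should handle additional years without changes.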