Adding or appending a new column to a CSV file with a shell script

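I have a CSV file with several columns, including a start time and an end time. I need to compute the difference between the two times and add it as a new column.

I can compute the difference, but I can't get it appended correctly as a new column on each row.

Here is my sample CSV.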

4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1

Here is the script.

    #!/bin/bash
    cat a.csv | while read line
       do
              created_at=$(date -d $(echo $line | awk -F "," '{print $3}') +%s)
              merged_at=$(date -d $(echo $line | awk -F "," '{print $6}') +%s)
              echo $created_at $merged_at
              diff=$(( $merged_at - $created_at ))
              h=`expr $diff / 3600`
              m=`expr $diff  % 3600 / 60`
              s=`expr $diff % 60`
              diff=$(printf "%02d:%02d:%02d\n" $h $m $s)
              echo $diff
              awk -v v1="$diff" -F"," 'BEGIN { OFS = "," } {$9=v1; print}' a.csv >> b.csv
done

I get output like this.

4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:00:22
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:00:22
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:00:22
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:00:22
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:00:30
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:00:30
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:00:30
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:00:30
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:02:20
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:02:20
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:02:20
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:02:20
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:00:33
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:00:33
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:00:33
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:00:33

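That is, each difference gets appended to every row.

But I only want the time difference for the row it belongs to. The output should look like this.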

4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:00:22
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:00:30
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:02:20
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:00:33

Could someone please help me achieve this?

Answer 1

The last awk command inside the loop

    awk -v v1="$diff" -F"," 'BEGIN { OFS = "," } {$9=v1; print}' a.csv >> b.csv

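processes the entire file a.csv on every loop iteration and appends the whole result to b.csv each time. With four input rows, that writes 4 × 4 = 16 lines, which is exactly the output you are seeing.

Assuming your intent is to apply the command only to the current contents of the $line variable, in bash you can use a here-string: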

    awk -v v1="$diff" -F"," 'BEGIN { OFS = "," } {$9=v1; print}' <<<"$line" >> b.csv
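
For reference, here is a minimal sketch of how the whole loop could look once each row is handled on its own. It is an illustration under stated assumptions, not the only way to do it. It assumes GNU date (for `date -d`), fields that never contain quoted commas, and an end time that is not earlier than the start time. The function name `add_duration_column` and the `fields` array are made up for this example. Since the row is already in `$line`, the sketch appends the new column with printf instead of calling awk again, and it uses a here-string only to split the row.

```shell
#!/bin/bash
# Sketch: read CSV rows on stdin, write each row plus an HH:MM:SS column on stdout.
# Assumptions: GNU date, no quoted commas inside fields, column 6 >= column 3.
# add_duration_column is a hypothetical helper name.

add_duration_column() {
    local line created_at merged_at diff
    local -a fields
    # "|| [[ -n $line ]]" also handles a last row with no trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        # Split the current row on commas; columns 3 and 6 are indices 2 and 5.
        IFS=, read -r -a fields <<<"$line"
        created_at=$(date -d "${fields[2]}" +%s)
        merged_at=$(date -d "${fields[5]}" +%s)
        diff=$(( merged_at - created_at ))
        # Print the original row followed by the formatted difference.
        printf '%s,%02d:%02d:%02d\n' "$line" \
            $(( diff / 3600 )) $(( diff % 3600 / 60 )) $(( diff % 60 ))
    done
}

# Usage (not executed here): add_duration_column < a.csv > b.csv
```

Redirecting once around the whole loop, rather than appending with >> inside it, also means b.csv is rebuilt from scratch on every run instead of growing.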

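However, processing a CSV file line by line in a shell loop is generally not recommended. You may want to consider a utility that provides native date-time handling (Perl, Python, GNU Awk), or Miller, for example: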

    mlr --csvlite --implicit-csv-header --headerless-csv-output put -S '
      $9 = strftime(strptime($6,"%Y-%m-%dT%H:%M:%SZ") - strptime($3,"%Y-%m-%dT%H:%M:%SZ"),"%T")
    ' a.csv > b.csv

(Remove --implicit-csv-header --headerless-csv-output if your CSV file does have a header.)
