使用 Bash / awk 填充 csv 文件第一列中的空格

使用 Bash / awk 填充 csv 文件第一列中的空格

我正在考虑自动化一些流程/计算,但我可能首先需要格式化一个有点尴尬的CSV文件集。 (为此bash,我根据要求使用 )。

csv 文件集(大致)遵循以下格式

CODE,Sitting,Jan,Feb,Mar,Apr,May,Jun,Jul,Totals
CLLK_J9,First Sitting,,,2,5,2,,,10
,Second Sitting,,,,,,,,1
RTHM_A8,First Sitting,,,1,,3,,,6
,Second Sitting,,,,,1,,,1
FFBJ_FA9,First Sitting,,,,8,6,,,25
,Second Sitting,,,,,11,,,12
UUYIOR_HJ9,First Sitting,,,1,3,6,,,17
IKRO_Lk8,First Sitting,,,,3,3,,,37
,Second Sitting,,,,6,11,,,34

在此输入图像描述

我试图用上一行中的CODE字段内容填充该列中的空字段(通常这些空字段出现在第 2 列中的“第二次坐”实例旁边)。因此,对于上面的例子,结果应该是这样的

CODE,Sitting,Jan,Feb,Mar,Apr,May,Jun,Jul,Totals
CLLK_J9,First Sitting,,,2,5,2,,,10
CLLK_J9,Second Sitting,,,,,,,,1
etc.

我开始阅读一些awk文档,因为它似乎是完成此任务的相当强大的实用程序 - 但尚未取得任何进展。想法?

答案1

使用米勒(https://github.com/johnkerl/miller)非常简单。跑步

mlr --csv fill-down -f CODE input.csv >output.csv

你将会拥有

+------------+----------------+-----+-----+-----+-----+-----+-----+-----+--------+
| CODE       | Sitting        | Jan | Feb | Mar | Apr | May | Jun | Jul | Totals |
+------------+----------------+-----+-----+-----+-----+-----+-----+-----+--------+
| CLLK_J9    | First Sitting  | -   | -   | 2   | 5   | 2   | -   | -   | 10     |
| CLLK_J9    | Second Sitting | -   | -   | -   | -   | -   | -   | -   | 1      |
| RTHM_A8    | First Sitting  | -   | -   | 1   | -   | 3   | -   | -   | 6      |
| RTHM_A8    | Second Sitting | -   | -   | -   | -   | 1   | -   | -   | 1      |
| FFBJ_FA9   | First Sitting  | -   | -   | -   | 8   | 6   | -   | -   | 25     |
| FFBJ_FA9   | Second Sitting | -   | -   | -   | -   | 11  | -   | -   | 12     |
| UUYIOR_HJ9 | First Sitting  | -   | -   | 1   | 3   | 6   | -   | -   | 17     |
| IKRO_Lk8   | First Sitting  | -   | -   | -   | 3   | 3   | -   | -   | 37     |
| IKRO_Lk8   | Second Sitting | -   | -   | -   | 6   | 11  | -   | -   | 34     |
+------------+----------------+-----+-----+-----+-----+-----+-----+-----+--------+

答案2

如果您想尝试使用awk,则以下内容应该有效:

awk -F',' -v OFS=',' 'FNR>1{if ($1!="") last=$1; else $1=last}1' input.csv

这会将输入和输出字段分隔符设置为,

然后,对于每一行第一行之后( FNR>1),它将检查第一列 ( $1) 是否为空。如果不是,该值将存储在变量中last以供以后使用。如果为空,则用之前存储的值填充。

1规则块的外部指示{ ... }打印awk当前行,包括所做的所有修改。请注意,除非true指定此命令(或实际上评估为 的任何布尔条件),或者在规则块内给出显式print/命令,否则不会打印当前行。printfawk

结果:

CODE,Sitting,Jan,Feb,Mar,Apr,May,Jun,Jul,Totals
CLLK_J9,First Sitting,,,2,5,2,,,10
CLLK_J9,Second Sitting,,,,,,,,1
RTHM_A8,First Sitting,,,1,,3,,,6
RTHM_A8,Second Sitting,,,,,1,,,1
FFBJ_FA9,First Sitting,,,,8,6,,,25
FFBJ_FA9,Second Sitting,,,,,11,,,12
UUYIOR_HJ9,First Sitting,,,1,3,6,,,17
IKRO_Lk8,First Sitting,,,,3,3,,,37
IKRO_Lk8,Second Sitting,,,,6,11,,,34

答案3

不是 AWK 解决方案,但是...

$ cat yourfile.csv | csv-sqlite 'select code, (select i2.code from input i2 where i2.code != "" and i2.rowid <= i1.rowid order by i2.rowid desc limit 1) as new_CODE, Sitting, Jan, Feb, Mar, Apr, May, Jun, Jul, Totals from input i1' -s
CODE         new_CODE     Sitting          Jan   Feb   Mar   Apr   May   Jun   Jul   Totals
CLLK_J9      CLLK_J9      First Sitting                2     5     2                 10
             CLLK_J9      Second Sitting                                             1
RTHM_A8      RTHM_A8      First Sitting                1           3                 6
             RTHM_A8      Second Sitting                           1                 1
FFBJ_FA9     FFBJ_FA9     First Sitting                      8     6                 25
             FFBJ_FA9     Second Sitting                           11                12
UUYIOR_HJ9   UUYIOR_HJ9   First Sitting                1     3     6                 17
IKRO_Lk8     IKRO_Lk8     First Sitting                      3     3                 37
             IKRO_Lk8     Second Sitting                     6     11                34

(如果你删除“-s”,它将以 CSV 形式返回)

csv-sqlite 来自https://github.com/mslusarz/csv-nix-tools

查询被盗自这个答案

答案4

在每个 UNIX 机器上的任何 shell 中使用任何 awk:

$ awk 'BEGIN{FS=OFS=","} $1==""{$1=p} {p=$1} 1' file
CODE,Sitting,Jan,Feb,Mar,Apr,May,Jun,Jul,Totals
CLLK_J9,First Sitting,,,2,5,2,,,10
CLLK_J9,Second Sitting,,,,,,,,1
RTHM_A8,First Sitting,,,1,,3,,,6
RTHM_A8,Second Sitting,,,,,1,,,1
FFBJ_FA9,First Sitting,,,,8,6,,,25
FFBJ_FA9,Second Sitting,,,,,11,,,12
UUYIOR_HJ9,First Sitting,,,1,3,6,,,17
IKRO_Lk8,First Sitting,,,,3,3,,,37
IKRO_Lk8,Second Sitting,,,,6,11,,,34

相关内容