Print multiple values from a .CSV file using a bash script

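My goal is to print several values computed from a .csv file. I am looking for a way to do this as quickly as possible, with the shortest possible script run time.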

For example, I have a file named "test.csv" that contains the following values:

0,1673466134,875601111928832,3336977422,22610058C2740,2020-06-03,19:00:01,103,456123489478512
0,6987507655,226102200333225,2312147777,226102E1858F0,2020-06-02,19:00:04,102,112323548998726
0,7891328975,250423212127644,7421354899,22610058C5350,2020-06-01,19:00:00,103,123123489784238
1,1324654889,784502311776287,4778994563,22610058C351E,2020-06-09,19:00:01,102,489123478941324
0,1231324474,247122410577385,1232498779,22610058C53A0,2020-06-07,19:00:00,104,123498715234789
1,4471222598,226912478523771,4123487987,226102C242C40,2020-06-04,19:00:00,103,789123418971354

I need to print the following values:

e.g. to count all the values in the first column that are "1", I would do:

cat test.csv | awk -F ','  '{print $1}' | awk '/^1/' | wc -l

e.g. to sum all the values in column 8 where column 1 = 1:

cat test.csv | awk -F ','  '{print $1,$8}' | awk '/^1/' | awk '{sum+=$2} END {print sum}'

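And the list goes on. I have about 11 commands to run, all similar to the ones above. My goal is to put all of them in one script file and execute them as quickly as possible.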

I wrote a script that looks like this:

#!/bin/bash
while IFS=, read col_1 col_2 col_3 col_4 col_5 col_6 col_7 col_8 col_9
do
        echo "No of lines containing 0 on the 1st column: "
           awk -F ','  '{print $1}' | awk '/^0/' | wc -l
        echo "No of lines containing 1 on the 1st column:"
           awk -F ','  '{print $1}' | awk '/^1/' | wc -l
done < test.csv

The problem is that after the first command runs, the second command prints "0" no matter what I do.

Can someone help me with this? Thank you!

Answer 1

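Well, first of all, you don't want to do this. awk is orders of magnitude faster than the shell, so turning an awk script into a shell script gains you nothing! Forget the shell and just do everything in awk. Save this file as foo.awk: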

#!/bin/awk -f
BEGIN{
  FS=","
}
{
  if($1~/^0/){zeros++}
  if($1~/^1/){ones++}
}
END{
  printf "No of lines containing 0 on the 1st column: %d\n", zeros;
  printf "No of lines containing 1 on the 1st column: %d\n", ones;
}

Make the file executable with chmod a+x foo.awk, then run it:

/path/to/foo.awk /path/to/test.csv

If I run it on your example data, I get:

$ foo.awk test.csv 
No of lines containing 0 on the 1st column: 4
No of lines containing 1 on the 1st column: 2
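
One caveat: the #!/bin/awk -f shebang only works where awk is actually installed at /bin/awk, and on some systems it lives elsewhere (for example /usr/bin/awk). A minimal sketch that sidesteps this by calling awk -f explicitly is shown here. The scratch directory and the variable names workdir and result are assumptions made for illustration, and the program is the same logic as foo.awk written with pattern-action rules.

```shell
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# The six sample rows from the question.
cat > "$workdir/test.csv" <<'EOF'
0,1673466134,875601111928832,3336977422,22610058C2740,2020-06-03,19:00:01,103,456123489478512
0,6987507655,226102200333225,2312147777,226102E1858F0,2020-06-02,19:00:04,102,112323548998726
0,7891328975,250423212127644,7421354899,22610058C5350,2020-06-01,19:00:00,103,123123489784238
1,1324654889,784502311776287,4778994563,22610058C351E,2020-06-09,19:00:01,102,489123478941324
0,1231324474,247122410577385,1232498779,22610058C53A0,2020-06-07,19:00:00,104,123498715234789
1,4471222598,226912478523771,4123487987,226102C242C40,2020-06-04,19:00:00,103,789123418971354
EOF

# Same counting logic as foo.awk, without the shebang line.
cat > "$workdir/foo.awk" <<'EOF'
BEGIN { FS = "," }
$1 ~ /^0/ { zeros++ }
$1 ~ /^1/ { ones++ }
END {
  printf "No of lines containing 0 on the 1st column: %d\n", zeros
  printf "No of lines containing 1 on the 1st column: %d\n", ones
}
EOF

# awk is found through PATH here, so its install location does not matter.
result=$(awk -f "$workdir/foo.awk" "$workdir/test.csv")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Run this way, the script file does not need to be executable either.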

To include the command from your second example as well, do this:

#!/bin/awk -f
BEGIN{
  FS=","
}
{
  if($1~/^0/){zeros++}
  if($1~/^1/){ones++; sum8+=$8}
}
END{
  printf "No of lines containing 0 on the 1st column: %d\n", zeros;
  printf "No of lines containing 1 on the 1st column: %d\n", ones;
  printf "Sum of all 8th fields where the 1st field starts with 1: %d\n", sum8
}
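
Note that /^0/ and /^1/ match any first field that begins with that digit, so a value such as 10 would be counted together with 1. If the first column should be compared exactly, a sketch along these lines may be closer to what is wanted. The helper name summarize, the compact "zeros ones sum8" output format, and the extra row whose first field is 10 are all made up for illustration.

```shell
# Hypothetical helper: reads CSV on stdin and prints "zeros ones sum8".
summarize() {
  awk -F',' '
    $1 == "0" { zeros++ }               # exact string match, not "starts with 0"
    $1 == "1" { ones++; sum8 += $8 }    # exact string match, not "starts with 1"
    END { printf "%d %d %d\n", zeros, ones, sum8 }
  '
}

# The six sample rows from the question, plus one invented row starting with 10.
exact=$(summarize <<'EOF'
0,1673466134,875601111928832,3336977422,22610058C2740,2020-06-03,19:00:01,103,456123489478512
0,6987507655,226102200333225,2312147777,226102E1858F0,2020-06-02,19:00:04,102,112323548998726
0,7891328975,250423212127644,7421354899,22610058C5350,2020-06-01,19:00:00,103,123123489784238
1,1324654889,784502311776287,4778994563,22610058C351E,2020-06-09,19:00:01,102,489123478941324
0,1231324474,247122410577385,1232498779,22610058C53A0,2020-06-07,19:00:00,104,123498715234789
1,4471222598,226912478523771,4123487987,226102C242C40,2020-06-04,19:00:00,103,789123418971354
10,1111111111,111111111111111,1111111111,1111111111111,2020-06-05,19:00:00,999,111111111111111
EOF
)
echo "$exact"
```

With the exact comparison the invented row is ignored, whereas the /^1/ pattern would have counted it and added its 999 to the sum.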

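If for some reason you must use a shell script, have the shell script run awk and nothing else. Don't try to split the input in the shell: it is complicated and very slow. Something like this is much better: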

#!/bin/bash
awk -F"," '($1~/^0/){zeros++}
           ($1~/^1/){ones++}
           END{ 
                printf "No of lines containing 0 on the 1st column: %d\n", zeros;
                printf "No of lines containing 1 on the 1st column: %d\n", ones;
           }' "$1"
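
As for why the second command in the question's loop prints 0: read and both awk commands share the loop's standard input. read takes the first row, the first awk (which was given no file name) then reads everything that is left, and the second awk finds the input already at end of file. The sketch below reproduces this on a made-up five-row file; the names workdir, demo.csv and loop_output are assumptions for illustration.

```shell
# Hypothetical scratch directory and a made-up five-row CSV file.
workdir=$(mktemp -d)
printf '%s\n' 0,a 0,b 1,c 0,d 1,e > "$workdir/demo.csv"

loop_output=$(
  while IFS=, read -r first rest
  do
    # read has already consumed row 1; this awk drains rows 2-5 from stdin.
    awk -F',' '$1 ~ /^0/' | wc -l
    # stdin is now at end of file, so this awk sees no rows at all.
    awk -F',' '$1 ~ /^1/' | wc -l
  done < "$workdir/demo.csv" | tr -d ' '   # strip the padding some wc versions add
)
printf '%s\n' "$loop_output"

rm -rf "$workdir"
```

The first count covers only rows 2 to 5 (two of which start with 0), the second count is 0, and the loop then ends after a single iteration because read has nothing left to read.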

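Finally, if you really want to keep these as separate commands, you can do something like the following, but it will be much slower because it has to read the file multiple times: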

#!/bin/bash

echo "No of lines containing 0 on the 1st column: "
awk -F ','  '{print $1}' "$1" | awk '/^0/' | wc -l
echo "No of lines containing 1 on the 1st column:"
awk -F ','  '{print $1}' "$1" | awk '/^1/' | wc -l
echo "Sum of all the 8th columns where the 1st column starts with 1:"
awk -F ','  '/^1/{sum+=$8} END {print sum}' "$1"

You can then make the file executable (chmod a+x /path/to/foo.sh) and run it like this:

/path/to/foo.sh /path/to/test.csv
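
Since there are about 11 such statistics to collect, the single-pass approach may scale better if the counters are kept in awk arrays keyed by the value of the first column, rather than in one variable per value. This is only a sketch of that idea, not something from the question: the function name per_value_stats and the "value count sum8" output format are assumptions, and sort is added only because awk's for (key in array) order is unspecified.

```shell
# Hypothetical helper: for each distinct value of column 1, print
# "value count sum_of_column_8". Reads the named files, or stdin if none.
per_value_stats() {
  awk -F',' '
    { count[$1]++; sum8[$1] += $8 }     # one pass, one array slot per value
    END {
      for (key in count)
        printf "%s %d %d\n", key, count[key], sum8[key]
    }
  ' "$@" | sort
}

# The six sample rows from the question.
stats=$(per_value_stats <<'EOF'
0,1673466134,875601111928832,3336977422,22610058C2740,2020-06-03,19:00:01,103,456123489478512
0,6987507655,226102200333225,2312147777,226102E1858F0,2020-06-02,19:00:04,102,112323548998726
0,7891328975,250423212127644,7421354899,22610058C5350,2020-06-01,19:00:00,103,123123489784238
1,1324654889,784502311776287,4778994563,22610058C351E,2020-06-09,19:00:01,102,489123478941324
0,1231324474,247122410577385,1232498779,22610058C53A0,2020-06-07,19:00:00,104,123498715234789
1,4471222598,226912478523771,4123487987,226102C242C40,2020-06-04,19:00:00,103,789123418971354
EOF
)
printf '%s\n' "$stats"
```

On the sample data this gives one line for the value 0 (4 rows, sum 412) and one for the value 1 (2 rows, sum 205), still reading the file only once.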
