Calculating the mean and standard deviation of a column

2024-5-31

My file contains the following lines:

echo "Start 2A25.20080401.59125.7.HDF 2831 3230"
echo "dimensions 9248 49"
echo "New Cell"
grep "3065,46" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 28.09 17.2412 78.2198 210 1.83619 6 6
grep "3066,46" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 42.31 17.2616 78.252 210 9.86289
grep "3066,47" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 30.94 17.3031 78.2253 210 2.13253
grep "3067,46" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 31.67 17.2821 78.2842 210 2.93917
echo "New Cell"
grep "3067,17" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 35.32 16.1507 78.9842 210 2.19602 7 7
grep "3067,18" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 35.69 16.1895 78.961 210 6.56008
grep "3067,19" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 33.49 16.2281 78.9379 210 5.46735
grep "3068,16" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 27.31 16.1322 79.0394 210 2.16296
grep "3068,17" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 42.16 16.1711 79.0163 200 4.16615
grep "3068,18" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 48.1 16.2099 78.9931 210 24.642
grep "3068,19" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 49.15 16.2485 78.97 210 29.2187
grep "3069,17" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 33.98 16.1914 79.0484 210 3.68008
grep "3069,18" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 35.04 16.2302 79.0252 210 4.7225
grep "3069,19" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 34.08 16.2688 79.0021 210 6.04774
echo "New Cell"
grep "3069,09" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 31.43 15.8757 79.2349 210 3.33878 8 8
grep "3070,09" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 22.61 15.896 79.2669 292 1.05899
echo "New Cell"
grep "3071,15" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 37.63 16.154 79.159 210 1.20265 9 9
grep "3071,16" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 38.84 16.1932 79.1357 210 7.35424
grep "3072,14" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 24.34 16.1352 79.2142 210 0.616139
grep "3072,15" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 43.13 16.1743 79.1911 210 7.59137
grep "3072,16" ../TextFilesDir/out.2A25.20080401.59125.7.HDF.txt.text = 40.94 16.2135 79.1678 210 14.7196

The file contains multiple sequences of echo "New Cell", Start, and dimensions statements.

I need to take the average of the 10th column, i.e. 1.823, 9.36, etc. I tried

awk '{ ave+= $10; n=n+1} END { if (n > 0) print ave / n; }'

This sums column 10 over the whole file and gives a single value.

But I want the average of the lines between each pair of echo "New Cell" lines.

The expected output should be:

echo "New Cell" Count = 4, average = 3.8 (for example, there are 4 lines between two echo "New Cell" lines, and the average of column 10 over those 4 lines is taken)

echo "New Cell" Count = 10, average = 6 (for example, there are 10 lines between two echo "New Cell" lines, and the average of column 10 over those 10 lines is taken)

And so on. How can I add an if statement in awk to do this?

Answer 1

Awk solution:

awk '/echo "New Cell"/{    # on encountering line with `echo "New Cell"` 
         if (sum) {        # if calculated sum exists
             c = NR - n;   # get number of processed records
             printf "%s Count = %d, average = %.2f\n", $0, c, sum/c;
             sum = 0
         } 
         n = NR + 1        # get the initial record position/number
     }
     n{ sum += $9 }' file

Output:

echo "New Cell" Count = 4, average = 4.19
echo "New Cell" Count = 10, average = 8.89
echo "New Cell" Count = 2, average = 2.20
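
A possible variant, offered as a sketch rather than as the answerer's method. The script above prints a cell only when the next echo "New Cell" line arrives, so the last cell in the file is never reported, and any Start or dimensions lines that fall inside a block would be counted as data rows. The version below counts only lines beginning with grep, flushes the final cell in an END block, and also prints a standard deviation, since the title asks for one. With whitespace splitting the value lands in $9, as in the script above. The function name cell_stats, the extra stddev field, the population formula (dividing by the count), and the rule that any other echo line closes the current cell are all assumptions made for illustration.

```shell
#!/bin/sh
# cell_stats (hypothetical helper): per-cell count, mean and population
# standard deviation of field 9, read from the given files or from stdin.
cell_stats() {
    awk '
    # Print the statistics of the cell collected so far, then reset.
    function report() {
        if (count > 0) {
            mean = sum / count
            var = sumsq / count - mean * mean   # population variance (assumption)
            if (var < 0) var = 0                # guard against rounding noise
            printf "echo \"New Cell\" Count = %d, average = %.2f, stddev = %.2f\n",
                   count, mean, sqrt(var)
        }
        count = 0; sum = 0; sumsq = 0
    }
    /echo "New Cell"/  { report(); in_cell = 1; next }   # a new cell starts
    /^echo/            { report(); in_cell = 0; next }   # Start or dimensions line closes the cell
    in_cell && /^grep/ { count++; sum += $9; sumsq += $9 * $9 }
    END                { report() }                      # flush the last cell
    ' "$@"
}
```

Called as cell_stats file on the sample above, this should give the same three counts and averages and add a fourth line for the final cell, which has Count = 5. If the sample standard deviation is wanted instead, the variance would need to be divided by count - 1, with a guard for single-line cells.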
