Subset a column and divide by its length

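I have a column that I want to subset by a particular condition (say, >= 2) and then divide the number of matching entries by the total number of entries in the column. How can I do this?

Example, subsetting for >= 2:

Input: a column like this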

1  
1    
1  
1  
2  
2  

Output:

2/6=0.33333  

I tried using awk with something like this:

awk '($1 > 2) / $1' myfile

But this does not work.

Answer 1

None of the values in your example is > 2, so I assume you meant >= 2.

awk '$1 >= 2 { t++ } END { print t/NR }' myfile

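This goes through every value in the first column; whenever the value is greater than or equal to 2, the variable `t` is incremented. At the end, `t` is divided by the total number of records (lines), `NR`, and the result is printed.

If you want it to print the equation literally, you could do this: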
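To make the counting step easy to try out, here is a minimal sketch that rebuilds the sample column from the question in a temporary file and runs the same awk logic on it. The temporary file, the `ratio` variable, the `NR > 0` guard against an empty file, and the `printf` format are assumptions added for illustration; they are not part of the original answer.

```shell
# Recreate the sample column from the question in a temporary file.
tmpfile=$(mktemp)
printf '%s\n' 1 1 1 1 2 2 > "$tmpfile"

# Count rows whose first field is >= 2, then divide by the total row count (NR).
# The NR > 0 guard (an addition in this sketch) avoids dividing by zero on an empty file.
ratio=$(awk '$1 >= 2 { t++ } END { if (NR > 0) printf "%.5f\n", t / NR }' "$tmpfile")

echo "$ratio"
rm -f "$tmpfile"
```

With the six sample rows, two of them match, so this prints 0.33333, the value the question asks for.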

awk '$1 >= 2 { t ++ } END { print t"/"NR"="t/NR }' myfile

Answer 2

We can use the `dc` utility to perform the calculation:

$ < myfile  tr -s ' ' '\t' | cut -f1 |
 dc -e "
   [lM lN / p q]sq
   [lM 1 + sM]sa
   [? z0=q lN 1 + sN d2!>a c z0=?]s? 
   4k 0sN l?x
 "

Result:

.3333

Brief explanation:

° Register `N` holds the line count.
° Register `M` holds the number of lines whose value is >= 2.
° Register `q` performs the division, prints the result, and quits, much like the `END` clause of `awk`.
° Register `a` increments the value stored in register `M`.
° Register `?` reads the next line from stdin and checks whether the stack is still empty, meaning nothing was read. If so, it starts the end procedure by invoking register `q`. Otherwise, it increments register `N`, which keeps the line count, then checks whether the current value is greater than or equal to 2 and, if it is, increments register `M` via `a`. Finally, it clears the stack and calls itself recursively to repeat the same steps on the next line.
° `4k` sets the precision to four decimal places, `0sN` initializes the line counter, and `l?x` starts the recursion.
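For readers less used to `dc`, the control flow described above can be sketched as an ordinary shell loop. This is an illustrative equivalent, not the author's method: the variables `N` and `M` merely mirror the registers of the same name, the temporary file is hypothetical, the comparison assumes integer values, and awk is used only for the final floating-point division.

```shell
# Recreate the sample column from the question in a temporary file.
tmpfile=$(mktemp)
printf '%s\n' 1 1 1 1 2 2 > "$tmpfile"

N=0   # mirrors dc register N: total lines read
M=0   # mirrors dc register M: lines whose value is >= 2

# Mirrors the `?` macro: read a line, stop at end of input, otherwise count it.
while read -r value _; do
  N=$((N + 1))
  # Mirrors `d2!>a`: increment M when the value is >= 2 (integers only here).
  if [ "$value" -ge 2 ]; then
    M=$((M + 1))
  fi
done < "$tmpfile"

# Mirrors the `q` macro: divide M by N and print with four decimal places.
result=$(awk -v m="$M" -v n="$N" 'BEGIN { if (n > 0) printf "%.4f\n", m / n }')

echo "$result"
rm -f "$tmpfile"
```

On the sample input this prints 0.3333; `dc` prints the same value without the leading zero.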
