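I have a column that I want to subset to particular values (say >= 2) and then divide the number of matching entries by the total number of entries in the column. How can I do this?
Example with the subset >= 2:
Input: a column like this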
1
1
1
1
2
2
Output:
2/6=0.33333
I tried using awk with something like:
awk '($1 > 2) / $1' myfile
But that doesn't work.
Answer 1
Your example contains no values > 2, so I assume you mean >= 2.
awk '$1 >= 2 { t++ } END { print t/NR }' myfile
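This walks through every value in the first column and increments the variable t whenever the value is greater than or equal to 2. At the end, t is divided by the total number of records (lines) and the result is printed.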
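As a quick check, the same counting logic can be run against the six sample values from the question. This is only a sketch: the helper name ratio_ge and the file name sample.txt are made up for illustration, and the NR > 0 test is an added guard so that an empty input does not trigger a division by zero.

```shell
# ratio_ge is a hypothetical helper: ratio_ge THRESHOLD FILE
ratio_ge() {
    awk -v min="$1" '
        $1 >= min { t++ }                 # count lines whose first field is >= min
        END { if (NR > 0) print t / NR }  # skip the division when the input is empty
    ' "$2"
}

printf '%s\n' 1 1 1 1 2 2 > sample.txt   # the six values from the question
ratio_ge 2 sample.txt                    # prints 0.333333
```

Passing the threshold with -v keeps the awk program itself unchanged when a cutoff other than 2 is needed.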
If you want it to print the equation literally, you can do this:
awk '$1 >= 2 { t ++ } END { print t"/"NR"="t/NR }' myfile
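If the number of decimals matters, printf gives explicit control over the format. The following is a sketch of one way to reproduce the 2/6=0.33333 layout from the question; the function name print_ratio_equation and the file sample.txt are assumptions made for the example. Using %d also prints a 0 count when no line matches, where plain string concatenation would print an empty string for the unset t.

```shell
# print_ratio_equation is a hypothetical helper: print_ratio_equation FILE
print_ratio_equation() {
    awk '
        $1 >= 2 { t++ }   # count lines whose first field is >= 2
        END {
            # %d prints 0 for an unset t; %.5f fixes five decimals
            if (NR > 0) printf "%d/%d=%.5f\n", t, NR, t / NR
        }
    ' "$1"
}

printf '%s\n' 1 1 1 1 2 2 > sample.txt   # the six values from the question
print_ratio_equation sample.txt          # prints 2/6=0.33333
```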
Answer 2
We can use the dc utility to perform the calculation:
$ < myfile tr -s ' ' '\t' | cut -f1 |
dc -e "
[lM lN / p q]sq
[lM 1 + sM]sa
[? z0=q lN 1 + sN d2!>a c z0=?]s?
4k 0sN l?x
"
Result:
.3333
Brief explanation:
° Register `N` holds the line count.
° Register `M` holds the number of lines >= 2.
° Register `q` performs the division, prints the result, and quits, much like the `END` clause of `awk`.
° Register `a` increments the value stored in register `M`.
° Register `?` reads the next line from stdin and checks whether it is empty. If it is, it starts the end procedure by invoking the `q` register. Otherwise, it increments register `N`, which keeps the line count, then checks whether the current line is greater than or equal to 2 and increments register `M` if it is. Finally, it calls itself recursively to repeat the same operations on the next line.
° `4k` sets the output precision to four digits, `0sN` initializes the line counter, and `l?x` starts the recursion.
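For readers less familiar with dc, the same control flow can be sketched as a plain shell loop. This is an illustrative rewrite, not the dc program itself: the function name count_ratio is made up, it assumes the first field of every line is an integer, and it adds a guard for empty input. The variables N and M play the role of the registers with the same names.

```shell
# count_ratio is a hypothetical helper that reads the column from stdin
count_ratio() {
    N=0   # line count, like register N
    M=0   # number of lines >= 2, like register M
    while read -r value _; do          # read the next line, keep the first field
        [ -n "$value" ] || break       # an empty line ends the loop
        N=$((N + 1))
        if [ "$value" -ge 2 ]; then    # integer comparison only
            M=$((M + 1))
        fi
    done
    [ "$N" -gt 0 ] || return 1         # nothing to divide by
    # fixed-point division truncated to four decimals
    printf '%d.%04d\n' "$((M / N))" "$((M * 10000 / N % 10000))"
}

printf '%s\n' 1 1 1 1 2 2 > sample.txt   # the six values from the question
count_ratio < sample.txt                 # prints 0.3333
```

Unlike dc, this version prints a leading zero before the decimal point.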