Chr Position1 Position2 S1 S2 S3 S4 S9 S11 S14 S15 S16 S17 S18 S19 S28 S29 S30 S33 S34 S35 S36 S37 S38
Aradu.A01 100145549 100145556 AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA
Aradu.A01 100246832 100246837 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100246837 100246846 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100345681 100345688 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100408092 100408119 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100408119 100408137 0 0 0 TA TA TA TA TA TA TA TA TA TA TA TA TA TA TA TA 0
Aradu.A01 100425855 100425856 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100425856 100425857 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100431071 100431075 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 10051925 10051981 GT GT 0 0 GT 0 0 0 0 0 0 GT GT GT 0 0 0 GT GT 0
Aradu.A01 100616716 100616718 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100616718 100616750 0 0 0 0 0 0 CT 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100757232 100757233 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 100761215 100761271 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Aradu.A01 10078917 10078920 CT 0 0 CC 0 CC CC CC 0 CC CC CC 0 CC 0 0 0 0 CT 0
Here is what I tried:
awk '{ for (i=4; i<=NF; i++){if ($i=0) {count++}} print $0"\t"count}' input_file|less -S
Answer 1
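What you have is close, but it prints every record followed by a count that is never reset between records, which is not what you asked for. Note also that $i=0 assigns 0 to the field instead of testing it; the comparison is $i == 0.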
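awk '{ zeroes=0; for( i=4; i<=NF; i++ ) { if( $i == 0 ) {zeroes++} } if( zeroes / (NF-3) < 0.2 ) { print $0 } }' /path/to/input
A bit more readable:
{
    zeroes = 0                      # reset the counter for each record
    for( i=4; i<=NF; i++ ) {        # sample columns run from field 4 through NF
        if( $i == 0 ) {
            zeroes++
        }
    }
    # NF-3 is the number of sample columns; keep rows with under 20% zeroes
    if( zeroes/(NF-3) < 0.2 ) {
        print $0
    }
}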
The logic should be fairly self-explanatory.
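If the 20% cutoff needs to vary, the same filter can be wrapped so that the threshold is passed in. The following is only a sketch under a few assumptions: the function name filter_low_zero and the variable max_frac are made up for illustration, input is read from stdin, and lines with three or fewer fields are dropped to avoid dividing by zero. A header line passes through on its own, because none of its sample names compare equal to 0.

```shell
# Hypothetical helper (not from the original answer): keep rows whose
# fraction of 0 entries in the sample columns is below a given threshold.
# Usage: filter_low_zero [max_fraction] < input_file
filter_low_zero() {
    awk -v max_frac="${1:-0.2}" '
    NF > 3 {                              # skip lines with no sample columns
        zeroes = 0                        # reset the counter for each record
        for (i = 4; i <= NF; i++)         # sample columns: field 4 through NF
            if ($i == 0) zeroes++
        if (zeroes / (NF - 3) < max_frac) # NF-3 = number of sample columns
            print
    }'
}
```

For example, filter_low_zero 0.5 < input_file would keep rows in which fewer than half of the samples are 0; with no argument the cutoff defaults to 0.2, matching the one-liner above.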