过滤掉列中为零的值

过滤掉列中为零的值

我有以下数据集:

071,36,035,08422,46217,00000,1,A,Broadalbin village (pt.),New York,18,46,46,46,45,45,44,46,44,46,43,43
071,36,045,29322,56209,00000,1,A,Glen Park village (pt.),New York,42,42,42,42,43,42,44,43,42,42,41,40
162,36,000,64881,00000,00000,0,A,Saltaire village,New York,37,41,41,41,41,41,41,41,41,41,41,41
157,36,103,64881,00000,00000,0,A,Saltaire village,New York,37,41,41,41,41,41,41,41,41,41,41,41
071,36,103,64881,38000,00000,1,A,Saltaire village,New York,37,41,41,41,41,41,41,41,41,41,41,41
061,36,013,00000,13035,00000,1,F,Cattaraugus Reservation,New York,38,38,38,38,37,37,39,39,38,38,38,40
061,36,029,00000,75011,00000,1,F,Tonawanda Reservation,New York,34,34,34,34,34,34,34,37,37,37,37,36
061,36,009,00000,60950,00000,1,A,Red House town,New York,38,33,33,33,34,34,35,35,34,34,35,35
071,36,065,61115,75280,00000,1,A,Remsen village (pt.),New York,28,28,28,28,28,30,28,27,27,29,29,29
071,36,023,35276,18421,00000,1,A,Homer village (pt.),New York,31,31,31,31,31,30,30,30,30,29,29,29
071,36,099,55816,63440,00000,1,A,Ovid village (pt.),New York,19,27,27,27,27,29,29,26,26,26,26,25
071,36,059,22260,56000,00000,1,A,East Hills village (pt.),New York,23,23,23,23,23,23,23,23,23,23,23,23
071,36,059,47636,34000,00000,1,A,Mineola village (pt.),New York,16,22,22,22,22,22,22,22,22,22,22,22
071,36,043,37275,27199,00000,1,A,Ilion village (pt.),New York,22,23,23,23,23,23,22,22,22,22,22,22
071,36,029,82084,15011,00000,1,A,Williamsville village (pt.),New York,18,13,13,13,13,13,13,13,13,13,13,13
162,36,000,20379,00000,00000,0,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
157,36,103,20379,00000,00000,0,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
071,36,103,20379,66839,00000,1,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
071,36,071,26319,29553,00000,1,A,Florida village (pt.),New York,4,5,5,5,5,5,5,5,5,5,5,5
071,36,099,78553,66333,00000,1,A,Waterloo village (pt.),New York,4,4,4,4,4,4,4,4,4,4,4,4
071,36,057,26924,12122,00000,1,A,Fort Plain village (pt.),New York,3,4,4,4,4,4,4,4,4,4,4,4
071,36,053,31709,23305,00000,1,A,Hamilton village (pt.),New York,5,3,3,3,3,3,3,3,3,3,3,3
061,36,003,00000,54523,00000,1,F,Oil Springs Reservation,New York,1,1,1,1,1,1,1,1,1,1,1,1
157,36,099,28640,00000,00000,0,A,Geneva city (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,099,28640,28640,00000,1,A,Geneva city (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,089,46019,51330,00000,1,A,Massena village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,071,15297,29553,00000,1,A,Chester village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,059,28178,53000,00000,1,A,Garden City village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,053,31709,44435,00000,1,A,Hamilton village (pt.),New York,2,0,0,0,0,0,0,0,0,0,0,0
071,36,051,19664,70101,00000,1,A,Dansville village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,039,36167,38638,00000,1,A,Hunter village (pt.),New York,A,0,0,0,0,0,0,0,0,0,0,0
071,36,037,18201,19719,00000,1,A,Corfu village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
061,36,099,00000,28640,00000,0,F,Geneva city,New York,0,0,0,0,0,0,0,0,0,0,0,0
061,36,063,00000,75011,00000,1,F,Tonawanda Reservation,New York,0,0,0,0,0,0,0,0,0,0,0,0
061,36,013,00000,14072,00000,1,S,Chautauqua Lake UT,New York,0,0,0,0,0,0,0,0,0,0,0,0
061,36,009,00000,54523,00000,1,F,Oil Springs Reservation,New York,0,0,0,0,0,0,0,0,0,0,0,0

我想打印出最后一列中数字最小且不为零的 10 行,因此:

071,36,043,37275,27199,00000,1,A,Ilion village (pt.),New York,22,23,23,23,23,23,22,22,22,22,22,22
071,36,029,82084,15011,00000,1,A,Williamsville village (pt.),New York,18,13,13,13,13,13,13,13,13,13,13,13
162,36,000,20379,00000,00000,0,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
157,36,103,20379,00000,00000,0,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
071,36,103,20379,66839,00000,1,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
071,36,071,26319,29553,00000,1,A,Florida village (pt.),New York,4,5,5,5,5,5,5,5,5,5,5,5
071,36,099,78553,66333,00000,1,A,Waterloo village (pt.),New York,4,4,4,4,4,4,4,4,4,4,4,4
071,36,057,26924,12122,00000,1,A,Fort Plain village (pt.),New York,3,4,4,4,4,4,4,4,4,4,4,4
071,36,053,31709,23305,00000,1,A,Hamilton village (pt.),New York,5,3,3,3,3,3,3,3,3,3,3,3
061,36,003,00000,54523,00000,1,F,Oil Springs Reservation,New York,1,1,1,1,1,1,1,1,1,1,1,1

我到目前为止使用的命令是:

cat [filename] | grep 'New York' | sort -n -r -t, -k20 

但我不知道如何做剩下的事情,我确信我需要 grep 来完成

答案1

awk -F, '$10 == "New York" && $NF > 0' input | sort -t, -rnk22 | tail -10

awk 将仅打印第 10 列等于New York且最后一列大于 0 的行。

sort 将以数字方式对第 22 列进行反向排序。

tail -10 将仅打印最后 10 行。

答案2

使用米勒:

$ mlr --csv -N filter '$10 == "New York" && $22 > 0' then sort -nr 22 then tail file

使用gawk

$ awk -F, -v var=10 
'$10 == "New York" && $NF > 0 {ar[$0]=$NF}
END{asorti(ar, b, "@val_num_asc"); 
for(i=var;i>=1;i--)print b[i]}' file

$ csvsql -I -H  --query 'SELECT * FROM file WHERE j == "New York" AND v > 0 ORDER BY CAST(v AS INTEGER) DESC' file | tail

-H由于无头文件而使用。这将创建默认标头(a、b、c..)。

$ csvsql -I -H --query 'SELECT * FROM file WHERE j == "New York" AND v > 0 ORDER BY CAST(v AS INTEGER) DESC LIMIT 10 OFFSET ( SELECT count(*) FROM file WHERE j == "New York" AND v > 0 ) - 10' file | csvformat -K1

相关内容