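Hello, I need to filter for lines with a percentage of 85% or higher.
I need to filter on the Forward: Score line, using the percentage at the end of that line.
Example file: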
Scores for this hit:
>ocu-miR-191-5p NC_013669.1 181.00 -24.77 2 22 123304956 123304978 20 85.00% 85.00%
Forward: Score: 181.000000 Q:2 to 22 R:190316850 to 190316872 Align Len (20) (85.00%) (85.00%)
Query: 3' gtCGACGAAAACCCTAAGGCAAc 5'
|||||| | ||| |||||||
Ref: 5' gtGCTGCTATAGGGTTTCCGTTc 3'
Energy: -24.910000 kCal/Mol
Scores for this hit:
>ocu-miR-191-5p NC_013669.1 175.00 -23.66 2 21 163478767 163478790 20 85.00% 85.00%
Forward: Score: 173.000000 Q:2 to 22 R:1340814 to 1340836 Align Len (20) (80.00%) (80.00%)
Query: 3' gtCGACGAAAACCCTAAGGCAAc 5'
||||| | ||| |||||||
Ref: 5' caGCTGCCTGCGGGCTTCCGTTa 3'
Energy: -27.510000 kCal/Mol
The result should look like this:
>ocu-miR-191-5p NC_013669.1 181.00 -24.77 2 22 123304956 123304978 20 85.00% 85.00%
Forward: Score: 181.000000 Q:2 to 22 R:190316850 to 190316872 Align Len (20) (85.00%) (85.00%)
Query: 3' gtCGACGAAAACCCTAAGGCAAc 5'
|||||| | ||| |||||||
Ref: 5' gtGCTGCTATAGGGTTTCCGTTc 3'
Energy: -24.910000 kCal/Mol
I tried these commands:
awk '{if($14>=85)print$_}' < output.txt
awk '$14 >= 85' output.txt
awk - F%) '$14 >= 85' output.txt
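I searched this site and found questions similar to mine, but they did not solve my problem. The commands I tried also matched the other line, because that line contains an 85% score as well. Can you help me? I have only just started using Ubuntu and do not know much about it...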
Answer 1
Try this:
awk 'BEGIN {RS=">"; FS=" "} {original_block=$0; gsub(/\(|\)|%/,""); if ($25 >= 85) print original_block}' example.txt > output.txt
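Explanation:
RS=">" treats the input as blocks, with a new block starting at each >.
FS=" " sets the field separator to whitespace (spaces, tabs and, inside a multi-line block, newlines) so that the fields are counted correctly across the lines of a block.
original_block=$0 saves the original block, including the parentheses () and percent signs %, for printing later.
gsub(/\(|\)|%/,"") removes (, ) and % so that the number can be compared.
if ($25 >= 85) checks whether the numeric value of the 25th field in the block, which is the last percentage on the Forward: line, is greater than or equal to 85.
print original_block prints the whole matching block in its original format.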
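
One thing to be aware of: because > is used as the record separator, the printed block no longer starts with >, and it also carries the "Scores for this hit:" line that belongs to the next hit. If the output has to match the expected result exactly, a line-by-line variant may be easier to follow. The following is only a sketch, assuming every hit has a header line starting with > followed by a line starting with Forward:, as in the example file. The function name filter_hits is made up for illustration.

```shell
# filter_hits FILE [MIN_PERCENT]
# Hypothetical helper: prints each hit (from the ">" header through the
# Energy line) whose Forward: line ends with a percentage >= MIN_PERCENT.
# MIN_PERCENT defaults to 85.
filter_hits() {
    awk -v min="${2:-85}" '
        # Drop the separator line and stop printing the previous hit.
        /^Scores for this hit:/ { keep = 0; next }

        # Remember the header; we do not know yet whether to print it.
        /^>/ { header = $0; keep = 0; next }

        # Decide based on the last field of the Forward: line, e.g. "(85.00%)".
        /^Forward:/ {
            pct = $NF
            gsub(/[()%]/, "", pct)      # "(85.00%)" -> "85.00"
            keep = (pct + 0 >= min)
            if (keep) { print header; print }
            next
        }

        # Query, alignment, Ref and Energy lines of a hit we are keeping.
        keep { print }
    ' "$1"
}
```

It could be used as filter_hits example.txt > output.txt, or with a different threshold as filter_hits example.txt 90. Testing only the Forward: line avoids the problem from the question, where the header line of the second hit also contains 85.00%.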