How to delete lines where a specific pattern appears at a specific position


I have a file that looks like this:

PEBP1_HUMAN Homo sapiens    P30086  PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
                    PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
                    PDB; 2L7W; NMR; -; A=1-187.
                    PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens    P16284  PDB; 2KY5; NMR; -; A=686-738.
                    PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
                    PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN  Homo sapiens    Q9BRX2  PDB; 1X52; NMR; -; A=261-371.
                    PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
                    PDB; 5LZW; EM; 3.53 A; ii=1-385.
                    PDB; 5LZX; EM; 3.67 A; ii=1-385.
                    PDB; 5LZY; EM; 3.99 A; ii=1-385.
                    PDB; 5LZZ; EM; 3.47 A; ii=1-385.

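In this file I want to match every EM; that appears after PDB; (four letter code); in other words, the pattern PDB; (four letter code); EM;. That column can contain X-ray;, NMR; or EM;. Every line that has EM; there should be deleted. Is there a bash command that can match these entries and delete those lines?

Importantly, EM is preceded by a space, so the match should include that space, e.g. " EM;".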

The expected result is:

PEBP1_HUMAN Homo sapiens    P30086  PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
                    PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
                    PDB; 2L7W; NMR; -; A=1-187.
                    PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens    P16284  PDB; 2KY5; NMR; -; A=686-738.
                    PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
                    PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN  Homo sapiens    Q9BRX2  PDB; 1X52; NMR; -; A=261-371.
                    PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.

Answer 1

awk can do this:

awk '{if(!($1=="PDB;"&&$3=="EM;")){print}}' <yourfile

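This tests whether the first field of the current line (fields are separated by whitespace by default) is PDB; and whether the third field is EM;, and prints the line unless both conditions hold.

Output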

$ awk '{if(!($1=="PDB;"&&$3=="EM;")){print}}' <test
PEBP1_HUMAN Homo sapiens    P30086  PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
                    PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
                    PDB; 2L7W; NMR; -; A=1-187.
                    PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens    P16284  PDB; 2KY5; NMR; -; A=686-738.
                    PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
                    PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN  Homo sapiens    Q9BRX2  PDB; 1X52; NMR; -; A=261-371.
                    PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
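
The field test above only catches continuation lines, where PDB; is the first field; on the first line of each record PDB; is the fifth field. As a minimal sketch of a position-independent alternative, the filter below matches the text pattern itself ("PDB; ", any four characters, then "; EM;") with grep -v. The function name drop_em_lines and the sample data are hypothetical, and note that a match on a record's first line would also remove its identifier columns, which may not be what you want.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): read stdin and drop
# every line containing "PDB; <any four characters>; EM;".
drop_em_lines() {
    # -v inverts the match, so only non-matching lines are printed.
    # Each "." matches one arbitrary character of the four-letter PDB code.
    grep -v 'PDB; ....; EM;'
}

# Small demonstration on made-up sample data in the question's layout.
drop_em_lines <<'EOF'
PELO_HUMAN  Homo sapiens    Q9BRX2  PDB; 1X52; NMR; -; A=261-371.
                    PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
                    PDB; 5LZW; EM; 3.53 A; ii=1-385.
EOF
```

Because the pattern includes the space before EM;, it should not fire on longer tokens that merely end in EM;.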

Answer 2

You can do something like this, using perl's paragraph mode:

$ perl -F'\n' -00le 'print join "\n", grep { !/PDB; ....; EM;/ } @F' file
PEBP1_HUMAN Homo sapiens    P30086  PDB; 1BD9; X-ray; 2.05 A; A/B=1-187.
                    PDB; 1BEH; X-ray; 1.75 A; A/B=1-187.
                    PDB; 2L7W; NMR; -; A=1-187.
                    PDB; 2QYQ; X-ray; 1.95 A; A=1-187.

PECA1_HUMAN Homo sapiens    P16284  PDB; 2KY5; NMR; -; A=686-738.
                    PDB; 5C14; X-ray; 2.80 A; A/B=28-229.
                    PDB; 5GEM; X-ray; 3.01 A; A/B=28-232.

PELO_HUMAN  Homo sapiens    Q9BRX2  PDB; 1X52; NMR; -; A=261-371.
                    PDB; 5EO3; X-ray; 2.60 A; A/B=265-385.
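
For readers more comfortable with awk, the same paragraph-at-a-time idea can be sketched as follows. This is an illustration under assumptions, not a translation of the perl one-liner: it relies on awk's paragraph mode (RS set to the empty string, so blank lines separate records), the function name drop_em_by_paragraph is hypothetical, and the sample data is made up.

```shell
#!/bin/sh
# Hypothetical helper: read blank-line-separated records from stdin, drop the
# lines matching "PDB; ....; EM;" inside each record, and print what is left.
drop_em_by_paragraph() {
    awk '
    BEGIN { RS = ""; FS = "\n" }    # one record per paragraph, one field per line
    {
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i !~ /PDB; ....; EM;/)
                out = out (out == "" ? "" : "\n") $i
        # Skip records whose lines were all removed; otherwise print the
        # surviving lines followed by a blank separator line.
        if (out != "") print out "\n"
    }'
}

# Small demonstration on made-up sample data in the question's layout.
drop_em_by_paragraph <<'EOF'
PELO_HUMAN  Homo sapiens    Q9BRX2  PDB; 1X52; NMR; -; A=261-371.
                    PDB; 5LZW; EM; 3.53 A; ii=1-385.

PECA1_HUMAN Homo sapiens    P16284  PDB; 2KY5; NMR; -; A=686-738.
EOF
```

Working per record makes it easy to add record-level decisions later, such as dropping a whole entry when none of its structures survive.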
