Delete the first line matching "pattern1" after the last occurrence of "pattern2"?

Answer 1

Here is an ex one-liner. (ex is the predecessor and the scriptable form of vi.)

printf '%s\n' '$?pattern2?/pattern1/d' x | ex file.txt

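x saves and exits. Change it to %p if you only want to print the changed file rather than save the changes (useful for testing).

$ refers to the last line of the file; ?pattern2? is an address meaning the first result of a backward search starting from the current position (here, the last line); /pattern1/ is a forward-search address; and d is the delete-line command.

Use ex when you need both forward and backward addressing.

You can do the same thing interactively in vi or Vim: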

vim file.txt

Then type

:$?pattern2?/pattern1/d

and press Enter.

Then save and quit by typing :x and pressing Enter.

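Answer 2

Here is a brute-force approach. Read the data and loop over it twice: the first loop finds the last occurrence of pattern2, and the second finds the first occurrence of pattern1 after it.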

#!/usr/bin/perl

# usage:  perl remove-pattern.pl [file]
use strict;

# reads the contents of the text file completely
# removes end of line character and spurious control-M's
sub load {
   my $file = shift;
   open my $in, "<", $file or die "unable to open $file : $!";
   my @file_contents = <$in>;
   foreach ( @file_contents ) { 
      chomp; 
      s/\cM//g; 
   }
   return @file_contents;
}

#  gets the first file from the command line
#  after the perl script
my $ifile = shift;

# read the text file
my @file_contents = &load($ifile);

# set 2 variables for the index into the array 
my $p2 = -1;
my $p1 = -1;

# loop through the file contents and find the last
# occurrence of pattern2 (alternatively, reverse the data
# and find the first occurrence of pattern2)
for( my $i = 0;$i < @file_contents; ++$i ) {
   if( $file_contents[$i] =~ /pattern2/) {
      $p2 = $i;
   }
}

# start on the line after the last pattern2
# and find the first pattern1
# (if pattern2 never matched, $p2 is -1 and the search starts at line 0)
for( my $i = $p2 + 1; $i < @file_contents; ++$i ) {
   if($file_contents[$i] =~ /pattern1/) {
     $p1 = $i ;
     last;
   }
}

# create an output file name
my $ofile = $ifile . ".filtered";

# open the output file for writing
open my $out, ">", $ofile or die "unable to open $ofile : $!"; 

# loop through the file contents and don't print the index if it matches
# p1.  print all others
for( my $i = 0;$i < @file_contents; ++$i ) {
   print $out "$file_contents[$i]\n" if ($i != $p1);
}


--- data.txt  ---
bla bla
pattern2
bla
pattern1
pattern2
bla
bla pattern1 bla
bla
pattern1

If the Perl script above is named "remove-pattern.pl", then given the input file data.txt it is run with the following command:

%> perl remove-pattern.pl data.txt

The resulting output file is "data.txt.filtered":

--- data.txt.filtered ---
bla bla
pattern2
bla
pattern1
pattern2
bla
bla
pattern1
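
For comparison, the same two-pass idea can be sketched in awk by naming the input file twice on the command line. This is a minimal sketch rather than a drop-in replacement for the Perl script: it assumes the sample data.txt shown above (recreated here so the example is self-contained), skips the control-M cleanup, and removes nothing when pattern2 never occurs.

```shell
# Sample input, recreated here for illustration.
printf '%s\n' 'bla bla' pattern2 bla pattern1 pattern2 bla 'bla pattern1 bla' bla pattern1 > data.txt

# The file is named twice so that awk reads it twice.
awk '
  # First pass (NR == FNR only while reading the first copy):
  # remember the line number of the last pattern2.
  NR == FNR { if (/pattern2/) p2 = FNR; next }
  # Second pass: skip the first pattern1 line strictly after line p2.
  p2 > 0 && !done && FNR > p2 && /pattern1/ { done = 1; next }
  { print }
' data.txt data.txt > data.txt.filtered

cat data.txt.filtered
```

The NR == FNR test is a common awk idiom for telling the first file apart from the second; the done flag ensures that only one line is dropped.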

Answer 3

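To find the line number of that line:

lineno=$( nl -ba file | tac | awk '/pattern1/ {last = $1} /pattern2/ {print last; exit}' )

This uses nl to add line numbers to the file (-ba numbers blank lines too, so the numbers match the real line numbers), tac to reverse the lines, and awk to print the line number of the last "pattern1" seen before the first "pattern2" in the reversed input.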

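Then delete that line (the test skips the deletion when no line number was found, because an empty address would make sed delete every line):

[ -n "$lineno" ] && sed -i "${lineno}d" file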
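
As a worked example, the pipeline can be run on a small sample file to see which line number it picks. This sketch assumes a system that provides tac (GNU coreutils, for instance) and a hypothetical data.txt created on the spot; it uses nl -ba so that blank lines are numbered too, and previews the result with plain sed instead of editing in place.

```shell
# Hypothetical sample input, created only for this illustration.
printf '%s\n' 'bla bla' pattern2 bla pattern1 pattern2 bla 'bla pattern1 bla' bla pattern1 > data.txt

# Number every line, reverse the order, and remember the most recent
# pattern1 line number until the first pattern2 (the last one in the file).
lineno=$( nl -ba data.txt | tac |
  awk '/pattern1/ {last = $1} /pattern2/ {print last; exit}' )
echo "line to delete: $lineno"

# Preview the result on standard output; data.txt itself is left unchanged.
if [ -n "$lineno" ]; then
  sed "${lineno}d" data.txt
fi
```

With this input the pipeline selects line 7 (bla pattern1 bla), which is the first pattern1 after the pattern2 on line 5.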

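Answer 4

If you only want to make a single pass over the file and minimize the number of lines held in memory, you can use an awk state-machine approach. These are not the shortest solutions, but they are easy to come up with and to read and maintain. You can replace the state names with numbers to make it (possibly) more efficient.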

PATTERN1=pattern1 PATTERN2=pattern2 awk '
  BEGIN {
    p1 = ENVIRON["PATTERN1"]
    p2 = ENVIRON["PATTERN2"]
    state = "init"
  }
  state == "init" {
    if ($0 ~ p2) state = "p2_found"
    print
    next
  }
  state == "p2_found" {
    if ($0 ~ p1) {
      state = "p1_found"
      p1_line = $0
      printf "%s", hold
      hold = ""
    } else if ($0 ~ p2) {
      # we can print the text held since the last p2
      printf "%s", hold
      hold = $0 RS
    } else hold = hold $0 RS
    next
  }
  state == "p1_found" {
    if ($0 ~ p2) {
      state = "p2_found"
      # the line that matched p1 is not discarded
      printf "%s\n%s", p1_line, hold;
      hold = ""
    }
    hold = hold $0 RS
  }
  END {
    # here we are not printing p1_line which is how it is discarded
    printf "%s", hold
  }'

(I am assuming that no line matches both pattern1 and pattern2.)
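
That assumption can be checked up front. The following is a small sketch, assuming a hypothetical sample file data.txt and the same PATTERN1 and PATTERN2 environment variables as the script; it counts the lines that match both patterns, so a result of 0 means the assumption holds.

```shell
# Hypothetical sample input, created only for this illustration.
printf '%s\n' 'bla bla' pattern2 bla pattern1 pattern2 bla 'bla pattern1 bla' bla pattern1 > data.txt

# Count the lines that match both patterns; 0 means the assumption holds.
both=$( PATTERN1=pattern1 PATTERN2=pattern2 awk '
  $0 ~ ENVIRON["PATTERN1"] && $0 ~ ENVIRON["PATTERN2"] { n++ }
  END { print n + 0 }
' data.txt )
echo "lines matching both patterns: $both"
```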
