How do I append an incrementing count to every occurrence of a predefined word in a text file?

Answer 1

I'd prefer perl for this:

$ cat ip.txt 
He drove his car to the cinema. He then went inside the cinema to purchase tickets, and afterwards discovered that it was more then two years since he last visited the cinema.

$ # forward counting is easy
$ perl -pe 's/\bcinema\b/$&.++$i/ge' ip.txt 
He drove his car to the cinema1. He then went inside the cinema2 to purchase tickets, and afterwards discovered that it was more then two years since he last visited the cinema3.

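\bcinema\b is the word to search for, wrapped in word boundaries so that it is not matched as part of another word. For example, \bpar\b will not match apart, park or spar
ge: the g flag makes the substitution global; e allows Perl code in the replacement section
$&.++$i is the concatenation of the matched word and the pre-incremented value of $i, which starts out as 0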

For reverse numbering, we need to get the count first...

$ c=$(grep -ow 'cinema' ip.txt | wc -l) perl -pe 's/\bcinema\b/$&.$ENV{c}--/ge' ip.txt 
He drove his car to the cinema3. He then went inside the cinema2 to purchase tickets, and afterwards discovered that it was more then two years since he last visited the cinema1.

c becomes an environment variable, accessible in Perl through the %ENV hash

Or, with perl alone, by slurping the entire file

perl -0777 -pe '$c=()=/\bcinema\b/g; s//$&.$c--/ge' ip.txt

Answer 2

Using GNU awk for multi-char RS, case-insensitive matching and word boundaries:

$ awk -v RS='^$' -v ORS= -v word='cinema' '
    BEGIN { IGNORECASE=1 }
    { cnt=gsub("\\<"word"\\>","&"); while (sub("\\<"word"\\>","&"cnt--)); print }
' file
He drove his car to the cinema3. He then went inside the cinema2 to purchase tickets, and afterwards discovered that it was more then two years since he last visited the cinema1.

Answer 3

Taking punctuation after the word into account.
Forward numbering:

word="cinema"
awk -v word="$word" '
    { 
      for (i = 1; i <= NF; i++) 
        if ($i ~ word "([,.;:)]|$)") { 
          gsub(word, word "" ++count,$i) 
        }
      print 
    }' input-file

Backward numbering:

word="cinema"
count="$(awk -v word="$word" '
    { count += gsub(word, "") }
    END { print count }' input-file)"
awk -v word="$word" -v count="$count" '
    { 
      for (i = 1; i <= NF; i++) 
        if ($i ~ word "([,.;:)]|$)") { 
          gsub(word, word "" count--, $i) 
        }
      print 
    }' input-file
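
One thing to watch in the backward version: the counting pass uses gsub(word, "") on the whole line, so it also counts occurrences inside longer words (cinemas, for instance) that the numbering pass then skips, and the numbers would not end at 1. A possible variant, sketched under the assumption that the input is a regular file that can be read twice, counts with the same field test in a single awk call. The function name number_desc_awk and the variable total are illustrative.

```shell
# number_desc_awk WORD FILE: descending numbering, counting and numbering
# with the same per-field test. FILE is passed twice so awk reads it twice.
# usage: number_desc_awk cinema input-file
number_desc_awk() {
    awk -v word="$1" '
        NR == FNR {                      # first pass: count the qualifying fields
            for (i = 1; i <= NF; i++)
                if ($i ~ (word "([,.;:)]|$)")) total++
            next
        }
        {                                # second pass: number them downwards
            for (i = 1; i <= NF; i++)
                if ($i ~ (word "([,.;:)]|$)")) sub(word, word "" total--, $i)
            print
        }' "$2" "$2"
}
```

As in the original, assigning to a field makes awk rebuild the line with single spaces between fields.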

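Answer 4

To label the words in descending order, we reverse the regex and reverse the data, then reverse the data once more at the end to complete the transformation: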

perl -l -0777pe '$_ = reverse reverse =~ s/(?=\bamenic\b)/++$a/gre' input.data

Result

He drove his car to the cinema3. He then went inside the cinema2 to purchase tickets, and
afterwards discovered that it was more then two years since he last visited the cinema1.

To label the words in ascending order, we do a lookbehind-style match on the word (using \K):

perl -lpe 's/\bcinema\b\K/++$a/eg' input.data

Result

He drove his car to the cinema1. He then went inside the cinema2 to purchase tickets, and
afterwards discovered that it was more then two years since he last visited the cinema3.
