Regex: find lines that have a specific tag containing words that start with a lowercase letter


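I want to find the lines that have a specific tag whose content includes at least 2 words starting with a lowercase letter. For example, this kind:

<span class="text_obisnuit2">I love my house</span>

and not
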

<span class="text_obisnuit2">I Love My House</span>

I tried a regex, but it doesn't work well:

Search: (?:\G(?!^)|<span class="text_obisnuit2">)\s*\K(</span>*)|\u$1\L$2

Maybe you can help me.

Answer 1

  • Ctrl+F
  • Find what: <span class="text_obisnuit2">(?:(?:(?!</span>).)*?\b[a-z]){2}.*?</span>
  • Check Match case
  • Check Wrap around
  • Check Regular expression
  • Uncheck . matches newline
  • Find Next

or

  • Find All in Current Document

Explanation:

<span class="text_obisnuit2">       # literally, opening tag
  (?:                               # non capture group
    (?:                             # non capture group
      (?!</span>)                   # negative lookahead, make sure the end tag does not follow
      .                             # any character but newline
    )*?                             # end group, may appear 0 or more times, not greedy
    \b                              # word boundary, make sure we are at the beginning of a word
    [a-z]                           # 1 lowercase letter
  ){2}                              # end group, must appear twice
  .*?                               # 0 or more any character
</span>                             # end tag
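
As a rough cross-check outside Notepad++, the same pattern can be tried with Python's re module, which supports the lookahead and \b used here and is case-sensitive by default (the equivalent of checking Match case). The two sample lines are the ones from the question; PATTERN and has_two_lowercase_words are names made up for this sketch.

```python
import re

# Same pattern as in "Find what". Python's re is case-sensitive by default,
# and "." does not match a newline unless re.DOTALL is passed.
PATTERN = re.compile(
    r'<span class="text_obisnuit2">(?:(?:(?!</span>).)*?\b[a-z]){2}.*?</span>'
)

def has_two_lowercase_words(line: str) -> bool:
    """Hypothetical helper: True if the line contains a matching span."""
    return PATTERN.search(line) is not None

lines = [
    '<span class="text_obisnuit2">I love my house</span>',
    '<span class="text_obisnuit2">I Love My House</span>',
]
for line in lines:
    print(has_two_lowercase_words(line), line)
# True  for "I love my house"
# False for "I Love My House"
```

The second line is rejected because none of its lowercase letters sits at a word boundary, and the lookahead keeps the search from running past </span>.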

Edit, following the OP's comment about wanting to capitalize the first letter:

  • Find what: (?:<span class="text_obisnuit2">|\G)(?:(?!</span>).)*?\K\b([a-z])(\w+)(?=.*?</span>)
  • Replace with: \u$1\l$2
  • Check Wrap around
  • Check Regular expression
  • Uncheck . matches newline
  • Replace all

Explanation:

(?:                                 # non capture group
  <span class="text_obisnuit2">     # literally, opening tag
 |                                 # OR
  \G                                # restart from last match position
)                                   # end group
(?:(?!</span>).)*?                  # Tempered Greedy Token, never steps over </span>
\K                                  # forget all we have seen until this position
\b                                  # word boundary
([a-z])                             # group 1, a lowercase letter
(\w+)                               # group 2, 1 or more word characters
(?=.*?</span>)                      # positive lookahead, make sure we have a closing tag after

Replacement:

\u$1        # Uppercase group 1, the first letter
\l$2        # lowercase group 2, the rest of the word

Note: because of a "bug" in Notepad++, you can't use a single capture group with only \u$1 as the replacement.
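
Python's re module has neither \G nor \K, so this replacement cannot be ported literally. A minimal sketch of the same idea using a different technique (match each span first, then capitalize the words inside it with a callback) could look like the following. All names here (SPAN, WORD, capitalize_span_words) are hypothetical, and the sketch assumes the span sits on one line and contains no nested tags.

```python
import re

# Capture the opening tag, the span content, and the closing tag separately.
SPAN = re.compile(r'(<span class="text_obisnuit2">)(.*?)(</span>)')
# Same word pattern as in the Notepad++ search: a lowercase letter at a
# word boundary, followed by one or more word characters.
WORD = re.compile(r'\b([a-z])(\w+)')

def _capitalize_word(m: re.Match) -> str:
    # Mirrors \u$1\l$2: uppercase group 1, lowercase the first char of group 2.
    rest = m.group(2)
    return m.group(1).upper() + rest[:1].lower() + rest[1:]

def _capitalize_inner(m: re.Match) -> str:
    # Rewrite only the span content; the tags are put back unchanged.
    return m.group(1) + WORD.sub(_capitalize_word, m.group(2)) + m.group(3)

def capitalize_span_words(text: str) -> str:
    return SPAN.sub(_capitalize_inner, text)

print(capitalize_span_words('<span class="text_obisnuit2">I love my house</span>'))
# <span class="text_obisnuit2">I Love My House</span>
```

As with the Notepad++ pattern, a one-letter word such as "a" is left unchanged, because (\w+) needs at least one more character after the first letter. Text outside the span is not touched.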

Answer 2

This simple regex will find such text:

<span class="text_obisnuit2">\w+ [a-z]

It looks for a word right after the opening tag, followed by a space and a lowercase letter.
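
To see what this shorter pattern does and does not catch, it can be tried with Python's re module on a few sample lines. The first two samples come from the question; the third is made up for illustration and has two lowercase-initial words, just not in second position. SIMPLE and samples are hypothetical names.

```python
import re

# The pattern from this answer, unchanged.
SIMPLE = re.compile(r'<span class="text_obisnuit2">\w+ [a-z]')

samples = [
    '<span class="text_obisnuit2">I love my house</span>',
    '<span class="text_obisnuit2">I Love My House</span>',
    '<span class="text_obisnuit2">I Love my house</span>',  # made-up sample
]
for s in samples:
    print(bool(SIMPLE.search(s)), s)
# True, False, False
```

The pattern only inspects the second word of the span, so the third line is missed even though "my" and "house" both start with a lowercase letter. It also does not require two such words.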

