I have this HTML tag (taken from a string):
<meta name="description" content="I love my mother" but I love my sister" more than I can say"/>
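As you can see, the content part should contain only 2 double quotes: one at the start, content=", and one at the end, "/>
I need to find every tag whose content part contains double quotes other than those 2, and delete the extra quotes.
The output should be: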
<meta name="description" content="I love my mother but I love my sister more than I can say"/>
I wrote a regex, but it doesn't work well. Maybe you can help me:
Find: (?-s)(<meta name="description" content=")(*?\K.*"(?s))"/>
Replace with: \1\2
Answer 1
Here is one way to do it:
- Ctrl+H
- Find what:
(?:<meta name="description" content="|\G(?!^))[^"]*\K"(?=.*?"/>)
- Replace with:
LEAVE EMPTY
- Check Wrap around
- Check Regular expression
- Uncheck
. matches newline
- Replace all
Explanation:
(?: # non capture group
<meta name="description" content=" # literally
| # OR
\G(?!^) # restart from last match position (not at the beginning of a line)
) # end group
[^"]* # 0 or more non quote
\K # forget all we have seen until this position
" # a double quote
(?=.*?"/>) # positive lookahead, make sure we have "/> somewhere after
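If the same cleanup has to be done in a script rather than in the editor's replace dialog, here is a minimal sketch in Python. Note that it is not a port of the regex above: Python's built-in re module supports neither \G nor \K, so this version swaps in a different technique, capturing the whole content value and stripping the quotes inside it with a callback. The names META_RE and strip_inner_quotes are made up for illustration, and the sketch assumes the tag sits on a single line and ends with "/> exactly as in the question.

```python
import re

# Group 1: the fixed opening of the tag, up to and including content="
# Group 2: everything up to the LAST "/> on the same line (greedy .*)
# Group 3: the closing "/>
# Assumption: the tag is on one line and is written exactly like this.
META_RE = re.compile(r'(<meta name="description" content=")(.*)("/>)')


def strip_inner_quotes(html: str) -> str:
    """Remove stray double quotes inside the content attribute."""
    def fix(match: re.Match) -> str:
        # Keep the opening and closing parts, drop quotes in between.
        cleaned = match.group(2).replace('"', '')
        return match.group(1) + cleaned + match.group(3)

    return META_RE.sub(fix, html)


broken = ('<meta name="description" content="I love my mother" '
          'but I love my sister" more than I can say"/>')
print(strip_inner_quotes(broken))
# <meta name="description" content="I love my mother but I love my sister more than I can say"/>
```

Because . does not match a newline by default in Python, the match stays within one line, which mirrors unchecking ". matches newline" in the steps above. A tag whose content has no stray quotes passes through unchanged.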