I have this list of words, at least three of which can appear in any English sentence.
was, where, were, some, then, than, that, can, by, the, and, with, over, there, is, as, also, through, from, while, just, like, for, such, if, else, still, again, want, will, wish, make, made, well, have, had, has, it, be, do, say, others, go, know, see, think, look, give, use, find, tell, ask, work, seem, feel, try, leave, call, get, take, too, in, addition, to, could, who, he, she, because, of, your, yours, their, doesn't, are, an, these, this, those, but, at, whom, or, out, how, when, between, his, her, they, them, my, without, maybe, even, show, can't, must, couldn't, now, i'm, many, come, own, self, seen, it’s, we, any, other, coming, so, found, more, much, all, very, same, did, which, does, on
I also have these two HTML tags, but only the first one has English content:
<meta name="description" content="Simply Red are a British soul and pop band which formed in Manchester in 1985. The lead vocalist of the band is singer and songwriter Mick Hucknall by">
And a Russian tag:
<meta name="description" content="Simply Red - британская соул- и поп-группа, образованная в Манчестере в 1985 году. Ведущим вокалистом группы является певец и автор песен Мик Хакнелл.">
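So, I want to check all the HTML files that contain tags written in English. To do that, I have to find the HTML tags whose content, counted from the start of the tag, contains at least 3 of those keywords.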
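One way to read this requirement, sketched in Python rather than as a single regex: pull out only the value of the `content` attribute, split it into whole words, and count how many distinct keywords it contains. The names `KEYWORDS`, `count_keywords` and `looks_english` are made up for this illustration, `KEYWORDS` is only a small subset of the full list, and counting distinct keywords (rather than every occurrence) is an assumption.

```python
import re

# Hypothetical subset of the keyword list above, kept short for readability.
KEYWORDS = {"was", "is", "as", "on", "and", "in", "the", "of", "are", "which", "by"}

# Captures only the value of content="..." in a description meta tag.
CONTENT_RE = re.compile(r'<meta\s+name="description"\s+content="([^"]*)"')

# A run of Latin letters, optionally followed by an apostrophe part (doesn't, it’s).
WORD_RE = re.compile(r"[a-z]+(?:['’][a-z]+)?")


def count_keywords(tag):
    """Return how many distinct keywords occur as whole words in the content."""
    m = CONTENT_RE.search(tag)
    if m is None:
        return 0
    words = set(WORD_RE.findall(m.group(1).lower()))
    return len(words & KEYWORDS)


def looks_english(tag, minimum=3):
    """Assumed rule: at least `minimum` distinct keywords means English."""
    return count_keywords(tag) >= minimum


english_tag = ('<meta name="description" content="Simply Red are a British soul '
               'and pop band which formed in Manchester in 1985. The lead vocalist '
               'of the band is singer and songwriter Mick Hucknall by">')
russian_tag = ('<meta name="description" content="Simply Red - британская соул- и '
               'поп-группа, образованная в Манчестере в 1985 году. Ведущим '
               'вокалистом группы является певец и автор песен Мик Хакнелл.">')

print(looks_english(english_tag), looks_english(russian_tag))
```

Because the words are compared whole, fragments such as the "on" inside a longer word are not counted as keywords.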
My regex with only a few of the words (short version) looks like this:
SEARCH: (?-s)<meta name="description".+?(?:(was|is|as|on|and|in)).+>
The longer version would be:
(?-s)<meta name="description".*?(was|where|were|some|then|than|that|can|by|the|and|with|over|there|is|as|also|through|from|while|just|like|for|such|if|else|still|again|want|will|wish|make|made|well|have|had|has|it|be|do|say|others|go|know|see|think|look|give|use|find|tell|ask|work|seem|feel|try|leave|call|get|take|too|in|addition|to|could|who|he|she|because|of|your|yours|their|doesn't|are|an|these|this|those|but|at|whom|or|out|how|when|between|his|her|they|them|my|without|maybe|even|show|can't|must|couldn't|now|i'm|many|come|own|self|seen|it’s|we|any|other|coming|so|found|more|much|all|very|same|did|which|does|on).+>
OK, the problem is that my regex also finds the second tag, whose content is written in Russian. I need it to find only the first (English) one.
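A small Python check of the short version suggests where the unwanted match comes from: the pattern has no word boundaries and only asks for one keyword, so the "on" inside the attribute name `content` is enough. The sketch below drops `(?-s)` because Python's `re` does not accept it in that form and `.` does not match newlines there by default. The `STRICT` pattern is only one possible tightening, not a tested drop-in for any particular editor: it starts counting after `content="`, wraps the keywords in `\b`, and requires three occurrences (occurrences, not distinct words).

```python
import re

ENGLISH_TAG = ('<meta name="description" content="Simply Red are a British soul '
               'and pop band which formed in Manchester in 1985. The lead vocalist '
               'of the band is singer and songwriter Mick Hucknall by">')
RUSSIAN_TAG = ('<meta name="description" content="Simply Red - британская соул- и '
               'поп-группа, образованная в Манчестере в 1985 году. Ведущим '
               'вокалистом группы является певец и автор песен Мик Хакнелл.">')

# The short version from the question, minus the (?-s) prefix.
SHORT = re.compile(r'<meta name="description".+?(?:(was|is|as|on|and|in)).+>')

m = SHORT.search(RUSSIAN_TAG)
# Show the keyword that matched together with the text around it.
matched_inside = RUSSIAN_TAG[m.start(1) - 1 : m.start(1) + 6]
print(m.group(1), "found inside", matched_inside)

# Assumed tightening: keywords must be whole words inside the quoted content,
# and three of them must occur before the closing quote.
STRICT = re.compile(
    r'<meta name="description" content="'
    r'(?:[^"]*?\b(?:was|is|as|on|and|in)\b){3}'
    r'[^"]*">'
)

print(bool(STRICT.search(ENGLISH_TAG)), bool(STRICT.search(RUSSIAN_TAG)))
```

The same idea should carry over to the long keyword list by swapping it into the inner group, though how `\b` treats Cyrillic letters can differ between regex engines, so that part is worth checking in the tool actually used.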