Get a CSV from 2 patterns and ignore everything else


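I have a pattern whose beginning is always the same but whose ending varies slightly. In one variant the pattern continues from that beginning, and I want to keep the rest of the line. In the other variant I want to keep what comes immediately after the beginning, but ignore something before the end of the line. There are also lines in between that I want to ignore entirely.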

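In the following example, I want to make a CSV of desserts, where the vegetables are the pattern and the lorem ipsum lines are the lines in between. I want to do this with Notepad++. So far I have tried replacing .*?carrot\Rpotato (?:cabbage (.*?)|(.*?) turnip)\R.*? with \1, (including the comma), but that does not seem to work; I also suspect it is not the most efficient way to get what I want.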

Thank you very much.

Lorem ipsum dolor sit amet, consectetur adipiscing elit.
carrot
potato cheese cake turnip
Vivamus aliquet nibh semper sem sodales mattis.
In a mauris nec eros pulvinar accumsan.
carrot
potato cabbage chocolate muffin
Mauris leo lacus, luctus non libero id, mattis gravida tellus.
Nunc eget purus at sapien varius fermentum.
carrot
potato vanilla pudding turnip
Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Donec et felis orci.
carrot
potato cabbage chocolate-covered peanuts
Cras convallis semper erat, sed semper ante lacinia vitae.
Fusce vitae lacus et erat placerat malesuada. 

Expected result:

cheese cake, chocolate muffin, vanilla pudding, chocolate-covered peanuts

Answer 1

  • Ctrl+H
  • Find what: .*?carrot\Rpotato (?|cabbage ((?:(?!turnip).)*)\R?|((?:(?!cabbage).)*) turnip).*\R?|.+\R?
  • Replace with: (?1$1,)
  • CHECK Match case
  • CHECK Wrap around
  • SELECT Regular expression
  • UNCHECK . matches newline
  • Replace all

Explanation:

.*?             # 0 or more any character but newline, not greedy
carrot          # literally
\R              # any kind of linebreak
potato          # literally
(?|             # branch reset groups
    cabbage         # literally
    (               # group 1
        # tempered greedy token
        (?:             # non capture group
            (?!turnip)      # negative lookahead, make sure "turnip" does not follow
            .               # any character but newline
        )*              # end group, may appear 0 or more times
    )               # end group 1
    \R?             # optional linebreak
  |               # OR
    (               # group 1 (same group number as above because of the directive "branch reset groups")
        # tempered greedy token
        (?:             # non capture group
            (?!cabbage)     # negative lookahead, make sure "cabbage" does not follow
            .               # any character but newline
        )*              # end group, may appear 0 or more times
    )               # end group 1
     turnip         # literally
)               # end branch reset groups
    .*\R?           # 0 or more any character followed by optional linebreak
|               # OR
    .+\R?           # 1 or more any character followed by optional linebreak
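
For readers who want to check the matching logic outside Notepad++, here is a minimal Python sketch of the same idea. It is not the Notepad++ expression itself: Python's re module supports neither branch reset groups (?|...) nor \R, so this version assumes two named groups (after and before, both names invented for this example) and writes the line break as \r?\n. The function name extract_desserts is likewise hypothetical.

```python
import re

SAMPLE = """\
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
carrot
potato cheese cake turnip
Vivamus aliquet nibh semper sem sodales mattis.
In a mauris nec eros pulvinar accumsan.
carrot
potato cabbage chocolate muffin
Mauris leo lacus, luctus non libero id, mattis gravida tellus.
carrot
potato vanilla pudding turnip
Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Donec et felis orci.
carrot
potato cabbage chocolate-covered peanuts
Cras convallis semper erat, sed semper ante lacinia vitae.
"""

# One named group per alternative stands in for the branch reset group.
# Both captures use a tempered greedy token, as in the explanation above.
PATTERN = re.compile(
    r"^carrot\r?\n"
    r"potato (?:cabbage (?P<after>(?:(?!turnip).)*)"
    r"|(?P<before>(?:(?!cabbage).)*) turnip)",
    re.MULTILINE,
)


def extract_desserts(text):
    """Return the captured dessert from every carrot/potato pair, in order."""
    desserts = []
    for match in PATTERN.finditer(text):
        value = match.group("after")
        if value is None:  # the "... turnip" alternative matched instead
            value = match.group("before")
        desserts.append(value.strip())  # strip() also drops a stray \r
    return desserts


print(", ".join(extract_desserts(SAMPLE)))
# cheese cake, chocolate muffin, vanilla pudding, chocolate-covered peanuts
```

Collecting the matches rather than deleting everything else makes it easy to join the values with ", " and avoids a trailing comma.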

Replacement:

(?1         # if group 1 exists
    $1,         # print the content of group 1 followed by a comma
)           # end if
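
The conditional replacement can be approximated in Python with a callback passed to re.sub, which may help when testing the approach on a file before running Replace all. This is only a sketch under a few assumptions: Python has no (?1...) replacement syntax, so the function to_csv and its inner repl are hypothetical stand-ins; two ordinary groups replace the branch reset group; \R becomes \r?\n; and the optional line break inside the cabbage branch is left out, so each match consumes only its own line ending.

```python
import re

SAMPLE = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit.\n"
    "carrot\n"
    "potato cheese cake turnip\n"
    "Vivamus aliquet nibh semper sem sodales mattis.\n"
    "carrot\n"
    "potato cabbage chocolate muffin\n"
    "Mauris leo lacus, luctus non libero id, mattis gravida tellus.\n"
)

# First alternative: a carrot/potato pair with a capture.
# Second alternative: any other non-empty line, with no capture.
PATTERN = re.compile(
    r".*?carrot\r?\n"
    r"potato (?:cabbage ((?:(?!turnip).)*)|((?:(?!cabbage).)*) turnip)"
    r".*(?:\r?\n)?"
    r"|.+(?:\r?\n)?"
)


def to_csv(text):
    """Keep 'capture,' for matched pairs and delete every other line."""
    def repl(match):
        dessert = match.group(1) if match.group(1) is not None else match.group(2)
        if dessert is None:
            return ""  # plain line: no group took part, so print nothing
        return dessert.strip() + ","  # like "$1," in the replacement
    return PATTERN.sub(repl, text)


print(to_csv(SAMPLE))
# cheese cake,chocolate muffin,
```

As with the replacement above, the output ends with a trailing comma and has no space after each comma; to_csv(SAMPLE).rstrip(",") removes the final one if needed.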
