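I'm new to Notepad++ and am trying to use regex to search for a specific value in a field and delete its parent tag (along with everything inside it, including that field).
Essentially, I'm trying to remove transactions that have a specific store ID. The file is massive and there are thousands of entries I need to remove. An example is below.
Sample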
<Transaction>
<TxnHeader>
<StoreId>6705</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>343243</TxnNumber>
<StartDate>2019-02-02T07:42:45</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>6304</ItemNumber>
<DeptNumber>168</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>4.470000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
<Transaction>
<TxnHeader>
<StoreId>8351</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>327527</TxnNumber>
<StartDate>2019-02-02T08:02:47</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>6304</ItemNumber>
<DeptNumber>168</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>7.310000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
<Transaction>
<TxnHeader>
<StoreId>7837</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>164728</TxnNumber>
<StartDate>2019-02-02T08:19:47</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>1902</ItemNumber>
<DeptNumber>154</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>10.000000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
Expected
<Transaction>
<TxnHeader>
<StoreId>6705</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>343243</TxnNumber>
<StartDate>2019-02-02T07:42:45</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>6304</ItemNumber>
<DeptNumber>168</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>4.470000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
<Transaction>
<TxnHeader>
<StoreId>7837</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>164728</TxnNumber>
<StartDate>2019-02-02T08:19:47</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>1902</ItemNumber>
<DeptNumber>154</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>10.000000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
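In the expected text above, the Transaction tag containing 8351 has been removed entirely.
I tried a regex find and replace (replacing with nothing) using the following query: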
<Transaction>.*?<StoreID>8351</StoreID>.*?</Transaction>
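It ended up selecting a large chunk of the document, from the top all the way to the end of the first transaction that contains 8351.
Any help would be greatly appreciated!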
Answer 1
- Ctrl+H
- Find what:
<Transaction>(?:(?!</Transaction>).)+<StoreId>8351</StoreId>(?:(?!<Transaction>).)+</Transaction>\R?
- Replace with:
LEAVE EMPTY
- check Match case
- check Wrap around
- check Regular expression
- check . matches newline
- Replace all
Explanation:
<Transaction> # opening tag
(?:(?!</Transaction>).)+ # tempered greedy token: 1 or more characters, none of which starts </Transaction>
<StoreId>8351</StoreId> # literally
(?:(?!<Transaction>).)+ # tempered greedy token: 1 or more characters, none of which starts <Transaction>
</Transaction> # literally, closing tag
\R? # optional line break of any kind
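For anyone who would rather check the pattern outside Notepad++ first, here is a minimal sketch in Python. It is not part of the steps above, and it rests on a few assumptions: Python's `re` module has no `\R`, so the optional line break is written as `(?:\r\n|\n|\r)?`; `re.DOTALL` stands in for the ". matches newline" checkbox; and `remove_store` and `txn` are hypothetical helper names made up for this illustration.

```python
import re


def remove_store(xml_text: str, store_id: str) -> str:
    """Delete every <Transaction> block whose <StoreId> equals store_id."""
    pattern = re.compile(
        # Opening tag, then characters that never start a closing tag.
        r"<Transaction>(?:(?!</Transaction>).)+"
        # The store ID we are looking for (escaped, in case it is not purely numeric).
        r"<StoreId>" + re.escape(store_id) + r"</StoreId>"
        # Characters that never start a new opening tag, then the closing tag.
        r"(?:(?!<Transaction>).)+</Transaction>"
        # Stand-in for \R? (assumption: Python's re does not support \R).
        r"(?:\r\n|\n|\r)?",
        re.DOTALL,  # equivalent of ". matches newline"
    )
    return pattern.sub("", xml_text)


def txn(store_id: str) -> str:
    """Build a shortened transaction block for the demo (hypothetical helper)."""
    return (
        "<Transaction>\n<TxnHeader>\n"
        f"<StoreId>{store_id}</StoreId>\n"
        "</TxnHeader>\n</Transaction>\n"
    )


sample = txn("6705") + txn("8351") + txn("7837")
cleaned = remove_store(sample, "8351")
print(cleaned)
```

On this shortened sample, only the 6705 and 7837 transactions should remain. Note that this sketch reads the whole text into memory, which may or may not be acceptable for a very large file.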
More about the tempered greedy token
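To see why the lazy `.*?` attempt from the question grabs too much, the following Python sketch compares it with the tempered version on a shortened sample. Assumptions: the tag is spelled `StoreId` as in the sample data (the question's pattern used `StoreID`, which only matches when Match case is off), `re.DOTALL` plays the role of ". matches newline", and `txn` is a hypothetical helper for building test data.

```python
import re


def txn(store_id: str) -> str:
    """Build a shortened transaction block for the demo (hypothetical helper)."""
    return (
        "<Transaction>\n<TxnHeader>\n"
        f"<StoreId>{store_id}</StoreId>\n"
        "</TxnHeader>\n</Transaction>\n"
    )


text = txn("6705") + txn("8351") + txn("7837")

# Lazy version: .*? is free to cross </Transaction> boundaries, so the match
# starts at the very first <Transaction> in the file.
lazy = re.compile(
    r"<Transaction>.*?<StoreId>8351</StoreId>.*?</Transaction>", re.DOTALL
)

# Tempered version: the lookaheads forbid crossing into another transaction.
tempered = re.compile(
    r"<Transaction>(?:(?!</Transaction>).)+<StoreId>8351</StoreId>"
    r"(?:(?!<Transaction>).)+</Transaction>",
    re.DOTALL,
)

lazy_match = lazy.search(text)
tempered_match = tempered.search(text)

# Which store IDs ended up inside each match?
lazy_ids = re.findall(r"<StoreId>(\d+)</StoreId>", lazy_match.group())
tempered_ids = re.findall(r"<StoreId>(\d+)</StoreId>", tempered_match.group())

print("lazy match covers stores:", lazy_ids)
print("tempered match covers stores:", tempered_ids)
```

The lazy match begins at the first `<Transaction>` and swallows the 6705 transaction along with 8351, which is the behaviour described in the question; the tempered match covers only the 8351 transaction.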