Only when the duplicates are consecutive:

Hello, I tried to remove duplicates from a text file using the TextFX plugin in Notepad++, but it does not work on this kind of text:

209.116.247.120|admin|default|Taiwan (TW)|Tai-pei|Taipei|Unknown
209.116.247.120|admin|default|
209.116.49.130|admin|admin
209.116.49.130|admin|admin|China (CN)|Henan|Zhengzhou|Unknown
209.116.55.142|admin|admin
209.116.55.142|admin|admin|Korea, Republic of (KR)|Seoul-t'ukpyolsi|Seoul|Unknown
209.116.65.26|admin|admin
209.116.65.26|admin|admin|New Zealand (NZ)|Unknown|Unknown|Unknown

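As you can see, each entry is duplicated by a line with the country information appended, so I would like to remove either these duplicates: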

209.116.247.120|admin|default|
209.116.49.130|admin|admin
209.116.55.142|admin|admin
209.116.65.26|admin|admin

or these duplicates:

209.116.247.120|admin|default|Taiwan (TW)|Tai-pei|Taipei|Unknown
209.116.49.130|admin|admin|China (CN)|Henan|Zhengzhou|Unknown
209.116.55.142|admin|admin|Korea, Republic of (KR)|Seoul-t'ukpyolsi|Seoul|Unknown
209.116.65.26|admin|admin|New Zealand (NZ)|Unknown|Unknown|Unknown

If anyone has an idea or a regular expression that solves this, I would be very grateful. Thanks.

Answer 1

This works only if the duplicate lines are consecutive:

  • Ctrl+H
  • Find what: ^(([^|]+[|][^|]+[|][^|]+)[|]?.*)\R\2
  • Replace with: $1
  • Search Mode: Regular expression
  • Replace all

Explanation:

^           : beginning of line
(           : start group 1
  (         : start group 2
    [^|]+   : 1 or more non-pipe characters
    [|]     : a pipe
    [^|]+   : 1 or more non-pipe characters
    [|]     : a pipe
    [^|]+   : 1 or more non-pipe characters
  )         : end group 2
  [|]?      : a pipe, optional
  .*        : 0 or more of any character except newline
)           : end group 1
\R          : any kind of line break
\2          : backreference to group 2
  • Do NOT check ". matches newline"

Replacement:

$1          : content of group 1, the first duplicate line
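
For anyone who would rather script this than use the Replace dialog, here is a minimal Python sketch of the same search and replace. It is an approximation, not the Notepad++ engine: Python's re module has no \R, so the line break is spelled out as (?:\r\n|\n|\r), and the names PATTERN, merge_consecutive and sample are made up for this illustration. Note that the match ends right after the repeated three-field prefix on the second line, so whatever follows that prefix stays attached to the first line.

```python
import re

# Same pattern as above, except that \R is written as (?:\r\n|\n|\r)
# because Python's re module does not support \R.
PATTERN = re.compile(
    r"^(([^|]+[|][^|]+[|][^|]+)[|]?.*)(?:\r\n|\n|\r)\2",
    re.MULTILINE,  # let ^ match at the start of every line
)


def merge_consecutive(text):
    """Replace 'line 1 + line break + shared prefix of line 2' with line 1."""
    return PATTERN.sub(r"\1", text)


sample = "\n".join([
    "209.116.247.120|admin|default|Taiwan (TW)|Tai-pei|Taipei|Unknown",
    "209.116.247.120|admin|default|",
    "209.116.49.130|admin|admin",
    "209.116.49.130|admin|admin|China (CN)|Henan|Zhengzhou|Unknown",
])

print(merge_consecutive(sample))
```

Because only the shared prefix of the second line is consumed, a lone "|" left over from a short second line ends up at the end of the merged line.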

Result for the given example:

209.116.247.120|admin|default|Taiwan (TW)|Tai-pei|Taipei|Unknown|
209.116.49.130|admin|admin|China (CN)|Henan|Zhengzhou|Unknown
209.116.55.142|admin|admin|Korea, Republic of (KR)|Seoul-t'ukpyolsi|Seoul|Unknown
209.116.65.26|admin|admin|New Zealand (NZ)|Unknown|Unknown|Unknown
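
If the duplicates are not consecutive, the regex above will miss them. A different technique, which is not part of the answer above, is to group the lines by their first three fields and keep one line per group. The sketch below assumes the file fits in memory; dedupe_by_key, keep and fields are hypothetical names chosen for this illustration.

```python
def dedupe_by_key(lines, keep="longest", fields=3):
    """Keep one line per key, where the key is the first `fields` pipe-separated fields.

    keep="longest" keeps the line with the country details,
    keep="shortest" keeps the bare ip|user|password line.
    """
    best = {}  # key -> chosen line; dicts preserve first-seen order
    for line in lines:
        key = "|".join(line.split("|")[:fields])
        current = best.get(key)
        if current is None:
            best[key] = line
        elif keep == "longest" and len(line) > len(current):
            best[key] = line
        elif keep == "shortest" and len(line) < len(current):
            best[key] = line
    return list(best.values())


rows = [
    "209.116.49.130|admin|admin",
    "209.116.55.142|admin|admin|Korea, Republic of (KR)|Seoul-t'ukpyolsi|Seoul|Unknown",
    "209.116.49.130|admin|admin|China (CN)|Henan|Zhengzhou|Unknown",
    "209.116.55.142|admin|admin",
]

for row in dedupe_by_key(rows, keep="longest"):
    print(row)
```

Passing keep="shortest" instead gives the other variant asked for in the question, the lines without the country information.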
