Remove everything except URLs in Notepad++

Question 1

Ctrl+H
Find what: ^.*?(\bhttps://twitter\.com/\w+)?.*$
Replace with: (?1$1:)
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all

Explanation:

^                           # beginning of line
  .*?                       # 0 or more of any character except newline, non-greedy
  (                         # start group 1
    \b                      # word boundary
    https://twitter\.com/   # literally
    \w+                     # 1 or more word characters
  )?                        # end group, optional
  .*                        # 0 or more of any character except newline
$                           # end of line

Replacement:

(?1$1:)         # if group 1 exists, use it as the replacement, else replace with nothing

Result for given example:

https://twitter.com/thtjournal


https://twitter.com/jcarrollhistory
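
For readers who want to try the same substitution outside Notepad++, here is a minimal Python sketch of it. It is an illustration under assumptions, not part of the answer above. The sample text and the function name keep_twitter_urls are made up, and a callback stands in for the (?1$1:) conditional, which Python's re module does not support. The sample lines begin with the URL. With this pattern the lazy prefix can match nothing and the optional group can be skipped, so a URL preceded by other text on the same line is not captured and the whole line is emptied.

```python
import re

# Same pattern as in the answer; re.MULTILINE makes ^ and $ match at every line.
PATTERN = re.compile(r"^.*?(\bhttps://twitter\.com/\w+)?.*$", re.MULTILINE)


def keep_twitter_urls(text):
    # Hypothetical helper: a callback replaces the (?1$1:) conditional.
    # It emits group 1 when it took part in the match, otherwise nothing.
    return PATTERN.sub(lambda m: m.group(1) or "", text)


# Made-up sample input: two lines start with a URL, two lines have none.
sample = (
    "https://twitter.com/thtjournal some trailing text\n"
    "a line without any link\n"
    "another line without a link\n"
    "https://twitter.com/jcarrollhistory more text"
)

result = keep_twitter_urls(sample)
print(result)
```

As in Notepad++, lines without a URL are emptied rather than deleted, so blank lines remain in the output.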

Question 2

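Assume that you have a regular expression that defines a URL; let's call it REGEX.

Use the Replace tab of the Find dialog in Notepad++ to do a Replace All of (REGEX) with \n$1\n. The parentheses make the match available as $1. This separates all URLs onto lines that contain only the URL, interspersed with garbage lines.

Then, in the Mark tab of the Find dialog, mark the lines that contain REGEX with the Bookmark line option checked, using the Mark All operation.

Finally, in the menu Search => Bookmark, choose the option that removes the unbookmarked lines.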
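
The three steps above can be sketched in Python as follows. This is only an illustration of the idea, not a replacement for the Notepad++ procedure. The URL pattern is a deliberately simple stand-in for REGEX, and the function name and sample text are assumptions made for the example.

```python
import re

# Hypothetical, deliberately simple stand-in for REGEX; a production-grade
# URL pattern would be considerably longer.
URL = r'https?://[^\s"<>]+'


def extract_urls(text):
    # Step 1: Replace All of (REGEX) with \n$1\n, so each URL sits on its own line.
    split = re.sub(f"({URL})", r"\n\1\n", text)
    # Steps 2 and 3: "bookmark" the lines that contain a match and drop the rest.
    return [line for line in split.splitlines() if re.search(URL, line)]


# Made-up sample input mixing markup, prose and two URLs on one line.
sample = (
    'Follow <a href="https://twitter.com/thtjournal">us</a> '
    "or see https://example.com/page today."
)

urls = extract_urls(sample)
print("\n".join(urls))
```

Unlike the single-regex answer, this approach also handles several URLs on one line, because every match is moved to its own line before the filtering step.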

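For a good regular expression for URLs, see this post:
What is the best regular expression to check if a string is a valid URL?

For more information and screenshots, see this article on a similar case:
Notepad++: how to extract email addresses from a file.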

Answer