Find acronyms in a text and create a list of them (there are acronyms made up of 2 or 3 words)

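I have been using \b[A-Z]{2,4}\b with the Match case option to match acronyms, but I cannot work out how to delete all the text around the acronyms so that only a list of the matched entries remains. Could a stop-word list also be used to drop uppercase words that are not acronyms, such as the word NOTE?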

See the sample text below:

GUIDANCE NOTE:
Provision of transportation services to the UN RC and RCO staff
As part of the turn-key solutions, governed by the MoU signed between UN and UNDP on 21 December 2018, UNDP is required to provide transportation services (para 2.2. and 2.3., Annex 1) to the UN RCS offices as follows: which include the following services:
- Provision to the UN RC of 1 (one) vehicle on a full-time basis
- Provision to the UN RC office (RCO) of vehicle(s) on a part-time basis 
Para 2.3. of the MoU states that all services, whether “turnkey” or “pay-as-you-go”, should be provided by UNDP in accordance with UNDP rules, policies and procedures. Therefore, all transportation services provided to the UN RCS offices by UNDP and the use of UNDP vehicles by the UN RC offices will follow the rules and procedures outlined in UNDP Vehicle Management Policy. The Policy can be accessed through the link HERE.
UNDP Country Offices are advised to consider obtaining comprehensive insurance coverage for the full-time vehicle allocated to the UN RC in order to mitigate legal, financial and other risks.
Cost recovery methodology for vehicles provided to RCO on the full-time or part-time basis is provided below.

Answer 1

  • Ctrl+H
  • Find what: .*?((?!NOTE|HERE)\b[A-Z]{2,4}\b)(?:(?![A-Z]{2,4}).)*
  • Replace with: $1\n
  • Check Match case
  • Check Wrap around
  • Check Regular expression
  • Check . matches newline
  • Replace all
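
For readers who want to reproduce the steps above outside the editor, here is a minimal Python sketch of the same find-and-replace. The function name `acronyms_one_per_line` and the shortened sample string are made up for illustration. "Match case" corresponds to Python's default case-sensitive matching, ". matches newline" to `re.DOTALL`, and `$1` to `\1`; "Wrap around" has no counterpart because the whole string is processed.

```python
import re

# Same pattern as in "Find what". Matching is case-sensitive by default,
# which corresponds to the "Match case" checkbox.
PATTERN = re.compile(
    r".*?((?!NOTE|HERE)\b[A-Z]{2,4}\b)(?:(?![A-Z]{2,4}).)*",
    re.DOTALL,  # corresponds to ". matches newline"
)

def acronyms_one_per_line(text):
    # Replace every match with group 1 followed by a linefeed,
    # mirroring the "Replace with" field.
    return PATTERN.sub(r"\1\n", text)

# Shortened, hypothetical excerpt of the sample text from the question.
sample = (
    "GUIDANCE NOTE:\n"
    "Services to the UN RC and RCO staff.\n"
    "UNDP is required to provide vehicles."
)
print(acronyms_one_per_line(sample), end="")
```

One caveat with this pattern in Python: if the text after the last acronym contains an excluded word such as HERE, that tail is not part of any match and is left in place.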

Explanation:

.*?                     # 0 or more of any character, non-greedy
(                       # start of group 1
    (?!NOTE|HERE)       # negative lookahead: make sure NOTE or HERE does not follow
                        # add other words here, pipe-separated, if needed
    \b                  # word boundary
    [A-Z]{2,4}          # 2 to 4 uppercase letters
    \b                  # word boundary
)                       # end of group 1
                        # tempered greedy token:
(?:                     # non-capturing group
    (?![A-Z]{2,4})      # negative lookahead: not followed by 2 to 4 uppercase letters
    .                   # any character
)*                      # end of group, may appear 0 or more times

Replacement:

$1          # content of group 1 (i.e. the acronym)
\n          # linefeed, you can use \r\n for Windows linebreak
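
In a script, the surrounding text does not have to be consumed at all: only the core token from group 1 is needed, and the excluded words can be assembled from a list instead of being typed pipe-separated. The following is a rough Python sketch under those assumptions. The names `build_acronym_pattern` and `find_acronyms` are hypothetical, and the extra `\b` inside the lookahead is a small deviation from the pattern above, so that a short stop word only excludes a whole word.

```python
import re

def build_acronym_pattern(stop_words):
    """Compile the 2-to-4 uppercase token, excluding the given stop words."""
    if not stop_words:
        # Nothing to exclude: use the bare token.
        return re.compile(r"\b[A-Z]{2,4}\b")
    # Escape each word and join with pipes, as in (?!NOTE|HERE).
    alternation = "|".join(re.escape(word) for word in stop_words)
    # Deviation from the original pattern: \b after the alternation makes
    # each stop word count only when it is the whole word.
    return re.compile(rf"(?!(?:{alternation})\b)\b[A-Z]{{2,4}}\b")

def find_acronyms(text, stop_words=("NOTE", "HERE")):
    """Return every acronym in order of appearance, duplicates included."""
    return build_acronym_pattern(stop_words).findall(text)

text = "GUIDANCE NOTE: the UN RC and RCO, see HERE. UN again."
print("\n".join(find_acronyms(text)))
```

Joining the result with "\n" gives the same one-acronym-per-line list as the replacement.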


Answer 2

Alternatively, if you are only interested in the unique matched values rather than the full list:

[\s\S]*?((?!NOTE|HERE)\b[A-Z]{2,4}\b)(?:(?![A-Z]{2,4}).)*(?![\s\S]*\b\1\b[\s\S]*)

Apply it by following the steps given by @Toto in Answer 1.

The result is a list of unique acronyms.
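
As a point of comparison, the same unique list can be sketched in Python with a plain stop-word set, which is closer to the stop-word list the question asks about. This is an illustrative sketch and not the answer's method: `STOP_WORDS`, `TOKEN` and `unique_acronyms` are hypothetical names, and keeping the order of first appearance is a choice made here.

```python
import re

STOP_WORDS = {"NOTE", "HERE"}            # hypothetical stop-word list
TOKEN = re.compile(r"\b[A-Z]{2,4}\b")    # 2 to 4 uppercase letters, whole word

def unique_acronyms(text, stop_words=STOP_WORDS):
    """Return each acronym once, in order of first appearance."""
    candidates = (word for word in TOKEN.findall(text) if word not in stop_words)
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(candidates))

text = "GUIDANCE NOTE: UN RC and RCO; UN and UNDP, link HERE."
print(unique_acronyms(text))
```

Filtering after matching keeps the regex short, and the stop-word set can be loaded from a file without touching the pattern.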
