Extract the part of a filename that starts at the first underscore and ends at the first hyphen

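I have a list of files in a directory, and I want to build a simple regular expression for grep that extracts the part of each filename between the first underscore '_' and the first hyphen '-'.

For example: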

2180_PP AAA Radius Statistic-42005_04May2020_0900-04May2020_1000.csv
2180_SW Interface  Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv

The expected names are:

PP AAA Radius Statistic
SW Interface  Flow(3GPP AAA)

I found a similar pattern, but it does not quite work in my case:

echo '2180_SW Interface  Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv' | grep -oP '(?<=_)\d+(?=\-)'

Answer 1

From man grep:

  grep searches for PATTERNS in each FILE.  PATTERNS is one or more patterns separated by newline characters, and grep prints each line that matches a pattern.

 -o, --only-matching
              Print  only  the  matched  (non-empty) parts of a matching line,
              with each such part on a separate output line.

 -P, --perl-regexp
              Interpret   PATTERNS   as  Perl-compatible  regular  expressions
              (PCREs).  This option is experimental when combined with the  -z
              (--null-data)  option,  and  grep  -P  may warn of unimplemented features.

Running ls, I have:

'2180_PP AAA Radius Statistic-42005_04May2020_0900-04May2020_1000.csv'
'2180_SW Interface  Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv'

After running the command below, I get:

ls | grep -oP '(?<=_).*(?=\-\d\d\d)'
PP AAA Radius Statistic
SW Interface  Flow(3GPP AAA)

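Explanation of the regex

(?<= - Positive look-behind. Requires the enclosed pattern (here _) to appear
       immediately before the match, without including it in the result.

.    - Matches any character except a line break. With * it repeats
       zero or more times, as many as possible (greedy).

(?=  - Positive look-ahead. Requires the enclosed pattern to appear
       immediately after the match, without including it in the result.

\-   - Matches a literal hyphen

\d   - Matches a digit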
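
For comparison, the same idea can be written without look-arounds, which may help where grep -P is not available. This is only a minimal sketch, assuming a sed that supports -E (GNU and BSD sed do); the function name between_marks is hypothetical and not part of the original answer.

```shell
# Hypothetical helper: read names on stdin and print the part between
# the first '_' and the first '-' that follows it.
between_marks() {
  # ^[^_]*_   everything up to and including the first underscore
  # ([^-]*)   captured: a run of characters that are not hyphens
  # -.*       the first hyphen and the rest of the line
  # -n plus the p flag prints only lines where the substitution matched
  sed -nE 's/^[^_]*_([^-]*)-.*/\1/p'
}

printf '%s\n' \
  '2180_PP AAA Radius Statistic-42005_04May2020_0900-04May2020_1000.csv' \
  '2180_SW Interface  Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv' |
  between_marks
```

Because the pattern is anchored at the start of the line and the capture cannot contain a hyphen, it does not rely on what follows the hyphen.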

The regex explanation above is taken from a regular-expression reference.

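Why might you get a different result?

The input contains another candidate match (-14May), so I used \-\d\d\d, which requires three digits after the hyphen, to rule it out.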
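
If the names are being handled in a shell loop anyway, a regex-free alternative is POSIX parameter expansion, which stops at the first hyphen, so the later -14May never comes into play. This is a sketch under assumptions: the function name name_part is hypothetical, and the loop over *.csv assumes it is run in the directory holding the files.

```shell
# Hypothetical helper: print the text between the first '_' and the
# first '-' after it, using parameter expansion only.
name_part() {
  rest=${1#*_}                 # drop the shortest prefix ending in '_' (through the first underscore)
  printf '%s\n' "${rest%%-*}"  # drop the longest suffix starting with '-' (from the first hyphen on)
}

# Assumed usage: apply it to every .csv file in the current directory.
for f in *.csv; do
  [ -e "$f" ] || continue      # skip the unexpanded pattern when no .csv files exist
  name_part "$f"
done
```

Unlike the grep pattern, this does not depend on three digits following the hyphen, but it also does no validation: a name without '_' or '-' is printed with only the parts that could be stripped removed.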
