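I have a list of files in a directory, and I want to build a simple regular expression for grep that extracts the part of each file name between the first underscore ('_') and the first hyphen ('-').
For example: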
2180_PP AAA Radius Statistic-42005_04May2020_0900-04May2020_1000.csv
2180_SW Interface Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv
The expected output is:
PP AAA Radius Statistic
SW Interface Flow(3GPP AAA)
I found a similar pattern, but it does not quite work in my case:
echo '2180_SW Interface Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv' | grep -oP '(?<=_)\d+(?=\-)'
Answer 1
man grep says:
grep searches for PATTERNS in each FILE. PATTERNS is one or more patterns separated by newline characters, and grep prints each line that matches a pattern.
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
-P, --perl-regexp
Interpret PATTERNS as Perl-compatible regular expressions
(PCREs). This option is experimental when combined with the -z
(--null-data) option, and grep -P may warn of unimplemented features.
Running ls in my directory shows:
'2180_PP AAA Radius Statistic-42005_04May2020_0900-04May2020_1000.csv'
'2180_SW Interface Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv'
Running the command below produces the output shown after it:
ls | grep -oP '(?<=_).*(?=\-\d\d\d)'
PP AAA Radius Statistic
SW Interface Flow(3GPP AAA)
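As a side note, the same extraction can be sketched without ls or grep -P by using shell parameter expansion. This is a different technique from the regex in this answer, and extract_name is a hypothetical helper name introduced only for illustration. It takes the request literally (first underscore to the first hyphen after it), so it assumes the wanted part of the name never contains a hyphen itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original answer): print the text between
# the first '_' and the first '-' that follows it.
extract_name() {
    local name=$1
    name=${name#*_}    # remove the shortest prefix ending in '_' (up to the first underscore)
    name=${name%%-*}   # remove the longest suffix starting with '-' (from the first hyphen on)
    printf '%s\n' "$name"
}

extract_name '2180_PP AAA Radius Statistic-42005_04May2020_0900-04May2020_1000.csv'
extract_name '2180_SW Interface Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv'
```

For the two sample names this prints PP AAA Radius Statistic and SW Interface Flow(3GPP AAA). To apply it to real files, a loop such as for f in *.csv; do extract_name "$f"; done avoids parsing ls output.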
Explanation of the regex used in the grep command:
(?<=_) - Positive look-behind: the match must be preceded by _, but the _ itself is not included in the result
.* - Matches any character except a line break, as many times as possible (greedy)
(?=...) - Positive look-ahead: the match must be followed by the given pattern, which is not included in the result
\- - Matches a literal - character
\d - Matches a single digit
The regex explanations above are adapted from a regular-expression reference.
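Why might you get a different result?
The input contains a second hyphen further along the name (the one in -14May). Because .* is greedy, a look-ahead for a plain - would extend the match to that last hyphen. Requiring three digits after the hyphen (\-\d\d\d) rules that out, since only the hyphen in front of the numeric ID is followed by digits.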
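The effect of the extra digits in the look-ahead can be checked with a small comparison. This is a minimal sketch that assumes GNU grep built with PCRE support (needed for -P); the variable name f is only illustrative.

```shell
#!/usr/bin/env bash
f='2180_SW Interface Flow(3GPP AAA)-53448_14May2020_0000-14May2020_0100.csv'

# Look-ahead for a plain hyphen: the greedy .* extends to the LAST hyphen,
# so the numeric ID and the first date are pulled into the match.
printf '%s\n' "$f" | grep -oP '(?<=_).*(?=-)'

# Look-ahead for a hyphen followed by three digits: only the hyphen in front
# of 53448 qualifies, so the match stops at the end of the descriptive name.
printf '%s\n' "$f" | grep -oP '(?<=_).*(?=-\d\d\d)'
```

The second pattern relies on the numeric ID after the name having at least three digits and on the date fields starting with at most two digits, as in the sample names.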