grep from forward slash to forward slash

Answer

With grep:

grep -P -o '/.*?/[0-9]+/'

With sed:

sed -E 's|["]*(/.*?)(/[0-9]+/).*|\1\2|'
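To see what the grep variant above actually prints, here is a small sketch that runs it against the sample lines from the file.txt shown further down in this thread. It assumes GNU grep built with PCRE support (for -P); the variable names are only for illustration. The sed variant is left out on purpose: POSIX extended regular expressions do not define lazy quantifiers, so how .*? behaves under sed -E depends on the implementation.

```shell
# Sample lines copied from the file.txt shown in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
"/index.php/pub/xx/en/details/123456/"
"/index.php/pub/xx/en/details/993455/xxx/ff/3e/"
"/index.php/pub/xx/en/details/74939300/"
"/index.php/pub/xx/en/details/9584443/"
"/index.php/pub/xx/en/details/9583832/cdf/dr/wwe/"
EOF

# The lazy .*? stops at the first all-digit component, so every line
# matches and anything after the number is cut off.
first_numeric=$(grep -P -o '/.*?/[0-9]+/' "$sample")
printf '%s\n' "$first_numeric"

rm -f "$sample"
```

Note that this prints one path for each of the five lines, truncated right after the numeric component; it does not skip lines that carry extra components after the number.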

Answer

cat file.txt
"/index.php/pub/xx/en/details/123456/"
"/index.php/pub/xx/en/details/993455/xxx/ff/3e/"
"/index.php/pub/xx/en/details/74939300/"
"/index.php/pub/xx/en/details/9584443/"
"/index.php/pub/xx/en/details/9583832/cdf/dr/wwe/"

grep -Po '"\K[^"]+?/\d+/(?=")' file.txt
/index.php/pub/xx/en/details/123456/
/index.php/pub/xx/en/details/74939300/
/index.php/pub/xx/en/details/9584443/

Explanation:

-Po             # -P: Perl-compatible regex; -o: print only the matched text
"               # a double quote
\K              # reset the match start, so the quote is not printed
[^"]+?          # 1 or more non-double-quote characters, not greedy
/               # a slash
\d+             # 1 or more digits
/               # a slash
(?=")           # positive lookahead: a double quote must follow (it is not part of the match)
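If grep -P is not available, roughly the same filter can be sketched with sed. This is an alternative, not part of the answer above: it assumes each line is exactly one double-quoted path, and it assumes a sed that accepts -E (GNU and BSD sed both do). The variable names are hypothetical.

```shell
# Sample lines copied from the file.txt shown above.
lines='"/index.php/pub/xx/en/details/123456/"
"/index.php/pub/xx/en/details/993455/xxx/ff/3e/"
"/index.php/pub/xx/en/details/74939300/"
"/index.php/pub/xx/en/details/9584443/"
"/index.php/pub/xx/en/details/9583832/cdf/dr/wwe/"'

# -n suppresses the default output; the trailing p prints only lines where
# the substitution succeeded, i.e. lines whose last component is all digits.
# The capture group keeps the path and drops the surrounding quotes.
ends_in_number=$(printf '%s\n' "$lines" | sed -nE 's|^"(/[^"]*/[0-9]+/)"$|\1|p')
printf '%s\n' "$ends_in_number"
```

Anchoring with ^ and $ plays the role that \K and the lookahead play in the PCRE version: the quotes must be present, but they stay outside the captured text.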

Answer

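You say you want to grep "forward slash to forward slash". I take that to mean you want everything from the first slash through the last slash, dropping any characters before the first slash and after the last one (i.e., the " quotes, in your data). Without any explanation, you indicate that you want only the lines that have exactly six pathname components; i.e., seven slashes.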

One command, using PCRE:

grep -Po '(?<=")/([^/]*/){6}(?=")' file.txt

Two commands, without PCRE:

grep -E '"/([^/]*/){6}"' file.txt | grep -o '/.*/'
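As a quick sanity check, the sketch below runs both variants against the sample file.txt from this thread and also counts the slashes on each line with awk. It assumes GNU grep with PCRE support for the first command; the awk and paste step and all variable names are additions for illustration only.

```shell
# Sample lines copied from the file.txt shown in this thread.
paths=$(mktemp)
cat > "$paths" <<'EOF'
"/index.php/pub/xx/en/details/123456/"
"/index.php/pub/xx/en/details/993455/xxx/ff/3e/"
"/index.php/pub/xx/en/details/74939300/"
"/index.php/pub/xx/en/details/9584443/"
"/index.php/pub/xx/en/details/9583832/cdf/dr/wwe/"
EOF

# One command, PCRE: lookbehind and lookahead keep the quotes out of the match.
pcre_out=$(grep -Po '(?<=")/([^/]*/){6}(?=")' "$paths")

# Two commands, no PCRE: select the lines first, then cut out slash-to-slash.
plain_out=$(grep -E '"/([^/]*/){6}"' "$paths" | grep -o '/.*/')

# Slashes per line: with / as the field separator, slashes = fields - 1.
slash_counts=$(awk -F/ '{ print NF - 1 }' "$paths" | paste -s -d ' ' -)

printf '%s\n' "$pcre_out"
printf 'slashes per line: %s\n' "$slash_counts"

rm -f "$paths"
```

Both variants should print the same three paths. The lines with seven slashes are the ones kept, and the two lines with extra trailing components have ten slashes each.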
