I have some lines in a file, for example:
This is one word1:word2 of the lines
This is another word3:word4 of the lines
Line without a match
Yet another line word5:word6 for test
I need to grep for ":" and return the words immediately before and after it.
The output I need from grepping the lines above is:
word1:word2
word3:word4
word5:word6
Answer 1
Using GNU grep:
start cmd:> echo "This is one word1:word2 of the lines" |
grep -Eo '[[:alnum:]]+:[[:alnum:]]+'
word1:word2
start cmd:> echo "This is one wordx:wordy of the lines" |
grep -Eo '[[:alpha:]]*:[[:alpha:]]*'
wordx:wordy
start cmd:> echo "This is one wo_rdx:wo_rdy of the lines" |
grep -Eo '[[:alpha:]_]*:[[:alpha:]_]*'
wo_rdx:wo_rdy
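One thing to watch with the "*" variants above: "*" also matches zero characters, so a colon with nothing word-like on either side still counts as a match. A minimal sketch comparing the two quantifiers, using a sample line invented here for illustration (it is not from the question):

```shell
# Sample line invented for illustration: one bare colon and one word pair.
line='note : this has key:value in it'

# '*' allows empty words, so the lone ':' is reported too.
star=$(printf '%s\n' "$line" | grep -Eo '[[:alpha:]]*:[[:alpha:]]*')

# '+' requires at least one character on each side of the colon.
plus=$(printf '%s\n' "$line" | grep -Eo '[[:alpha:]]+:[[:alpha:]]+')

printf 'with *:\n%s\n' "$star"
printf 'with +:\n%s\n' "$plus"
```

If bare colons can occur in the input, the "+" form is probably the safer choice.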
Answer 2
POSIXly (though beware that some tr implementations, GNU's for example, do not handle multi-byte characters correctly):
tr -s '[:space:]_' '[\n*]' << 'EOF' |
grep -xE '[[:alnum:]_]+:[[:alnum:]_]+'
This is one word1:word2 of the lines and another is word:word
This is another word3:word4 of the lines and this is not wordnot::wordnot
Line without a match
Yet another line word5:word6 for test
This is one wo_rdx:wo_rdy of the lines
This is one wordx:wordy of the lines
not/a:match
EOF
gives:
word1:word2
word:word
word3:word4
word5:word6
rdx:wo
wordx:wordy
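The rdx:wo line in that output appears because tr was also told to break on "_", so wo_rdx:wo_rdy is split into wo, rdx:wo and rdy before grep sees it. If underscores should instead count as part of a word, one possible variant is to split on whitespace only. This is a sketch under that assumption about the intent, not part of the answer above, and the three input lines are a subset chosen for illustration:

```shell
# Variant sketch: split on whitespace only, so '_' stays inside the word.
result=$(printf '%s\n' \
  'This is one wo_rdx:wo_rdy of the lines' \
  'this is not wordnot::wordnot' \
  'not/a:match' |
  tr -s '[:space:]' '[\n*]' |
  grep -xE '[[:alnum:]_]+:[[:alnum:]_]+')

# -x makes grep accept a token only if the whole token is word:word.
printf '%s\n' "$result"
```

With these lines only wo_rdx:wo_rdy should be printed: wordnot::wordnot has two colons and not/a:match contains a slash, so neither passes the whole-line match.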
Answer 3
To cover all the cases in your desired result, you can use GNU grep with PCRE support (-P) and its word-character class (\w), like this:
grep -oP '\w+:\w+' file
Input file:
This is one word1:word2 of the lines and another is word:word
This is another word3:word4 of the lines and this is not wordnot::wordnot
Line without a match
Yet another line word5:word6 for test
This is one wo_rdx:wo_rdy of the lines
This is one wordx:wordy of the lines
Output:
word1:word2
word:word
word3:word4
word5:word6
wo_rdx:wo_rdy
wordx:wordy
As you can see, grep does not match wordnot::wordnot, because there is an extra : between the two words.
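Not every grep is built with -P. For plain ASCII input, \w can be spelled as the bracket expression [[:alnum:]_] and used with -E instead. A rough sketch on a few of the lines above, assuming a grep that supports -o (GNU and BSD grep do; it is not required by POSIX):

```shell
# ERE spelling of \w+:\w+ for ASCII input; needs -o but not -P.
ere=$(printf '%s\n' \
  'one word1:word2 and another is word:word' \
  'this is not wordnot::wordnot' \
  'This is one wo_rdx:wo_rdy of the lines' |
  grep -oE '[[:alnum:]_]+:[[:alnum:]_]+')

# wordnot::wordnot is skipped here too, since no word follows the first ':'.
printf '%s\n' "$ere"
```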
Answer 4
With grep:
grep -oP '[^:\s]+:[^:\s]+' file
or
grep -oP '\S+?:\S+' file
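The commands above pick up not only strings like foo:bar but also ones such as ?foo:bar?, since any non-space character is accepted around the colon.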
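The two patterns are not quite interchangeable, though. The first one excludes ":" from both sides, while the lazy \S+? version only stops at the first colon and then lets \S+ swallow any further ones. A small sketch with sample lines made up for illustration, assuming a grep built with PCRE support:

```shell
# Sample lines invented for illustration.
sample=$(printf '%s\n' 'see ?foo:bar? here' 'this is not wordnot::wordnot')

# Runs of non-colon, non-space characters on either side of a single ':'.
strict=$(printf '%s\n' "$sample" | grep -oP '[^:\s]+:[^:\s]+')

# Lazy \S+? up to the first ':', then everything up to the next whitespace.
lazy=$(printf '%s\n' "$sample" | grep -oP '\S+?:\S+')

printf 'strict:\n%s\n' "$strict"
printf 'lazy:\n%s\n' "$lazy"
```

Both report ?foo:bar?, but only the lazy pattern also reports wordnot::wordnot, so the first form is likely the better fit when doubled colons must be rejected.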