Excluding the characters before a certain character in sed

Answer 1

$ sed -r 's/.* ([^ ]+\.[^ ]+).* ([^ ]+)$/\1 \2/' orange
orange.5678 you

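Explanation

-r: use extended regular expressions
s/old/new/: replace old with new
.*: any number of any characters
(some characters): save some characters so the replacement can refer to them later
[^ ]+: one or more characters that are not spaces
\.: a literal dot
$: end of line
\1: backreference to the first saved pattern (\2 refers to the second)

So s/.* ([^ ]+\.[^ ]+).* ([^ ]+)$/\1 \2/ means: match anything on the line up to a space that precedes some non-space characters, a literal ., and some more non-space characters (saving the characters on both sides of the ., together with the . itself); then match any characters and save the last run of non-space characters on the line; finally, replace the whole match with the two saved patterns separated by a space.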
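
To try the substitution without creating a file, the command can be wrapped in a small function that reads standard input. This is only a sketch: the function name extract_fields is made up for illustration, and -E is used as the portable spelling of GNU sed's -r.

```shell
# Hypothetical helper: keep the dotted word and the last word of each line.
# -E enables extended regular expressions (same effect as -r in GNU sed).
extract_fields() {
  sed -E 's/.* ([^ ]+\.[^ ]+).* ([^ ]+)$/\1 \2/'
}

# Feed one sample line through the substitution.
printf '%s\n' 'apple orange.5678 dog cat 009 you' | extract_fields
# prints: orange.5678 you
```

Because the pattern requires a space before the dotted word, a line that starts with the dotted word is left unchanged by this expression.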

Answer 2

The simplest way:

awk '{print $2, $6}' file.txt

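If your actual use case is more complicated than your question indicates and you need additional logic (for example, if it is not always the second and sixth fields that you need), edit your question to clarify.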
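
As one possible form of that additional logic, the sketch below assumes the wanted fields are "the first field containing a dot" and "the last field", rather than fixed positions. The function name dotted_and_last is hypothetical.

```shell
# Hypothetical helper: print the first field that contains a dot,
# followed by the last field of the line ($NF).
dotted_and_last() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /\./) { print $i, $NF; break }
    }
  }'
}

printf '%s\n' 'apple orange.5678 dog cat 009 you' | dotted_and_last
# prints: orange.5678 you
```

Lines with no dotted field produce no output in this sketch.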

Answer 3

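People should look at the other answer, by @Zanna. It is very elegant and shows the power of regular expressions.

Try this expression with gawk; plain awk has no way to retrieve captured groups. The expression is written in PCRE syntax (the (?:...) groups are non-capturing), so it may need adapting for gawk.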

^(?:\w+\s){0,}(\w+\.\w+)(?:\s\w+){0,}\s(\w+)$

It works for the following variations:

apple orange.5678 dog cat 009 you
apple apple grape.9991 pig cat piegon owl
grape.9991 pig cat piegon owl
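
The three sample lines above can be checked with a rough translation of the pattern. Note the swap: this sketch uses sed -E instead of gawk, replaces \w and \s with POSIX character classes, and turns the non-capturing groups into ordinary ones, so the dotted word becomes \2 and the last word \4. The function name pattern_via_sed is hypothetical.

```shell
# Hypothetical helper: POSIX-class translation of the pattern for sed -E.
# sed has no non-capturing groups, so groups 1 and 3 are captured but unused.
pattern_via_sed() {
  sed -E 's/^([[:alnum:]_]+[[:space:]])*([[:alnum:]_]+\.[[:alnum:]_]+)([[:space:]][[:alnum:]_]+)*[[:space:]]([[:alnum:]_]+)$/\2 \4/'
}

printf '%s\n' \
  'apple orange.5678 dog cat 009 you' \
  'apple apple grape.9991 pig cat piegon owl' \
  'grape.9991 pig cat piegon owl' | pattern_via_sed
# prints:
# orange.5678 you
# grape.9991 owl
# grape.9991 owl
```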

Here is a description of the expression.

/^(?:\w+\s){0,}(\w+\.\w+)(?:\s\w+){0,}\s(\w+)$/g
^ asserts position at start of the string

Non-capturing group (?:\w+\s){0,}
{0,} Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\w+ matches any word character (equal to [a-zA-Z0-9_])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s matches any whitespace character (equal to [\r\n\t\f\v ])

1st Capturing Group (\w+\.\w+)
\w+ matches any word character (equal to [a-zA-Z0-9_])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
\. matches the character . literally (case sensitive)
\w+ matches any word character (equal to [a-zA-Z0-9_])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)

Non-capturing group (?:\s\w+){0,}
{0,} Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\w+ matches any word character (equal to [a-zA-Z0-9_])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s matches any whitespace character (equal to [\r\n\t\f\v ])

2nd Capturing Group (\w+)
\w+ matches any word character (equal to [a-zA-Z0-9_])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
$ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)

Answer 4

If you must use sed with a regular expression, the answers above have you covered. If you are open to alternatives:

gv@debian: $ read -r a b c d e f<<<"apple orange.5678 dog cat 009 you" && echo "$b $f" 
orange.5678 you

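If this is a line in a file, replace <<<"...." with <file.

This method relies on the default IFS, which splits on spaces (as well as tabs and newlines). If in doubt, set IFS=" " at the start.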
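
For input with more than one line, the same idea can be put in a loop, with IFS set only for the read command so the rest of the shell is unaffected. This is a minimal sketch; the function name second_and_sixth is hypothetical.

```shell
# Hypothetical helper: split each input line on spaces and print fields 2 and 6.
# IFS=' ' applies to this read only. The last variable (f) receives the rest
# of the line, so any fields beyond the sixth end up in f as well.
second_and_sixth() {
  while IFS=' ' read -r a b c d e f; do
    printf '%s %s\n' "$b" "$f"
  done
}

printf '%s\n' 'apple orange.5678 dog cat 009 you' | second_and_sixth
# prints: orange.5678 you
```

To process a file, redirect it into the function, for example second_and_sixth <file.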
