Find a matching URL in a line and shorten it to the domain name

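I have a log file that I am writing a script for, so that only certain fields of the output are displayed. The last piece I need is to shorten the URLs so that each line stops once it reaches ".com", ".edu", ".org", and so on. Is there a way to do this with grep, or should I be looking at another command?

Sample output: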

student1234 "GET https://www.noname.com:443/login"
student4567 "GET http:// www.noip.edu:80/start/noname"
student8901 "GET http:// www.testing.org:80/search/change"

What I need is:

student1234 "GET https://www.noname.com
student4567 "GET http:// www.noip.edu
student8901 "GET http:// www.testing.org

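Answer 1

There are plenty of options; pick whichever you like. Each command below keeps everything up to the second colon on the line, which in the sample data is the colon in front of the port number.

Using grep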

grep -o '^[^:]\+:[^:]\+' file.txt

Using cut

cut -d: -f1-2 file.txt

Using awk

awk -F: '{ print $1 ":" $2 }' file.txt

Using sed

sed 's/^\([^:]\+:[^:]\+\).*/\1/' file.txt

Using the shell

while IFS=: read -r i j k; do echo "$i:$j"; done <file.txt

Using perl

perl -pe 's/^([^:]+:[^:]+).*/$1/' file.txt
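All of the commands above depend on every URL carrying an explicit port, since they cut at the second colon. As a rough sketch for logs where the port may be missing, the line can instead be cut right after the host name. The function name `shorten_to_domain` is made up for this illustration, and the pattern assumes URLs of the form `http://host` or `https://host` (optionally with a space after `//`, as in the sample data) followed by `:port`, `/path`, or a quote.

```shell
#!/bin/sh
# Hypothetical helper: trim each line right after the host name.
# Assumption: the URL is scheme://host, optionally with a space after "//",
# and the host ends at the first ":", "/", space, or double quote.
shorten_to_domain() {
    sed -E 's#(https?:// ?[^/:" ]+).*#\1#' "$@"
}

# Lines with and without an explicit port are cut at the same place.
printf '%s\n' \
    'student1234 "GET https://www.noname.com:443/login"' \
    'student4567 "GET http:// www.noip.edu/start/noname"' |
    shorten_to_domain
```

With no arguments the function reads standard input; it can also be called with a file name, for example `shorten_to_domain file.txt`.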

Example:

$ grep -o '^[^:]\+:[^:]\+' file.txt
student1234 "GET https://www.noname.com
student4567 "GET http:// www.noip.edu
student8901 "GET http:// www.testing.org

$ cut -d: -f1-2 file.txt
student1234 "GET https://www.noname.com
student4567 "GET http:// www.noip.edu
student8901 "GET http:// www.testing.org

$ awk -F: '{ print $1 ":" $2 }' file.txt
student1234 "GET https://www.noname.com
student4567 "GET http:// www.noip.edu
student8901 "GET http:// www.testing.org

$ sed 's/^\([^:]\+:[^:]\+\).*/\1/' file.txt
student1234 "GET https://www.noname.com
student4567 "GET http:// www.noip.edu
student8901 "GET http:// www.testing.org

$ while IFS=: read -r i j k; do echo "$i:$j"; done <file.txt
student1234 "GET https://www.noname.com
student4567 "GET http:// www.noip.edu
student8901 "GET http:// www.testing.org

$ perl -pe 's/^([^:]+:[^:]+).*/$1/' file.txt
student1234 "GET https://www.noname.com
student4567 "GET http:// www.noip.edu
student8901 "GET http:// www.testing.org
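
To check that two of these approaches really agree on your own data, one option is to compare their output directly. The sketch below recreates the sample input in a temporary file, treats `cut` as the reference, and compares it with an awk command that prints the colon between the two fields. The variable names are arbitrary.

```shell
#!/bin/sh
# Recreate the sample input in a temporary file (the path is arbitrary).
tmp=$(mktemp)
cat >"$tmp" <<'EOF'
student1234 "GET https://www.noname.com:443/login"
student4567 "GET http:// www.noip.edu:80/start/noname"
student8901 "GET http:// www.testing.org:80/search/change"
EOF

# Use cut as the reference result, then compare another command against it.
expected=$(cut -d: -f1-2 "$tmp")
actual=$(awk -F: '{ print $1 ":" $2 }' "$tmp")

if [ "$expected" = "$actual" ]; then
    echo "outputs match"
else
    echo "outputs differ"
fi
rm -f "$tmp"
```

Swapping the awk line for any of the other commands gives the same kind of check for that command.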
