I have a log file containing the following text:
Jan 10 09:56:17 1484207777.225918 GET "8.8.8.8" "curl/7.27.0" #0121484207777.226639 GET "8.8.8.9" "curl/7.21.0" #0121484207777.226639 GET "8.8.5.9" "curl/7.22.0"
Jan 10 19:59:17 1484207777.225456 GET "8.8.6.8" "curl/7.24.0" #0121484207777.226639 GET "8.8.5.9" "curl/7.21.0" #0121484207777.226425 GET "8.8.5.9" "curl/7.22.0"
I need to replace each "#" with a newline (\n) and prepend the date/time taken from the start of that line.
The result I need is:
Jan 10 09:56:17 1484207777.225918 GET "8.8.8.8" "curl/7.27.0"
Jan 10 09:56:17 0121484207777.226639 GET "8.8.8.9" "curl/7.21.0"
Jan 10 09:56:17 0121484207777.226639 GET "8.8.5.9" "curl/7.22.0"
Jan 10 19:59:17 1484207777.225456 GET "8.8.6.8" "curl/7.24.0"
Jan 10 19:59:17 0121484207777.226639 GET "8.8.5.9" "curl/7.21.0"
Jan 10 19:59:17 0121484207777.226425 GET "8.8.5.9" "curl/7.22.0"
I tried using sed, but it did not work:
for a in $(cat logs)
do
b=$(cat logs | awk '{print $1, $2, $3}')
echo "$a" | sed 's/#/\n"$b"/g'
done
Can you help me with this task?
Answer 1
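If your date field is followed by multiple spaces, and the other fields are separated by single spaces as in your example, then you can use a run of two or more spaces as the field separator and split the rest of the line on "#":
$ awk -F'  +' '{n = split($2,a,"#"); for (i=1;i<=n;i++) print $1,a[i]}' log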
Jan 10 09:56:17 1484207777.225918 GET "8.8.8.8" "curl/7.27.0"
Jan 10 09:56:17 0121484207777.226639 GET "8.8.8.9" "curl/7.21.0"
Jan 10 09:56:17 0121484207777.226639 GET "8.8.5.9" "curl/7.22.0"
Jan 10 19:59:17 1484207777.225456 GET "8.8.6.8" "curl/7.24.0"
Jan 10 19:59:17 0121484207777.226639 GET "8.8.5.9" "curl/7.21.0"
Jan 10 19:59:17 0121484207777.226425 GET "8.8.5.9" "curl/7.22.0"
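For comparison, the same split-and-reprint idea can be sketched in Python. This is only an illustration, not part of the awk answer: the function name split_records is made up here, and instead of relying on the double-space separator it assumes the timestamp is always the first three whitespace-separated words of the line.

```python
def split_records(line):
    """Split one log line at each '#' and prefix every piece with the timestamp.

    Assumption: the timestamp is the first three whitespace-separated
    words of the line (e.g. 'Jan 10 09:56:17').
    """
    words = line.split()
    stamp = " ".join(words[:3])
    # Everything after the timestamp, rejoined with single spaces.
    rest = " ".join(words[3:])
    # One output record per '#'-separated piece, each carrying the timestamp.
    return [stamp + " " + piece.strip() for piece in rest.split("#")]


if __name__ == "__main__":
    sample = ('Jan 10 09:56:17 1484207777.225918 GET "8.8.8.8" "curl/7.27.0" '
              '#0121484207777.226639 GET "8.8.8.9" "curl/7.21.0"')
    for record in split_records(sample):
        print(record)
```

Returning a list rather than printing directly keeps the splitting logic separate from the output, which makes it easier to check against the expected result.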
More generally, you can replace each # with a newline followed by the first three fields, as follows:
$ awk '{gsub(/#/, sprintf("\n%s %s %s ", $1, $2, $3))} 1' log
Jan 10 09:56:17 1484207777.225918 GET "8.8.8.8" "curl/7.27.0"
Jan 10 09:56:17 0121484207777.226639 GET "8.8.8.9" "curl/7.21.0"
Jan 10 09:56:17 0121484207777.226639 GET "8.8.5.9" "curl/7.22.0"
Jan 10 19:59:17 1484207777.225456 GET "8.8.6.8" "curl/7.24.0"
Jan 10 19:59:17 0121484207777.226639 GET "8.8.5.9" "curl/7.21.0"
Jan 10 19:59:17 0121484207777.226425 GET "8.8.5.9" "curl/7.22.0"
Answer 2
A small Python script can do the job:
#!/usr/bin/env python
from __future__ import print_function
import sys

for line in sys.stdin:
    # Newline, then the first three words (the date/time), then a space
    # so the timestamp does not run into the rest of the record.
    timestamp = "\n" + " ".join(line.strip().split()[0:3]) + " "
    print(line.replace('#', timestamp), end="")
And a demonstration of it at work:
$ ./break_lines.py < input.txt
Jan 10 09:56:17 1484207777.225918 GET "8.8.8.8" "curl/7.27.0"
Jan 10 09:56:17 0121484207777.226639 GET "8.8.8.9" "curl/7.21.0"
Jan 10 09:56:17 0121484207777.226639 GET "8.8.5.9" "curl/7.22.0"
Jan 10 19:59:17 1484207777.225456 GET "8.8.6.8" "curl/7.24.0"
Jan 10 19:59:17 0121484207777.226639 GET "8.8.5.9" "curl/7.21.0"
Jan 10 19:59:17 0121484207777.226425 GET "8.8.5.9" "curl/7.22.0"
How it works is simple: we split the line into words, take the first 3 words, and join them into a single string with a newline in front and a space behind. After that we just replace every # with that new string, and we are done!
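As a variation on the script above, the replacement can be wrapped in a small function so that it is easy to check in isolation. This is a sketch rather than part of the original answer: the name add_timestamps is hypothetical, and it assumes the first three words of every line form the timestamp. It also swallows the space left in front of each #, so the output lines carry no trailing whitespace.

```python
import re
import sys


def add_timestamps(line):
    """Return `line` with every '#' turned into a newline plus the timestamp."""
    stamp = " ".join(line.split()[:3])
    # Match any spaces/tabs before '#' too, so the previous record does not
    # end in whitespace. A function is used as the replacement so the stamp
    # is inserted literally, with no backslash or group processing.
    return re.sub(r"[ \t]*#", lambda match: "\n" + stamp + " ", line)


if __name__ == "__main__":
    # Same usage as the script above: read stdin, write the expanded lines.
    for line in sys.stdin:
        sys.stdout.write(add_timestamps(line))
```

Lines that contain no # pass through unchanged, so the function can be applied to a mixed log without special-casing.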