Delete lines older than X days from a web server log file?


I am running Nginx on Ubuntu with the default "main" log format, which produces output like this:

95.108.181.102 - - [11/Feb/2018:11:43:10 +0000] "GET /blog/ HTTP/1.1" 200 4438 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" "-"

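I have a single main log file that is never rotated, and I use it with GoAccess (log parsing and reporting software). I would like to delete the lines in this file whose log entries are more than roughly 30 days old. Can this be done, preferably with a bash one-liner?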

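I plan to add this to an existing daily cron job so that it produces a rolling 30-day report. I was hoping to use something like the following, but I can't quite get it to parse the log correctly:

sed -i '/<magical-invocation-goes-here> --date="-30 days"/d' example.log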

Answer 1

A GNU awk solution:

Sample test.log:

95.108.181.102 - - [11/Feb/2018:11:43:10 +0000] "GET /blog/ HTTP/1.1" 200 4438 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" "-"
95.108.181.102 - - [11/Aug/2017:11:43:10 +0000] "GET /blog/ HTTP/1.1" 200 4438 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" "-"
95.108.181.102 - - [01/Jan/2018:11:43:10 +0000] "GET /blog/ HTTP/1.1" 200 4438 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" "-"
95.108.181.102 - - [11/Feb/2018:11:43:10 +0000] "GET /blog/ HTTP/1.1" 200 4438 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" "-"

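awk -v m1_ago="$(date -d '-1 month' +%s)" \
'BEGIN{
     # map month abbreviations to month numbers (Jan -> 1, Feb -> 2, ...)
     split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", month);
     for (i in month) m_nums[month[i]] = i
 }
 # $4 is "[11/Feb/2018:11:43:10"; drop the "[" and split into day, month, year, h, m, s
 { split(substr($4,2), a, "[/:]") }
 # print only the lines whose timestamp is newer than one month ago
 mktime(sprintf("%d %d %d %d %d %d", a[3], m_nums[a[2]], a[1], a[4], a[5], a[6])) > m1_ago
' test.log > tmp_log && mv tmp_log test.log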

Final test.log contents:

95.108.181.102 - - [11/Feb/2018:11:43:10 +0000] "GET /blog/ HTTP/1.1" 200 4438 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" "-"
95.108.181.102 - - [11/Feb/2018:11:43:10 +0000] "GET /blog/ HTTP/1.1" 200 4438 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" "-"
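Since the question asks for a 30-day window inside a daily cron job, here is a variant sketch of the same idea, not the answer's exact method. Instead of gawk's mktime, it rewrites each timestamp as a sortable YYYYMMDDHHMMSS number and compares it with a cutoff from GNU date, so it should also run on awk implementations without time functions. The function name prune_log and its arguments are hypothetical. It assumes GNU date, log timestamps in UTC (+0000, as in the sample), and the timestamp in field 4.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prune_log FILE [DAYS]
# Keeps only the lines whose timestamp is newer than DAYS days ago (default 30).
# Assumptions: GNU date, timestamps in UTC, timestamp in field 4 of each line.
prune_log() {
    local file=$1 days=${2:-30} cutoff tmp
    # Cutoff as a sortable number, e.g. 20180112114310
    cutoff=$(date -u -d "-${days} days" +%Y%m%d%H%M%S) || return 1
    tmp=$(mktemp) || return 1
    awk -v cutoff="$cutoff" '
        BEGIN {
            # map month abbreviations to month numbers (Jan -> 1, ...)
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
            for (i = 1; i <= n; i++) month[names[i]] = i
        }
        {
            # $4 looks like "[11/Feb/2018:11:43:10"; drop the "[" and split
            split(substr($4, 2), t, "[/:]")
            stamp = sprintf("%04d%02d%02d%02d%02d%02d",
                            t[3], month[t[2]], t[1], t[4], t[5], t[6])
        }
        # print lines newer than the cutoff; unparsable lines become 0 and are dropped
        (stamp + 0) > (cutoff + 0)
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Overwrite in place rather than mv, so the inode and permissions are kept
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

A call such as prune_log /var/log/nginx/access.log 30 (the path is only an example) could then go into the existing cron script. Overwriting the file in place is a deliberate choice here: a server that keeps the log open would otherwise go on writing to the old, renamed file until it is told to reopen its logs. Lines written between the awk pass and the overwrite can still be lost, so this is best run at a quiet time.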
