/var/log/apache2/other_vhosts_access.log
Given a file like this:
example.com:443 1.1.1.1 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm12 HTTP/1.1" 200 1000 "-" "Mozilla/5.0 (Macintosh; Intel...
example.com:443 2.2.2.2 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm13 HTTP/1.1" 200 1000 "-" "Mozilla/5.0 (Macintosh; Intel...
example.com:443 33.33.33.33 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm14 HTTP/1.1" 200 1000 "-" "Mozilla/5.0 (Macintosh; Intel...
example.com:443 4.4.4.4 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm12 HTTP/1.1" 200 1000 "-" "Mozilla/5.0 (Macintosh; Intel...
How can I aggregate the IPs, grouped by URL?
Example:
/abc/def/ghi?token=jklm12
1.1.1.1
4.4.4.4
/abc/def/ghi?token=jklm13
2.2.2.2
/abc/def/ghi?token=jklm14
33.33.33.33
I know we can probably use awk to extract certain columns, but how do I do the "group by"?
Answer 1
awk '{a[$8]=a[$8] "\n\t" $2} END{for (url in a) print url, a[url]}' file
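The array a is initially empty. For every input line, the block
{a[$8]=a[$8] "\n\t" $2}
extends the value of the element a[$8] (keyed by the URL in the eighth whitespace-separated field) with a newline and a tab, followed by the second field, which is the client IP. The END block is executed only after the whole file has been parsed. For each key in the array, it prints the key (url) and the associated value (a[url]).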
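To make the steps easier to follow, here is the same program spread over several lines with comments. This is only a sketch: the temporary file stands in for the real log, the user-agent field is cut short because only fields 2 and 8 are used, and the names log and result are made up for the example.

```shell
#!/bin/sh
# Stand-in for the log file from the question (hypothetical temp file).
# The user-agent field is shortened; only fields 2 and 8 matter here.
log=$(mktemp)
cat > "$log" <<'EOF'
example.com:443 1.1.1.1 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm12 HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
example.com:443 2.2.2.2 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm13 HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
example.com:443 33.33.33.33 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm14 HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
example.com:443 4.4.4.4 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm12 HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
EOF

result=$(awk '
{
    url = $8                        # field 8: the requested URL
    ip  = $2                        # field 2: the client IP
    a[url] = a[url] "\n\t" ip       # append newline, tab and IP to the entry for this URL
}
END {
    # Runs once, after the last input line has been read.
    for (url in a)
        print url, a[url]
}
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

Because print url, a[url] puts the output field separator between its two arguments, each URL line ends with a single space; print url a[url] would avoid that.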
Output:
/abc/def/ghi?token=jklm14
	33.33.33.33
/abc/def/ghi?token=jklm12
	1.1.1.1
	4.4.4.4
/abc/def/ghi?token=jklm13
	2.2.2.2
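Note that the URLs above do not come out in input order: awk does not define the order in which for (url in a) visits the keys. If a predictable order matters, one possible alternative (not part of the answer above) is to let sort do the grouping and have awk print a heading whenever the URL changes. In this sketch the temporary file, the shortened user-agent field, and the names log and grouped are assumptions for the example.

```shell
#!/bin/sh
# Stand-in for the log file from the question (hypothetical temp file).
log=$(mktemp)
cat > "$log" <<'EOF'
example.com:443 1.1.1.1 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm12 HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
example.com:443 2.2.2.2 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm13 HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
example.com:443 33.33.33.33 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm14 HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
example.com:443 4.4.4.4 - - [25/Jan/2021:12:00:00 +0000] "GET /abc/def/ghi?token=jklm12 HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
EOF

# 1. Emit "URL IP" pairs.
# 2. Sort them so equal URLs become adjacent (-u also drops repeated pairs).
# 3. Print the URL once when it changes, then every IP indented by a tab.
grouped=$(awk '{ print $8, $2 }' "$log" |
    LC_ALL=C sort -u |
    awk '$1 != prev { print $1; prev = $1 } { print "\t" $2 }')

printf '%s\n' "$grouped"
rm -f "$log"
```

With the four sample lines this lists jklm12 first (1.1.1.1 and 4.4.4.4), then jklm13, then jklm14. The IPs within a URL are ordered as plain text, not numerically.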