How to extract IP addresses that appear more than X times in a file

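I want to extract the IP addresses that meet the following requirements:

  1. The log line contains foo
  2. The address appears more than 5 times in the file

Here is a sample of my log: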

2020/12/07 03:25:16 [error] 31385#31385: *4283 limiting requests, excess: 100.110 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4107 limiting requests, excess: 100.962 by zone "bar", client: 1.1.1.2, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4107 limiting requests, excess: 100.962 by zone "bar", client: 1.1.1.2, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31385#31385: *4164 limiting requests, excess: 100.102 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4107 limiting requests, excess: 100.962 by zone "bar", client: 1.1.1.2, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31384#31384: *2404 limiting requests, excess: 100.080 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4107 limiting requests, excess: 100.962 by zone "bar", client: 1.1.1.2, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31384#31384: *2321 limiting requests, excess: 100.062 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4107 limiting requests, excess: 100.962 by zone "bar", client: 1.1.1.2, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4220 limiting requests, excess: 100.020 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31385#31385: *4406 limiting requests, excess: 100.002 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31376#31376: *4172 limiting requests, excess: 100.996 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4190 limiting requests, excess: 100.988 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31376#31376: *2549 limiting requests, excess: 100.984 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4189 limiting requests, excess: 100.972 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4107 limiting requests, excess: 100.962 by zone "bar", client: 1.1.1.2, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4107 limiting requests, excess: 100.962 by zone "bar", client: 1.1.1.2, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"
2020/12/07 03:25:16 [error] 31386#31386: *4107 limiting requests, excess: 100.962 by zone "foo", client: 1.1.1.1, server: example.com, request: "POST /some-link HTTP/1.1", host: "www.example.com", referrer: "https://www.example.com/some-link"

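The result should be

1.1.1.1

1.1.1.2 should not be printed, because it does not belong to the foo zone.

I have already been able to list how many times each IP appears: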

grep -o "[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+" testfile | sort | uniq -c
     11 1.1.1.1
      7 1.1.1.2

But I am not sure how to require foo and then write the IPs that appear more than 5 times to a file.

Answer 1

Using GNU awk:

gawk '
  /zone "foo"/ && match($0, /client: ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)/,m) {
    count[m[1]]++
  } 
  END {
    for (client in count) {if (count[client] > 5) print client}
  }
' testfile
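
The three-argument form of match() used above is a gawk extension. Where only a POSIX awk is available, the same counting can be sketched with RSTART and RLENGTH instead. This is a minimal sketch, not part of the original answer: the function name count_foo_clients, its arguments, and the abbreviated sample lines are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: count clients in zone "foo" without gawk extensions.
# count_foo_clients is a hypothetical helper name.
count_foo_clients() {
    # $1 = log file, $2 = threshold (print clients seen more than this many times)
    awk -v min="$2" '
        /zone "foo"/ && match($0, /client: [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) {
            # RSTART and RLENGTH delimit the match; skip the 8-character "client: " prefix
            ip = substr($0, RSTART + 8, RLENGTH - 8)
            count[ip]++
        }
        END {
            for (ip in count) if (count[ip] > min) print ip
        }
    ' "$1"
}

# Demo with abbreviated lines in the question's log format (made up for this sketch).
log=$(mktemp)
for i in 1 2 3 4 5 6; do
    echo "limiting requests, excess: 100.1 by zone \"foo\", client: 1.1.1.1, server: example.com"
done > "$log"
for i in 1 2 3 4 5 6 7; do
    echo "limiting requests, excess: 100.9 by zone \"bar\", client: 1.1.1.2, server: example.com"
done >> "$log"
# One "foo" line for a third client, which stays below the threshold.
echo "limiting requests, excess: 100.5 by zone \"foo\", client: 1.1.1.3, server: example.com" >> "$log"

result=$(count_foo_clients "$log" 5)
echo "$result"
rm -f "$log"
```

With the demo data only 1.1.1.1 is printed: 1.1.1.2 is in zone "bar", and 1.1.1.3 appears just once.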

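Or using Miller (this is more specific, since it treats each entry as delimited key: value pairs and restricts the matches to the named fields excess and client, respectively):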

mlr --dkvp --fs ', ' --ps ': ' \
  filter '$excess =~ "zone \"foo\""' then \
  put -q '
    @count[$client] += 1;
    end {
      for (client in @count) {
        if (@count[client] > 5) {print client}
      }
    }
  ' testfile

Answer 2

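I modified your command line to read each line of its output and split the fields with cut; after that I only need to check whether the first field (the count) is greater than 5.

One-line example:

while read -r proc; do val=$(echo "$proc" | cut -d' ' -f1); ip=$(echo "$proc" | cut -d' ' -f2); if [ "$val" -gt 5 ]; then echo "$ip"; fi; done <<< "$(grep -o "[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+" ipList.txt | sort | uniq -c)"

Multi-line version, to make the syntax easier to follow:

while read -r proc
do
    # read strips the leading padding that uniq -c adds, so field 1 is the count
    val=$(echo "$proc" | cut -d' ' -f1)
    ip=$(echo "$proc" | cut -d' ' -f2)
    if [ "$val" -gt 5 ]
    then
        echo "$ip"
    fi
done <<< "$(grep -o "[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+" ipList.txt | sort | uniq -c)"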
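
The loop above counts every IP in the file, whatever its zone. One possible way to also require zone "foo" and write the result to a file is to filter with grep first and let awk apply the threshold to the uniq -c output. This is a rough sketch under assumptions: frequent_foo_ips and frequent_ips.txt are hypothetical names, and the pattern is anchored on "client: " on the assumption that this is the field that holds the address of interest.

```shell
#!/bin/sh
# Sketch only: frequent_foo_ips and frequent_ips.txt are hypothetical names.
frequent_foo_ips() {
    # $1 = log file, $2 = threshold (print IPs seen more than this many times)
    # Keep zone "foo" lines, pull out the client address, count, then filter by count.
    grep 'zone "foo"' "$1" |
        grep -Eo 'client: [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' |
        cut -d' ' -f2 |
        sort | uniq -c |
        awk -v min="$2" '$1 > min { print $2 }'
}

# Demo with abbreviated lines in the question's log format (made up for this sketch).
dir=$(mktemp -d)
for i in 1 2 3 4 5 6; do
    echo "limiting requests, excess: 100.1 by zone \"foo\", client: 1.1.1.1, server: example.com"
done > "$dir/testfile"
for i in 1 2 3 4 5 6 7; do
    echo "limiting requests, excess: 100.9 by zone \"bar\", client: 1.1.1.2, server: example.com"
done >> "$dir/testfile"

frequent_foo_ips "$dir/testfile" 5 > "$dir/frequent_ips.txt"
result=$(cat "$dir/frequent_ips.txt")
echo "$result"
rm -rf "$dir"
```

Because awk splits on runs of whitespace, the padding that uniq -c puts in front of the count needs no special handling.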
