Need help with awk/sed to identify/flag duplicate IP addresses

Good day.

I have a text file containing pod/node names and their associated IPv6 addresses. Two of the pods share the same IP address: the first pod, k8-worker0001c-cif-9d86d6dd4-vf9b9, and the last pod, k8-worker0001c-ctdc-5bc95b699f-xnmrn, both have 2001:1890:e00f:3900::6.

k8-worker0001c-cif-9d86d6dd4-vf9b9
         2001:1890:e00f:3900::4/64 global nodad
         2001:1890:e00f:3900::6/64 global 

k8-worker0001c-cifpartner-64c89f8bc8-8p5pq
         2001:1890:e00f:3900::10/64 global 

k8-worker0001c-ctd-7d759784ff-2gk5d
         2001:1890:e00f:3900::a/64 global nodad
         2001:1890:e00f:3900::d/64 global 

k8-worker0001c-ctd-7d759784ff-hd8jp
         2001:1890:e00f:3900::c/64 global 

k8-worker0001c-ctd-7d759784ff-qkk4t
         2001:1890:e00f:3900::8/64 global nodad
         2001:1890:e00f:3900::f/64 global 

k8-worker0001c-ctd-7d759784ff-t6lwz
         2001:1890:e00f:3900::5/64 global 

k8-worker0001c-ctd-7d759784ff-vl8x9
         2001:1890:e00f:3900::9/64 global nodad
         2001:1890:e00f:3900::b/64 global 

k8-worker0001c-ctdc-5bc95b699f-xnmrn
         2001:1890:e00f:3900::7/64 global nodad
         2001:1890:e00f:3900::6/64 global

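All I need is a one-liner that identifies the duplicate IP addresses while leaving everything else, including the pod names, intact. I tried the awk '!seen' idiom, but that deletes the duplicates, which is not what I want.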

So I would like something like this:

k8-worker0001c-cif-9d86d6dd4-vf9b9
         2001:1890:e00f:3900::4/64 global nodad
         2001:1890:e00f:3900::6/64 global          DUPLICATE!

k8-worker0001c-cifpartner-64c89f8bc8-8p5pq
         2001:1890:e00f:3900::10/64 global 

k8-worker0001c-ctd-7d759784ff-2gk5d
         2001:1890:e00f:3900::a/64 global nodad
         2001:1890:e00f:3900::d/64 global 

k8-worker0001c-ctd-7d759784ff-hd8jp
         2001:1890:e00f:3900::c/64 global 

k8-worker0001c-ctd-7d759784ff-qkk4t
         2001:1890:e00f:3900::8/64 global nodad
         2001:1890:e00f:3900::f/64 global 

k8-worker0001c-ctd-7d759784ff-t6lwz
         2001:1890:e00f:3900::5/64 global 

k8-worker0001c-ctd-7d759784ff-vl8x9
         2001:1890:e00f:3900::9/64 global nodad
         2001:1890:e00f:3900::b/64 global 

k8-worker0001c-ctdc-5bc95b699f-xnmrn
         2001:1890:e00f:3900::7/64 global nodad
         2001:1890:e00f:3900::6/64 global         DUPLICATE!

Thanks in advance, Bjorn

Answer 1

A 2-pass approach, using any awk in any shell on every Unix box:

$ awk '{sub(/\r$/,"")} NR==FNR{cnt[$1]++; next} {print $0 (NF && cnt[$1]>1 ? "\tDUPLICATE!" : "")}' file file
k8-worker0001c-cif-9d86d6dd4-vf9b9
         2001:1890:e00f:3900::4/64 global nodad
         2001:1890:e00f:3900::6/64 global       DUPLICATE!

k8-worker0001c-cifpartner-64c89f8bc8-8p5pq
         2001:1890:e00f:3900::10/64 global

k8-worker0001c-ctd-7d759784ff-2gk5d
         2001:1890:e00f:3900::a/64 global nodad
         2001:1890:e00f:3900::d/64 global

k8-worker0001c-ctd-7d759784ff-hd8jp
         2001:1890:e00f:3900::c/64 global

k8-worker0001c-ctd-7d759784ff-qkk4t
         2001:1890:e00f:3900::8/64 global nodad
         2001:1890:e00f:3900::f/64 global

k8-worker0001c-ctd-7d759784ff-t6lwz
         2001:1890:e00f:3900::5/64 global

k8-worker0001c-ctd-7d759784ff-vl8x9
         2001:1890:e00f:3900::9/64 global nodad
         2001:1890:e00f:3900::b/64 global

k8-worker0001c-ctdc-5bc95b699f-xnmrn
         2001:1890:e00f:3900::7/64 global nodad
         2001:1890:e00f:3900::6/64 global       DUPLICATE!

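The only assumption is that your pod names are unique: the script counts the first field of every line, so a pod name that appeared twice would be flagged as a duplicate too.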
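
For anyone who wants to see what the one-liner does step by step, the same two-pass logic can be spread out with comments. This is only a sketch: the function name mark_dups is made up for illustration, and it keeps the assumption that the first whitespace-separated field of each address line is the address (including its /64 suffix).

```shell
# mark_dups FILE: hypothetical wrapper around the two-pass awk command above.
# The file is passed to awk twice, so it must be a regular file, not a pipe.
mark_dups() {
    awk '
        # Strip a trailing carriage return in case of DOS line endings.
        { sub(/\r$/, "") }

        # Pass 1 (NR == FNR only while reading the first copy):
        # count how often each first field occurs, print nothing.
        NR == FNR { cnt[$1]++; next }

        # Pass 2: print every line unchanged; non-empty lines whose
        # first field was counted more than once get a tab and a tag.
        { print $0 (NF && cnt[$1] > 1 ? "\tDUPLICATE!" : "") }
    ' "$1" "$1"
}
```

Calling it as mark_dups file should print the same lines as the one-liner. Blank lines are counted in pass 1 as well, but the NF test keeps them from being tagged.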

Answer 2

Here is a shot at it with a Perl one-liner plus xargs, running perl twice:

perl -MRegexp::Common -lnE '
    $h{$&}++ if /($RE{net}{IPv6})/;
    END{ print grep { $h{$_} > 1 } keys %h}
' file | xargs -I{} perl -spe 's/$ip.*/$&\t\tDUPLICATE!/g' -- -ip={} file

Output

k8-worker0001c-cif-9d86d6dd4-vf9b9
         2001:1890:e00f:3900::4/64 global nodad
         2001:1890:e00f:3900::6/64 global     DUPLICATE!

k8-worker0001c-cifpartner-64c89f8bc8-8p5pq
         2001:1890:e00f:3900::10/64 global 

k8-worker0001c-ctd-7d759784ff-2gk5d
         2001:1890:e00f:3900::a/64 global nodad
         2001:1890:e00f:3900::d/64 global 

k8-worker0001c-ctd-7d759784ff-hd8jp
         2001:1890:e00f:3900::c/64 global 

k8-worker0001c-ctd-7d759784ff-qkk4t
         2001:1890:e00f:3900::8/64 global nodad
         2001:1890:e00f:3900::f/64 global 

k8-worker0001c-ctd-7d759784ff-t6lwz
         2001:1890:e00f:3900::5/64 global 

k8-worker0001c-ctd-7d759784ff-vl8x9
         2001:1890:e00f:3900::9/64 global nodad
         2001:1890:e00f:3900::b/64 global 

k8-worker0001c-ctdc-5bc95b699f-xnmrn
         2001:1890:e00f:3900::7/64 global nodad
         2001:1890:e00f:3900::6/64 global    DUPLICATE!

Requirements

You need the Perl module Regexp::Common.

You can install it with:

cpan Regexp::Common

or through your package manager:

apt install libregexp-common-perl

on Debian and its derivatives.

If you cannot install packages system-wide, you can use perlbrew to install the module as a regular user, or download the library as a regular user:

wget https://sputnick.fr/downloads/Regexp-Common.gz
tar xjvf Regexp-Common.gz
ls ./lib

Then you can use:

perl -I./lib -MRegexp::Common -lnE '
    $h{$&}++ if /($RE{net}{IPv6})/;
    END{ print grep { $h{$_} > 1 } keys %h}
' file | xargs -I{} perl -spe 's/$ip.*/$&    DUPLICATE!/g' -- -ip={} file

Answer 3

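Here is a very simple Perl-based approach. It assumes the input format stays stable, i.e. lines that start with whitespace have the IP address in the first field after that whitespace. It should work with a base Perl installation; no modules are required.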

  perl -ne 's/\r//g; m/^\s+(\S+)\s+/ && $seen{$1}++; push @l, $_;
    END {$s=" "x7; foreach (@l) {m/^\s+(\S+)\s+/ && ($seen{$1}>1) && s/$/ $s DUPLICATE!/; print}}'

It produces output that is binary-identical to the sample output. :)
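
The same buffer-everything-then-print idea carries over to awk, which may be handy when the data arrives on a pipe and cannot be read twice. A rough sketch, assuming the whole input fits in memory and that address lines start with spaces or tabs; the name mark_dups_stdin is invented here.

```shell
# mark_dups_stdin: hypothetical single-pass variant that reads standard input.
mark_dups_stdin() {
    awk '
        # Drop a trailing carriage return and remember every line in order.
        { sub(/\r$/, ""); line[NR] = $0 }

        # Indented, non-blank line: its first field is taken to be the address.
        /^[ \t]+[^ \t]/ { key[NR] = $1; seen[$1]++ }

        # After the last line, replay the buffer and tag repeated addresses.
        END {
            for (i = 1; i <= NR; i++)
                print line[i] ((i in key) && seen[key[i]] > 1 ? "        DUPLICATE!" : "")
        }
    '
}
```

Usage would look like mark_dups_stdin < file, or with any command piped into it. Because only indented lines are counted, a repeated pod name is not flagged by this variant.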
