Input:
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
objectClass: org-abc
objectClass: org-xyz

objectClass: top
objectClass: inetOrgPerson
objectClass: org-abc
objectClass: organizationalPerson
objectClass: person

objectClass: top
objectClass: org-abc
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: org-xyz

objectClass: top
objectClass: inetOrgPerson
objectClass: org-xyz
objectClass: organizationalPerson
objectClass: person
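I am reading an LDIF file that is 50 MB in size.
The content between two blank lines is treated as a block.
- If both lines (objectClass: org-abc & objectClass: org-xyz) appear in a block, in any order, delete both lines from the block and add a new line "objectClass: org-111"
(or)
- If only the line "objectClass: org-abc" is present in a block, replace that line with "objectClass: org-222"
(or)
- If only the line "objectClass: org-xyz" is present in a block, replace that line with "objectClass: org-333"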
Expected output:
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
objectClass: org-111

objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: org-222

objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: org-111

objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: org-333
How can I get this output with Linux commands (sed or awk), or can you suggest a better approach?
Answer 1
A more elaborate AWK solution:
awk 'function process(a,c) { # process the lines of one passed block
for (i=1; i<=c; i++) {
split(a[i], fields); # split the line into 2 fields
if (fields[2]=="org-abc") abc="222";
else if (fields[2]=="org-xyz") xyz="333";
else print a[i]
}
if (abc || xyz) printf "objectClass: org-%s\n",(abc && xyz? "111" : (abc? "222":"333"))
}
!NF{ process(a, c); c=abc=xyz=0 }
{ a[++c]=$0 }
END{ process(a, c) }' file
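This solution is memory-efficient: at any point during processing, the array a holds the lines of only a single block. (The counter c is reset at the start of each new block.)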
Output:
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
objectClass: org-111

objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: org-222

objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: org-111

objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: org-333
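As a variation on the same one-block-at-a-time idea, awk can also read the file a paragraph at a time when RS is set to the empty string, so each block arrives as one record and each line as one field. The following is only a sketch under a few assumptions: the wrapper name rewrite_blocks is a placeholder, the objectClass lines are assumed to have no trailing whitespace or carriage returns (the comparison is an exact match), and runs of several blank lines are collapsed into one separator.

```shell
# Hypothetical wrapper name; pass the LDIF file as an argument or pipe it in.
rewrite_blocks() {
  awk '
    # Paragraph mode: blank lines separate records, each line is one field.
    BEGIN { RS = ""; FS = "\n" }
    {
      out = ""; abc = 0; xyz = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "objectClass: org-abc")      abc = 1   # remember and drop
        else if ($i == "objectClass: org-xyz") xyz = 1   # remember and drop
        else out = out $i "\n"                           # keep every other line
      }
      # Append the replacement line at the end of the block.
      if (abc && xyz) out = out "objectClass: org-111\n"
      else if (abc)   out = out "objectClass: org-222\n"
      else if (xyz)   out = out "objectClass: org-333\n"
      # sep is empty for the first block and a blank line afterwards.
      printf "%s%s", sep, out
      sep = "\n"
    }
  ' "$@"
}
```

It would be called as rewrite_blocks file > file.new. Only one block is held in memory at a time, so the 50 MB file size should not matter.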
Answer 2
This is a classic use case for Perl's "paragraph mode" (-00), where a "line" is delimited by \n\n, so each paragraph is treated as a single line:
$ perl -00 -lpe 'if(/: org-abc/ && /: org-xyz/){
s/(^|\n)[^\n]+: (org-abc|org-xyz)\s*(?=$|\n)//g;
s/$/\nobjectClass: org-111/;
}
else{
s/objectClass: org-abc/objectClass: org-222/;
s/objectClass: org-xyz/objectClass: org-333/
}' file
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
objectClass: org-111

objectClass: top
objectClass: inetOrgPerson
objectClass: org-222
objectClass: organizationalPerson
objectClass: person

objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: org-111

objectClass: top
objectClass: inetOrgPerson
objectClass: org-333
objectClass: organizationalPerson
objectClass: person
For clarity, here is the same thing written out as a full script rather than a one-liner:
#!/usr/bin/env perl
## Paragraph mode
local $/="\n\n";
my $pat1 = 'objectClass: org-abc';
my $pat2 = 'objectClass: org-xyz';
## Read input file
while (my $line = <>) {
## Remove trailing newlines
chomp($line);
if($line =~ /$pat1/ && $line=~ /$pat2/){
$line =~ s/(^|\n)($pat1|$pat2)\s*(?=$|\n)//g;
$line =~ s/$/\nobjectClass: org-111/;
}
else{
$line =~ s/$pat1/objectClass: org-222/;
$line =~ s/$pat2/objectClass: org-333/
}
print "$line\n\n";
}
Answer 3
This is also easy with sed:
sed '/^$/!{H;1h;$!d;};x
/objectClass: org-abc/!{s/\(objectClass: org-\)xyz/\1333/;p;d;}
s/\(objectClass: org-\)xyz/\1111/;t1
s/\(objectClass: org-\)abc/\1222/;:b
:1
s/\nobjectClass: org-abc//'
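The first line collects one block (in the hold space, then swapped into the pattern space); the rest does the obvious substitutions.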
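For anyone who wants the steps spelled out, here is my reading of the same script with one command per line and a comment on each step. Treat it as a sketch: it assumes a sed that accepts # comment lines and indented commands (GNU sed does), the wrapper name rewrite_blocks_sed and the label name both are placeholders, and the :b label from the one-liner is left out because nothing branches to it.

```shell
# Hypothetical wrapper name; pass the LDIF file as an argument or pipe it in.
rewrite_blocks_sed() {
  sed '
    # Non-blank line: append it to the hold space (the very first line
    # overwrites the hold space instead), then start the next cycle unless
    # this is the last input line.
    /^$/!{
      H
      1h
      $!d
    }
    # Blank line or last line: swap so the collected block is in pattern space.
    x
    # No org-abc in the block: turn org-xyz (if any) into org-333, print, done.
    /objectClass: org-abc/!{
      s/\(objectClass: org-\)xyz/\1333/
      p
      d
    }
    # org-abc is present. If org-xyz is present too, it becomes org-111.
    s/\(objectClass: org-\)xyz/\1111/
    t both
    # Only org-abc: it becomes org-222.
    s/\(objectClass: org-\)abc/\1222/
    :both
    # Both were present: drop the leftover org-abc line (no-op otherwise).
    s/\nobjectClass: org-abc//
  ' "$@"
}
```

One caveat that follows from the last substitution: it needs a newline in front of the org-abc line, so a block that contains both classes with org-abc as the very first line of the file would keep that line. In the sample data every block starts with objectClass: top, so this does not come up there. Note also that, unlike the expected output, the replacement stays where the original line was instead of moving to the end of the block.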