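I have a file with the following contents:
A dn: can have more than one rdcPosition.
I only need the dn: entries that have an rdcPosition containing acme#6#.
The result should print the cn and the rdcPosition.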
dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,o
 u=Vault,o=acme
rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>8946
 702990</cn><reqdate>1529318977</reqdate><startdate>1529318977</startdate><end
 date>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</ne
 wstatus><date>1529318977</date></change><change><date>1529319116</date><previ
 ousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>15
 29481285</date><previousstatus>3</previousstatus><newstatus>6</newstatus></ch
 ange></lifecycle></position>
dn: cn=010903cd-e92d-4307-bffc-4921379153c0,ou=Named,ou=Identities,ou=Active,o
 u=Vault,o=acme
rdcPosition: cn=922445,ou=Entities,ou=Active,ou=Vault,o=acme#5#<position><cn>42
 79084890</cn><reqdate>1429014997</reqdate><startdate>1429014997</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</
 newstatus><date>1429014997</date></change><change><date>1429023084</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>
 1525107741</date><previousstatus>3</previousstatus><newstatus>6</newstatus></
 change><change><date>1525126716</date><previousstatus>6</previousstatus><news
 tatus>5</newstatus></change></lifecycle></position>
rdcPosition: cn=311982,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>97
 26910833</cn><reqdate>1528120494</reqdate><startdate>1528120494</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</
 newstatus><date>1528120494</date></change><change><date>1528123478</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change></lifecycle></
 position>
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,o
 u=Vault,o=acme
rdcPosition: cn=914570,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>20
 68839799</cn><reqdate>1406284665</reqdate><startdate>1406284665</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</
 newstatus><date>1406284665</date></change><change><date>1406284666</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>
 1435847283</date><previousstatus>3</previousstatus><newstatus>6</newstatus></
 change></lifecycle></position>
rdcPosition: cn=999546,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>76
 03071057</cn><reqdate>1400325753</reqdate><startdate>1400325753</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</
 newstatus><date>1400325753</date></change><change><date>1400325754</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>
 1449224475</date><previousstatus>3</previousstatus><newstatus>6</newstatus></
 change></lifecycle></position>
rdcPosition: cn=3513,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>2802
 042129</cn><reqdate>1406284761</reqdate><startdate>1406284761</startdate><end
 date>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</ne
 wstatus><date>1406284761</date></change><change><date>1406284762</date><previ
 ousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>14
 49224599</date><previousstatus>3</previousstatus><newstatus>6</newstatus></ch
 ange></lifecycle></position>
rdcPosition: cn=312936,ou=Entities,ou=Active,ou=Vault,o=acme#3#<position><cn>19
 23461515</cn><reqdate>1449217172</reqdate><startdate>1449217172</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</
 newstatus><date>1449217172</date></change><change><date>1449225081</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change></lifecycle></
 position>
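Answer 1
The input appears to be LDIF, as specified in RFC 2849.
I strongly recommend not using the usual awk/sed/grep tool chain to process LDIF, for the following reasons:
- Long attribute value lines (including dn:) are folded: each continuation line starts with a single space.
- Attribute values containing non-ASCII characters are base64-encoded.
The best solution is to use a proper LDIF parser for your favourite scripting language.
For example, for Python, use the ldif module from python-ldap.
See the documentation: ldif -- LDIF parser and generator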
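The two pitfalls listed in this answer can be made concrete with a small Python sketch that uses only the standard library. It is an illustration of what a parser has to deal with, not a replacement for a real LDIF parser such as the ldif module. The helper names unfold and parse_attribute, and the description attribute in the sample, are made up for this example.

```python
import base64


def unfold(lines):
    """Join LDIF continuation lines (starting with one space) onto the previous line."""
    logical = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(" ") and logical:
            logical[-1] += line[1:]  # drop the single folding space
        else:
            logical.append(line)
    return logical


def parse_attribute(line):
    """Split 'attr: value' or 'attr:: base64value' into (attr, text)."""
    attr, _, rest = line.partition(":")
    if rest.startswith(":"):
        # A double colon marks a base64-encoded value (assumed UTF-8 here).
        return attr, base64.b64decode(rest[1:].strip()).decode("utf-8")
    return attr, rest.lstrip(" ")


# First two lines are taken from the question; the third is a hypothetical
# attribute added only to show base64 decoding.
sample = [
    "dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,o\n",
    " u=Vault,o=acme\n",
    "description:: WsO8cmljaA==\n",
]

for attr, value in map(parse_attribute, unfold(sample)):
    print(attr, "=>", value)
```

A line-oriented grep or awk pattern sees neither the joined dn nor the decoded value, which is why matching on the raw file can silently miss entries.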
Answer 2
Your desired output is not quite clear. How far would this get you:
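awk '
{
    # Print every rdcPosition value containing acme#6#, then keep
    # searching in the remainder of the line.
    while (match($0, /rdcPosition: [^ ]*acme#6#[^ ]*/)) {
        print substr($0, RSTART, RLENGTH)
        $0 = substr($0, RSTART + RLENGTH)
    }
}
' file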
rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#8946
rdcPosition: cn=311982,ou=Entities,ou=Active,ou=Vault,o=acme#6#97
rdcPosition: cn=914570,ou=Entities,ou=Active,ou=Vault,o=acme#6#20
rdcPosition: cn=999546,ou=Entities,ou=Active,ou=Vault,o=acme#6#76
rdcPosition: cn=3513,ou=Entities,ou=Active,ou=Vault,o=acme#6#2802
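For the changed request in your comment, how far would this get you? If it is not satisfactory, please define your desired output more specifically.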
awk '
{
    # Remember the dn (the first two fields) and print it before every match.
    DN = $1 FS $2
    while (match($0, /rdcPosition: [^ ]*acme#6#[^ ]*/)) {
        print DN, substr($0, RSTART, RLENGTH)
        $0 = substr($0, RSTART + RLENGTH)
    }
}
' file
dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#8946
dn: cn=010903cd-e92d-4307-bffc-4921379153c0,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=311982,ou=Entities,ou=Active,ou=Vault,o=acme#6#97
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=914570,ou=Entities,ou=Active,ou=Vault,o=acme#6#20
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=999546,ou=Entities,ou=Active,ou=Vault,o=acme#6#76
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=3513,ou=Entities,ou=Active,ou=Vault,o=acme#6#2802
Answer 3
Using the following sed script (assuming we run it with sed -n):
/^dn:/{                         # this is a "dn" line
    N;                          # append the next line
    s/\n //;                    # remove the newline and the space
    x;                          # exchange pattern space with hold space
    /o=acme#6#/p;               # print if pattern space contains our string
    d;                          # delete from pattern space, start next cycle
}

/^rdcPosition:/{                # this is a "rdcPosition" line
    :again;                     # define label for loop
    N;                          # append the next line
    s/\n //;                    # remove the newline and the space
    \#</position>#!b again;     # if the end tag "</position>" was not read, loop
    /o=acme#6#/H;               # append to hold space if matching what we're looking for
}

${                              # at the very end of input
    x;                          # exchange pattern and hold space
    /o=acme#6#/p;               # print if pattern space contains our string
}
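What the sed script does is essentially to build a string in the "hold space" (a general-purpose buffer that persists between cycles). The string starts with the dn line, followed by any rdcPosition lines that contain the specific string we are interested in.
Whenever a new dn line is found, or when we reach the end of the input, the hold space is printed, but only if it contains our string (it will not contain it if the current dn line has no matching rdcPosition lines).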
Testing it:
$ sed -n -f script.sed file
dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme
rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>8946702990</cn><reqdate>1529318977</reqdate><startdate>1529318977</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</newstatus><date>1529318977</date></change><change><date>1529319116</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>1529481285</date><previousstatus>3</previousstatus><newstatus>6</newstatus></change></lifecycle></position>
dn: cn=010903cd-e92d-4307-bffc-4921379153c0,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme
rdcPosition: cn=311982,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>9726910833</cn><reqdate>1528120494</reqdate><startdate>1528120494</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</newstatus><date>1528120494</date></change><change><date>1528123478</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change></lifecycle></position>
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme
rdcPosition: cn=914570,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>2068839799</cn><reqdate>1406284665</reqdate><startdate>1406284665</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</newstatus><date>1406284665</date></change><change><date>1406284666</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>1435847283</date><previousstatus>3</previousstatus><newstatus>6</newstatus></change></lifecycle></position>
rdcPosition: cn=999546,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>7603071057</cn><reqdate>1400325753</reqdate><startdate>1400325753</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</newstatus><date>1400325753</date></change><change><date>1400325754</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>1449224475</date><previousstatus>3</previousstatus><newstatus>6</newstatus></change></lifecycle></position>
rdcPosition: cn=3513,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>2802042129</cn><reqdate>1406284761</reqdate><startdate>1406284761</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</newstatus><date>1406284761</date></change><change><date>1406284762</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>1449224599</date><previousstatus>3</previousstatus><newstatus>6</newstatus></change></lifecycle></position>
A logically equivalent awk script, producing the same output as the sed code above:
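/^dn:/ {
    # A new entry starts: print the previous one if it had a match.
    if (hold ~ "o=acme#6#")
        print hold
    hold = $0
    getline                         # read the continuation of the dn line
    hold = hold substr($0, 2)       # append it without the leading space
    next
}

/^rdcPosition:/ {
    line = $0
    # Join continuation lines until the end tag has been read
    # (or the input ends, so that a truncated file cannot loop forever).
    while (line !~ "</position>" && (getline) > 0)
        line = line substr($0, 2)
    if (line ~ "o=acme#6#")
        hold = hold ORS line
}

END {
    if (hold ~ "o=acme#6#")
        print hold
}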
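The substr($0, 2) calls strip the leading space from the continuation lines in the input.
Both scripts assume that the dn line is folded into exactly two lines.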
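For comparison, the same buffer-and-flush logic can be sketched in Python without the two-line assumption, by unfolding every continuation line first. This is a minimal sketch, not a full LDIF parser: it ignores base64-encoded values, the function name matching_entries is made up for the example, and the sample is abbreviated from the question's data (the XML payload is shortened).

```python
def matching_entries(lines, needle="o=acme#6#"):
    """Yield (dn_line, matching_rdcPosition_lines) for entries with at least one match."""
    # Step 1: undo line folding, however many continuation lines there are.
    logical = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(" ") and logical:
            logical[-1] += line[1:]
        else:
            logical.append(line)

    # Step 2: keep a per-entry buffer, playing the role of the sed hold space.
    dn, held = None, []
    for line in logical:
        if line.startswith("dn:"):
            if dn is not None and held:
                yield dn, held      # flush the previous entry
            dn, held = line, []
        elif line.startswith("rdcPosition:") and needle in line:
            held.append(line)
    if dn is not None and held:     # flush at end of input
        yield dn, held


# Abbreviated sample: the second entry only has an acme#5# position.
sample = [
    "dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,o",
    " u=Vault,o=acme",
    "rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>8946",
    " 702990</cn></position>",
    "dn: cn=010903cd-e92d-4307-bffc-4921379153c0,ou=Named,ou=Identities,ou=Active,o",
    " u=Vault,o=acme",
    "rdcPosition: cn=922445,ou=Entities,ou=Active,ou=Vault,o=acme#5#<position><cn>42",
    " 79084890</cn></position>",
]

for dn, positions in matching_entries(sample):
    print(dn)
    for position in positions:
        print(position)
```

With a real file, the function can be fed an open file object directly, since it only iterates over lines.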