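Morning all,
I'm new to UNIX, so I'm hoping to do something that I could do in VB but have no experience doing in UNIX.
I have a shared XML specification that needs entries removed and added periodically as new Reuters RIC codes go live. There are two items to implement:
A. Delete a RIC entry
- Open the file
- Find a specific string
- Delete the line where it is found and the 21 lines below it
- Save the file
I think this might work:
sed -e '/<ric id="AUG03250639E=YBAU">/,+21d' a.xml >a.xml
B. Add a new RIC entry
- Open the file
- Find the last occurrence of the string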
</source>
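- Move up 29 lines to the last RIC entry block
- Copy this line and the 21 lines below it (the ric block)
- Insert a new line 22 lines below and paste the block there (a new block), i.e. paste it directly beneath the block you copied
- Change the ric on line 1 of the new block to the new RIC string, e.g. from
<ric id="AAAAA=YBAU"
to <ric id="BBBBB=YBAU"
- Save the file
How can I do this?
This is the last part of the file. Note that the ric blocks (the ones I want to manipulate) end where the following string appears . . .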
<ric id="AUG03250639E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
<ric id="AUG03250640E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
</rics>
<topics>
<topic>
<id>default</id>
<type>rmds</type>
<value>IDN_SELECTFEED.ANY.%s.NaE</value>
</topic>
</topics>
</source>
<transformers>
<!-- Name of transformer -->
<transformer></transformer>
</transformers>
<processors>
<!-- Enricher to add additional fields from source query result while
publishing -->
<processor></processor>
</processors>
<endpoints>
<!-- Order of post processor is important. First topic, then mapper -->
<endpoint id="rmds" topic="FI.ANY.%s.YBAU" multicast="true">
<postprocessor>reuters-topic-builder</postprocessor>
<postprocessor>reuters-message-mapper</postprocessor>
<!-- <multitopic id="solace" topic="LN/FI/IP/APS/SSHEET/YIELD/BATS_%s"
/> -->
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/SSHEET/BATS_%s" />
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/YIELDBROKER/BATS_%s" />
</endpoint>
</endpoints>
<other-properties>
<!-- common formatting of price/yield -->
<property name="formattor-1">(math:pow(INPUT/100+1,0.5)-1)*200</property>
<property name="handle_negative_values">false</property>
<property name="handle_negative_values_output">0.001</property>
</other-properties>
</specification>
So for A. (delete), where I want to remove the RIC entry for AUG03250640E=YBAU, the file would read:
<ric id="AUG03250639E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
</rics>
<topics>
<topic>
<id>default</id>
<type>rmds</type>
<value>IDN_SELECTFEED.ANY.%s.NaE</value>
</topic>
</topics>
</source>
<transformers>
<!-- Name of transformer -->
<transformer></transformer>
</transformers>
<processors>
<!-- Enricher to add additional fields from source query result while
publishing -->
<processor></processor>
</processors>
<endpoints>
<!-- Order of post processor is important. First topic, then mapper -->
<endpoint id="rmds" topic="FI.ANY.%s.YBAU" multicast="true">
<postprocessor>reuters-topic-builder</postprocessor>
<postprocessor>reuters-message-mapper</postprocessor>
<!-- <multitopic id="solace" topic="LN/FI/IP/APS/SSHEET/YIELD/BATS_%s"
/> -->
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/SSHEET/BATS_%s" />
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/YIELDBROKER/BATS_%s" />
</endpoint>
</endpoints>
<other-properties>
<!-- common formatting of price/yield -->
<property name="formattor-1">(math:pow(INPUT/100+1,0.5)-1)*200</property>
<property name="handle_negative_values">false</property>
<property name="handle_negative_values_output">0.001</property>
</other-properties>
</specification>
For B. (add a new RIC entry), say I want to add the new ric AUG03250641E=YBAU, the file would read:
<ric id="AUG03250639E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
<ric id="AUG03250640E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
<ric id="AUG03250641E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
</rics>
<topics>
<topic>
<id>default</id>
<type>rmds</type>
<value>IDN_SELECTFEED.ANY.%s.NaE</value>
</topic>
</topics>
</source>
<transformers>
<!-- Name of transformer -->
<transformer></transformer>
</transformers>
<processors>
<!-- Enricher to add additional fields from source query result while
publishing -->
<processor></processor>
</processors>
<endpoints>
<!-- Order of post processor is important. First topic, then mapper -->
<endpoint id="rmds" topic="FI.ANY.%s.YBAU" multicast="true">
<postprocessor>reuters-topic-builder</postprocessor>
<postprocessor>reuters-message-mapper</postprocessor>
<!-- <multitopic id="solace" topic="LN/FI/IP/APS/SSHEET/YIELD/BATS_%s"
/> -->
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/SSHEET/BATS_%s" />
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/YIELDBROKER/BATS_%s" />
</endpoint>
</endpoints>
<other-properties>
<!-- common formatting of price/yield -->
<property name="formattor-1">(math:pow(INPUT/100+1,0.5)-1)*200</property>
<property name="handle_negative_values">false</property>
<property name="handle_negative_values_output">0.001</property>
</other-properties>
</specification>
Answer 1
For A, with the POSIX sed and head utilities and a regular file named in:
{ sed -ne'/^<ric id="AUG03250639E=YBAU">$/q;p'
head -n21 >/dev/null
cat
} <in >out
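As a variation on the same idea, and only as a sketch: a sed range can end on the closing tag instead of relying on a line count. In the sample as pasted in the question, the block from <ric id=...> to </ric> is 21 lines long, so skipping the matched line plus 21 more would also take the line that follows the block. The scratch directory and the cut-down sample file below are assumptions for illustration; the sed command is the part that matters.

```shell
# Work in a scratch directory so the real a.xml is never touched (hypothetical paths).
scratch=$(mktemp -d)
cat > "$scratch/a.xml" <<'EOF'
<rics>
<ric id="AUG03250639E=YBAU">
<securities>
</securities>
</ric>
<ric id="AUG03250640E=YBAU">
<securities>
</securities>
</ric>
</rics>
EOF

# Delete from the opening tag through the next </ric>, however many lines that is.
# Write to a new file first, then replace the original only if sed succeeded.
sed '/<ric id="AUG03250640E=YBAU">/,/<\/ric>/d' "$scratch/a.xml" > "$scratch/a.xml.new" &&
  mv "$scratch/a.xml.new" "$scratch/a.xml"

cat "$scratch/a.xml"
```

This still treats the XML as lines of text, so it assumes every opening and closing ric tag sits on a line of its own.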
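Answer 2
Apart from the fact that parsing and/or manipulating XML with regular expressions is misguided at best (although something as simple as this can work if you are careful), you almost had it right.
With GNU sed:
sed -i -e '/<ric id="AUG03250639E=YBAU">/,+21d' a.xml
If your sed does not support the -i (--in-place) option, you can do the same thing with a temporary file (which is how sed -i works behind the scenes anyway):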
TF=$(mktemp)
sed -e '/<ric id="AUG03250639E=YBAU">/,+21d' a.xml > "$TF" && mv "$TF" a.xml
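You can't read a file and redirect the output to that same file in the shell, as you tried to do - the first thing the shell does is truncate the file so that it is empty, and that happens before the sed script even runs.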
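The effect is easy to reproduce on a throwaway file. This is a minimal sketch; the scratch directory and the file name f.txt are made up for the demonstration.

```shell
demo=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$demo/f.txt"

# The shell opens f.txt for writing, truncating it, before sed is started,
# so sed finds nothing left to read.
sed -e '/two/d' "$demo/f.txt" > "$demo/f.txt"

# Size in bytes of what is left in the file.
wc -c < "$demo/f.txt"
```

The lines "one" and "three" are lost along with "two", which is why the temporary-file version above only replaces a.xml after sed has finished successfully.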
For more complex XML parsing tasks, use an XML parser: xmlstarlet from shell scripts, or one of the XML parsing libraries for perl or python (or almost any other language you can think of).
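For part B, if a parser really is not an option, the copy-and-rename step can be approximated with POSIX awk. This is only a text-level sketch with the same fragility as the sed approach: it assumes every block starts on a line containing <ric id=" and ends on a line containing </ric>, and that </rics> appears exactly once. The variable name newid, the scratch paths and the trimmed sample are all hypothetical.

```shell
work=$(mktemp -d)
cat > "$work/a.xml" <<'EOF'
<rics>
<ric id="AUG03250639E=YBAU">
<issueid>178117</issueid>
</ric>
<ric id="AUG03250640E=YBAU">
<issueid>178117</issueid>
</ric>
</rics>
EOF

awk -v newid='AUG03250641E=YBAU' '
  /<ric id="/ { buf = ""; inblock = 1 }   # a new block starts: forget the previous one
  inblock     { buf = buf $0 "\n" }       # remember every line of the current block
  /<\/ric>/   { inblock = 0 }             # the block is complete
  /<\/rics>/  {                           # just before the closing </rics> ...
    sub(/id="[^"]*"/, "id=\"" newid "\"", buf)   # ... rename the remembered (last) block
    printf "%s", buf                             # ... and emit it as a new block
  }
  { print }
' "$work/a.xml" > "$work/a.xml.new" && mv "$work/a.xml.new" "$work/a.xml"

cat "$work/a.xml"
```

Because only the most recent block is kept in buf, the copy is always taken from the last ric entry, which matches the "move up to the last RIC entry block" step in the question.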
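Answer 3
Don't use regular expressions to parse XML. It's dirty - it breaks easily and makes for brittle code. There are lots of things that can trip you up, such as line counts - it is perfectly valid in XML to format an element as: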
<calculation name="AB" field="SEC_YLD_1" />
Or:
<calculation
field="SEC_YLD_1"
name="AB"
/>
And there are a variety of other options - all of them semantically identical, but ... they won't match the same regular expression.
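A quick way to see this for yourself, as a small sketch (the variable names are made up): search both layouts of the element for the same fixed string and count the matching lines.

```shell
one_line='<calculation name="AB" field="SEC_YLD_1" />'
multi_line=$(printf '<calculation\nfield="SEC_YLD_1"\nname="AB"\n/>')
pattern='<calculation name="AB" field="SEC_YLD_1" />'

# grep -c prints the number of matching lines; it exits non-zero when that
# number is 0, hence the "|| true" so the script carries on either way.
hits_one=$(printf '%s\n' "$one_line" | grep -c -F -- "$pattern" || true)
hits_multi=$(printf '%s\n' "$multi_line" | grep -c -F -- "$pattern" || true)

echo "single-line layout: $hits_one matching line(s)"
echo "multi-line layout:  $hits_multi matching line(s)"
```

An XML parser treats both layouts as the same element with the same two attributes, whereas the line-based search only finds the first.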
For your example, this is really quite simple if you use a parser. perl has XML::Twig, which does it easily.
Deleting:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> parsefile ( 'your_file.xml' );
$_ -> delete for $twig -> get_xpath('//ric[@id="AUG03250639E=YBAU"]');
$twig -> set_pretty_print('indented');
$twig -> print;
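Note - this also deletes duplicates (if there are any).
Now, creating a new one - it looks like you are trying to copy and modify an existing entry - so: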
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> parsefile ( 'your_file.xml' );
#find one to copy - this will just get the first 'ric' element.
my $ric_to_copy = $twig -> get_xpath('//ric',0);
#copy it
my $new_ric = $ric_to_copy -> copy;
#alter the new one
$new_ric -> set_att('id', 'BBBBB=YBAU' );
#paste it
$new_ric -> paste ( 'last_child', $ric_to_copy->parent);
$twig -> set_pretty_print('indented');
$twig -> print;
As written, this reads the file and prints to STDOUT - you can print to a specific file instead:
open ( my $output, '>', 'new.xml' ) or die $!;
print {$output} $twig -> sprint;
close ( $output );