Parsing an XML file to copy its content to a specific place in another file?

I have an XML file (client_23.xml) and need to add a few lines to a specific section of it, based on the contents of another XML file (abc_lop.xml).

Here is my abc_lop.xml file, in which you can see many ClientField lines with name, pptype and dataType attributes.

<Hello version="100">
 <DataHolder numberOfFields="67">
  <ClientField name="target" pptype="aligning" dataType="string">
   <Value value="panel_3646"/>
   <Value value="panel_3653"/>
  </ClientField>
  <ClientField name="category_3652_0_count" pptype="symetrical" dataType="double"/>
  <ClientField name="category_3652_2_count" pptype="symetrical" dataType="double"/>
  <ClientField name="category_3646_0_count" pptype="symetrical" dataType="double"/>
  <ClientField name="pme.cdert" pptype="symetrical" dataType="double"/>
  <ClientField name="pme.age" pptype="symetrical" dataType="double"/>
  <ClientField name="category_3648_1_count" pptype="symetrical" dataType="double"/>
  <ClientField name="pme.number" pptype="symetrical" dataType="double"/>
  <ClientField name="pme.gender" pptype="aligning" dataType="string">
   <Value value=""/>
   <Value value="F   "/>
   <Value value="NA"/>
  </ClientField>
  <ClientField name="pme.status" pptype="aligning" dataType="string">
   <Value value=""/>
   <Value value="A"/>
   <Value value="S"/>
   <Value value="NA"/>
  </ClientField>
  <ClientField name="pme.selling_id" pptype="aligning" dataType="string">
   <Value value="c0"/>
   <Value value="c1"/>
   <Value value="NA"/>
  </ClientField>
 </DataHolder>
</Hello>

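I need to read this file and extract the name from each of these ClientField lines. If the pptype is not aligning, I need to construct the following line for each name. Below is an example for two names: only the first value differs, and the other two values are always the same.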

<eval>upsert("category_3652_0_count", 0, $calty_doubles)</eval>
<eval>upsert("category_3652_2_count", 0, $calty_doubles)</eval>

If the pptype is aligning, the line should instead be built like this:

<eval>upsert("target", "NA", $calty_strings)</eval>
<eval>upsert("pme.gender", "NA", $calty_strings)</eval>
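The two rules above can be restated as a minimal Python sketch. The helper name build_eval_text is hypothetical and used only for illustration; it just shows which string is expected for each pptype.

```python
def build_eval_text(name, pptype):
    """Return the text of one <eval> line for a ClientField.

    Hypothetical helper: 'aligning' fields get the string template,
    every other pptype gets the double template.
    """
    if pptype == "aligning":
        return f'upsert("{name}", "NA", $calty_strings)'
    return f'upsert("{name}", 0, $calty_doubles)'


# Two fields taken from abc_lop.xml, one per rule
for name, pptype in [("category_3652_0_count", "symetrical"), ("target", "aligning")]:
    print(f"<eval>{build_eval_text(name, pptype)}</eval>")
```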

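I have to make all these changes in the client_23.xml file and then create a new file from it. The new file should look like this: there is a function named data_values, and the lines above need to be added inside its <block> tag, as shown below.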

    <function>
        <name>data_values</name>
        <variables>
            <variable>
            <name>temp</name>
            <type>double</type>
            </variable>
        </variables>
        <block>
            <eval>temp = 1</eval>
            <eval>upsert("category_3652_0_count", 0, $calty_doubles)</eval>
            <eval>upsert("category_3652_2_count", 0, $calty_doubles)</eval>
            <eval>upsert("target", "NA", $calty_strings)</eval>
            <eval>upsert("pme.gender", "NA", $calty_strings)</eval>             
        </block>
    </function>

This is what I currently have in the client_23.xml file; after the additions, it should look like the block above:

    <function>
        <name>data_values</name>
        <variables>
            <variable>
            <name>temp</name>
            <type>double</type>
            </variable>
        </variables>
        <block>
            <eval>temp = 1</eval>
        </block>
    </function>

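I have a very simple shell script that terdon helped me with earlier, in which we use a Perl one-liner, as shown below. I add a header and a footer to the abc_lop.xml file and store the result in the file variable, then use that value to replace a specific section of the client_23.xml file. However, I am not sure how to do what I described above.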

Script:

for word in $client_types
do
    ## Concatenate the header, the contents of the target file and the
    ## footer into the variable $file.
    file=$(printf '%s\n%s\n%s' "$header" "$(cat "$path/${word}_lop.xml")" "$footer")

    ## Edit the target file and print
    perl -0pe "s#<eval>planting_model = 0</eval>#<eval>planting_model = 1</eval>#; s#<trestra-config>.* </trestra-config>##sm;   s#<function>\s*<name>DUMMY_FUNCTION.+?</function>#$file#sm" client_"$client_id".xml > "$word"_new_file.xml
done

Here client_types will be something like abc def pqr, and $client_id will be 23.

Now I need to add the functionality described above, and I am not sure how to do that easily.

Answer 1

This is the Perl solution I tried to post on Stack Overflow before you deleted the duplicate question.

use strict;
use warnings;

use XML::LibXML;

# Open the main XML file and locate the
# <block> element that we need to insert into
#
my $doc = XML::LibXML->load_xml(
    location => 'client_23.xml',
    no_blanks => 1,
);
my $block = $doc->find('/function/block')->get_node(1);

# Open the secondary XML file and find all the <ClientField> elements
# that contain the data we need to insert
#
my $abc = XML::LibXML->load_xml(location => 'abc_lop.xml');

for my $field ( $abc->find('/Hello/DataHolder/ClientField')->get_nodelist ) {

    my ($name, $pptype) = map $field->getAttribute($_), qw/ name pptype /;

    my $text = $pptype eq 'aligning' ?
        sprintf q{upsert("%s", "NA", $calty_strings)}, $name :
        sprintf q{upsert("%s", 0, $calty_doubles)}, $name;

    $block->appendTextChild('eval' , $text);
}

print $doc->toString(2);

Output

<?xml version="1.0"?>
<function>
  <name>data_values</name>
  <variables>
    <variable>
      <name>temp</name>
      <type>double</type>
    </variable>
  </variables>
  <block>
    <eval>temp = 1</eval>
    <eval>upsert("target", "NA", $calty_strings)</eval>
    <eval>upsert("category_3652_0_count", 0, $calty_doubles)</eval>
    <eval>upsert("category_3652_2_count", 0, $calty_doubles)</eval>
    <eval>upsert("category_3646_0_count", 0, $calty_doubles)</eval>
    <eval>upsert("pme.cdert", 0, $calty_doubles)</eval>
    <eval>upsert("pme.age", 0, $calty_doubles)</eval>
    <eval>upsert("category_3648_1_count", 0, $calty_doubles)</eval>
    <eval>upsert("pme.number", 0, $calty_doubles)</eval>
    <eval>upsert("pme.gender", "NA", $calty_strings)</eval>
    <eval>upsert("pme.status", "NA", $calty_strings)</eval>
    <eval>upsert("pme.selling_id", "NA", $calty_strings)</eval>
  </block>
</function>

Answer 2

Rather than Perl, I prefer to use Python for quick XML modifications. For example:

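import sys
import xml.etree.ElementTree as ET

file1 = sys.argv[1]  # e.g. abc_lop.xml
file2 = sys.argv[2]  # e.g. client_23.xml

abc = ET.parse(file1).getroot()
xml2 = ET.parse(file2).getroot()

# The <block> element that receives the new <eval> lines
block = xml2.find('block')

# One <eval> per ClientField; the pptype decides which template is used
for node in abc.findall("*/ClientField"):
    if node.attrib['pptype'] == 'aligning':
        ET.SubElement(block, 'eval').text = 'upsert("' + node.get('name') + '", "NA", $calty_strings)'
    else:
        ET.SubElement(block, 'eval').text = 'upsert("' + node.get('name') + '", 0, $calty_doubles)'

# Re-indent so the appended elements line up (ET.indent needs Python 3.9+)
ET.indent(xml2, space="    ")
print(ET.tostring(xml2, encoding='unicode'))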

This gives you:

<function>
    <name>data_values</name>
    <variables>
        <variable>
            <name>temp</name>
            <type>double</type>
        </variable>
    </variables>
    <block>
        <eval>temp = 1</eval>
        <eval>upsert("target", "NA", $calty_strings)</eval>
        <eval>upsert("category_3652_0_count", 0, $calty_doubles)</eval>
        <eval>upsert("category_3652_2_count", 0, $calty_doubles)</eval>
        <eval>upsert("category_3646_0_count", 0, $calty_doubles)</eval>
        <eval>upsert("pme.cdert", 0, $calty_doubles)</eval>
        <eval>upsert("pme.age", 0, $calty_doubles)</eval>
        <eval>upsert("category_3648_1_count", 0, $calty_doubles)</eval>
        <eval>upsert("pme.number", 0, $calty_doubles)</eval>
        <eval>upsert("pme.gender", "NA", $calty_strings)</eval>
        <eval>upsert("pme.status", "NA", $calty_strings)</eval>
        <eval>upsert("pme.selling_id", "NA", $calty_strings)</eval>
    </block>
</function>
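The script above assumes that <function> is the root element of the second file. If client_23.xml actually wraps the function in a larger document (an assumption, since the question only shows the fragment), the block could be located by the function's name instead. A minimal sketch, in which find_block and the <config> wrapper are hypothetical names chosen for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document; only the data_values fragment comes from the question.
SAMPLE = """<config>
    <function>
        <name>other</name>
        <block><eval>x = 0</eval></block>
    </function>
    <function>
        <name>data_values</name>
        <block><eval>temp = 1</eval></block>
    </function>
</config>"""


def find_block(root, func_name):
    """Return the <block> of the <function> whose <name> text equals func_name, or None."""
    # [name='...'] matches elements that have a <name> child with exactly that text.
    # Note: './/function' searches below root, so root itself is never matched.
    func = root.find(f".//function[name='{func_name}']")
    return None if func is None else func.find("block")


root = ET.fromstring(SAMPLE)
block = find_block(root, "data_values")
ET.SubElement(block, "eval").text = 'upsert("target", "NA", $calty_strings)'
print([e.text for e in block.findall("eval")])
```

Selecting by name keeps the new <eval> lines out of any other function's block, at the cost of returning None when no function with that name exists, so the caller should check for that case.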

Edit: the shell script would look like this:

client_id=23

for word in $client_types
do
    ## Write the result to a new file, as in the question's script
    python converter.py "$path/${word}_lop.xml" "client_${client_id}.xml" > "${word}_new_file.xml"
done
