How to grep an XML block from an XML file by keyword in ksh

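I have a file, Sample.xml, that contains many services. Its structure is shown below.

Notes:

  1. I cannot use any XML parser tool because I do not have permission (read-only access).

  2. My xmllint version does not support xpath, and I cannot update it (read-only access).

  3. I do not have xmlstarlet and cannot install it.

Problem: Input: a queue name

Output: the service block

Sample input: ABC.getme2

Desired output: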

<service name="GETME2" min="1" max="10" idleTime="300" backend="ABC">
                            <handlerContainer className="com.abc.xyz.wqere.abcqwere">
                            <handler className="com.abc.xyz.qweqweqwe.werwerwerwer"/>
                            </handlerContainer>
                            <mqListener queue="ABC.getme2" suggExpiry="30" minExpiry="4" maxExpiry="500" copyMessageId="true"/>
                    </service>

XML structure:

     <?xml version="1.0" encoding="UTF-8"?>
        <deploymentconfig xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
                <configfile>sample.xml</configfile>
                <exceptionsFilterConfigFile>asdasd.xml</exceptionsFilterConfigFile>
                <keyInfoConfigFile>asdasd.xml</keyInfoConfigFile>
                <services>

    <service name="GETME" min="1" max="10" idleTime="300" backend="ABC">
                            <handlerContainer className="com.abc.xyz.wqere.abcqwere">
                            <handler className="com.abc.xyz.qweqweqwe.werwerwerwer"/>
                            </handlerContainer>
                            <mqListener queue="ABC.getme" suggExpiry="30" minExpiry="4" maxExpiry="500" copyMessageId="true"/>
                    </service>

    <service name="GETME2" min="1" max="10" idleTime="300" backend="ABC">
                            <handlerContainer className="com.abc.xyz.wqere.abcqwere">
                            <handler className="com.abc.xyz.qweqweqwe.werwerwerwer"/>
                            </handlerContainer>
                            <mqListener queue="ABC.getme2" suggExpiry="30" minExpiry="4" maxExpiry="500" copyMessageId="true"/>
                    </service>
        . . . .a lot of services like this . . . .
        . . . .a lot of services like this . . . .
        . . . .a lot of services like this . . . .
        . . . .a lot of services like this . . . .
        </services>
   <batchServices>
                        <batchService name="batch1">
                                <executor className="com.abc.xyz.qwer.qweqwewqe.ffdsdfsdfsdfsdf" />
                        </batchService>
                        <batchService name="batch2">
                                <executor className="com.abc.xyz.qwer.qweqwewqe.zxcsadsad" />
                        </batchService>
. . . .a lot of batch services like this . . . .
        . . . .a lot of batch services like this . . . .
        . . . .a lot of batch services like this . . . .
        . . . .a lot of batch services like this . . . .
      </batchServices>

<timerservices>
<timerservice> - a lot of timeservice
</timerservices>

  <connectionPools>
                <pool>
                        <name>asdasd</name>
                        <driver>oracle.jdbc.driver.OracleDriver</driver>
                        <url>$asdasd_URL</url>
                        <userId>$asdasd_USER</userId>
                        <password>$asdasd_PASSWORD</password>
                        <minConnections>0</minConnections>
                        <maxConnections>10</maxConnections>
                        <poolUrl>jdbc:asdsad:asdasdsad</poolUrl>
                        <testSql>select * from abc</testSql>
                </pool>

 . . a lot of pools. . .

</connectionpools>

</deploymentconfig>

I need to grep an XML block like this one:

 <service name="GETME" min="1" max="10" idleTime="300" backend="ABC">
                        <handlerContainer className="com.abc.xyz.wqere.abcqwere">
                        <handler className="com.abc.xyz.qweqweqwe.werwerwerwer"/>
                        </handlerContainer>
                        <mqListener queue="ABC.getme" suggExpiry="30" minExpiry="4" maxExpiry="500" copyMessageId="true"/>
                </service>

I only want to supply the queue name:

QUEUENAME=INSERT_HERE
grep ______________ $QUEUENAME. . . 

Here is the output (the xmllint usage text, which lists no --xpath option):

Usage : xmllint [options] XMLfiles ...
    Parse the XML files and output the result of the parsing
    --version : display the version of the XML library used
    --debug : dump a debug tree of the in-memory document
    --shell : run a navigating shell
    --debugent : debug the entities defined in the document
    --copy : used to test the internal copy implementation
    --recover : output what was parsable on broken XML documents
    --noent : substitute entity references by their value
    --noout : don't output the result tree
    --path 'paths': provide a set of paths for resources
    --load-trace : print trace of all external entites loaded
    --nonet : refuse to fetch DTDs or entities over network
    --nocompact : do not generate compact text nodes
    --htmlout : output results as HTML
    --nowrap : do not put HTML doc wrapper
    --valid : validate the document in addition to std well-formed check
    --postvalid : do a posteriori validation, i.e after parsing
    --dtdvalid URL : do a posteriori validation against a given DTD
    --dtdvalidfpi FPI : same but name the DTD with a Public Identifier
    --timing : print some timings
    --output file or -o file: save to a given file
    --repeat : repeat 100 times, for timing or profiling
    --insert : ad-hoc test for valid insertions
    --compress : turn on gzip compression of output
    --html : use the HTML parser
    --xmlout : force to use the XML serializer when using --html
    --push : use the push mode of the parser
    --memory : parse from memory
    --maxmem nbbytes : limits memory allocation to nbbytes bytes
    --nowarning : do not emit warnings from parser/validator
    --noblanks : drop (ignorable?) blanks spaces
    --nocdata : replace cdata section with text nodes
    --format : reformat/reindent the input
    --encode encoding : output in the given encoding
    --dropdtd : remove the DOCTYPE of the input docs
    --c14n : save in W3C canonical format (with comments)
    --exc-c14n : save in W3C exclusive canonical format (with comments)
    --nsclean : remove redundant namespace declarations
    --testIO : test user I/O support
    --catalogs : use SGML catalogs from $SGML_CATALOG_FILES
                 otherwise XML Catalogs starting from 
             file:///etc/xml/catalog are activated by default
    --nocatalogs: deactivate all catalogs
    --auto : generate a small doc on the fly
    --xinclude : do XInclude processing
    --noxincludenode : same but do not generate XInclude nodes
    --loaddtd : fetch external DTD
    --dtdattr : loaddtd + populate the tree with inherited attributes 
    --stream : use the streaming interface to process very large files
    --walker : create a reader and walk though the resulting doc
    --pattern pattern_value : test the pattern support
    --chkregister : verify the node registration code
    --relaxng schema : do RelaxNG validation against the schema
    --schema schema : do validation against the WXS schema
    --schematron schema : do validation against a schematron
    --sax1: use the old SAX1 interfaces for processing
    --sax: do not build a tree but work just at the SAX level

Libxml project home page: http://xmlsoft.org/
To report bugs or get some help check: http://xmlsoft.org/bugs.html

This is the version:

xmllint: using libxml version 20626

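Answer 1

If you are using a recent ksh, by which I mean a recent version of ksh93, you can actually do this in the shell itself. ksh93 supports compound variable types, which are a little like C structs or XML node trees. It does not natively support XML yet, though I believe that is planned, but it does support json right now.

I used a free online converter to get json output from your sample. After cleaning up your sample a little (the p in </connectionpools> should be capitalized, by the way) I could do: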

print -j queue.services.[@name]

...and get...

GETME

I can also do:

print -j queue.services[1].[@name]

...to get instead...

GETME2

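On the conversion site I had to select tab-delimited output to keep it from inserting a lot of non-breaking spaces, but otherwise it seemed to work fine. Of course, you could just as easily use some local tool to do a similar conversion.

In any case, after copying the json tree to the clipboard you can read it into ksh as I did, for example: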

read -m json queue <<<"$(xsel -bo)"

Once that is done, I can view the whole structure with...

print -j queue

...which prints...

{
    "batchServices": [
        {
            "@name": "batch1",
            "executor": {
                "@className": "com.abc.xyz.qwer.qweqwewqe.ffdsdfsdfsdfsdf"
            }
        },
        {
            "@name": "batch2",
            "executor": {
                "@className": "com.abc.xyz.qwer.qweqwewqe.zxcsadsad"
            }
        }
    ],
    "configfile": "sample.xml",
    "connectionPools": [
        {
            "driver": "oracle.jdbc.driver.OracleDriver",
            "maxConnections": "10",
            "minConnections": "0",
            "name": "asdasd",
            "password": "$asdasd_PASSWORD",
            "poolUrl": "jdbc:asdsad:asdasdsad",
            "testSql": "select * from abc",
            "url": "$asdasd_URL",
            "userId": "$asdasd_USER"
        }
    ],
    "exceptionsFilterConfigFile": "asdasd.xml",
    "keyInfoConfigFile": "asdasd.xml",
    "services": [
        {
            "@backend": "ABC",
            "@idleTime": "300",
            "@max": "10",
            "@min": "1",
            "@name": "GETME",
            "handlerContainer": {
                "@className": "com.abc.xyz.wqere.abcqwere",
                "handler": {
                    "@className": "com.abc.xyz.qweqweqwe.werwerwerwer"
                }
            },
            "mqListener": {
                "@copyMessageId": "true",
                "@maxExpiry": "500",
                "@minExpiry": "4",
                "@queue": "ABC.getme",
                "@suggExpiry": "30"
            }
        },
        {
            "@backend": "ABC",
            "@idleTime": "300",
            "@max": "10",
            "@min": "1",
            "@name": "GETME2",
            "handlerContainer": {
                "@className": "com.abc.xyz.wqere.abcqwere",
                "handler": {
                    "@className": "com.abc.xyz.qweqweqwe.werwerwerwer"
                }
            },
            "mqListener": {
                "@copyMessageId": "true",
                "@maxExpiry": "500",
                "@minExpiry": "4",
                "@queue": "ABC.getme2",
                "@suggExpiry": "30"
            }
        }
    ]
}

Answer 2

As mentioned in the comments above, xmllint can be used like this:

xmllint --xpath '//service[@name="GETME"]' Sample.xml

The --xpath option is available at least as of libxml version 20903.

A primer on XPath syntax can be found here: http://www.w3schools.com/xpath/xpath_syntax.asp or, more authoritatively, here: https://www.w3.org/Consortium/Offices/Presentations/XSLT_XPATH/#(23)
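
The question supplies a queue name rather than a service name, so the predicate can test the queue attribute of the child mqListener instead. The following is a minimal sketch, assuming an xmllint build that has --xpath (the asker's libxml 20626 does not list it) and a well-formed input file. The function name service_by_queue is hypothetical.

```shell
# Hypothetical helper: select the <service> whose <mqListener> has the given queue.
# Assumes xmllint supports --xpath and the queue name contains no single quote.
service_by_queue() {
    # $1 = queue name, $2 = XML file
    xmllint --xpath "//service[mqListener/@queue='$1']" "$2"
}

# Usage, with the names from the question:
# service_by_queue ABC.getme2 Sample.xml
```

Because the comparison is done by the XPath engine on the parsed attribute value, it does not depend on how the attributes are laid out across lines.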

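Answer 3

Given these constraints:

  1. I cannot use any XML parser tool because I do not have permission (read-only access).

  2. My xmllint version does not support xpath, and I cannot update it (read-only access).

  3. I do not have xmlstarlet and cannot install it.

I resorted to looking for other, unconventional solutions. This awk command does what I need: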

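awk '
  # start of a service block: reset the buffer and the match flag
  /<service.*name=/ { f=1 ; m=0 ; res="" }
  # inside a block: buffer the current line (including the closing tag)
  f { res = res $0 ORS }
  f && /mqListener queue="ABC.getme2"/ { m=1 }
  # end of block: print the buffer once if the queue matched
  /<\/service>/ { f=0 ; if (m) printf "%s", res }
' Sample.xml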
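
The queue name in the command above is hard-coded, while the question asks for it to come from a variable. A minimal sketch of a parameterized variant follows, assuming an awk that accepts -v (any POSIX awk) and the same one-tag-per-line layout as Sample.xml. The function name service_for_queue is hypothetical, and index() is used so that the dot in the queue name is compared literally rather than as a regex wildcard.

```shell
# Hypothetical helper: print the <service> block whose mqListener listens on a queue.
# Assumes each tag sits on a single line, as in Sample.xml.
service_for_queue() {
    # $1 = queue name, $2 = XML file
    awk -v q="queue=\"$1\"" '
        /<service[ \t]/ { f = 1; m = 0; res = "" }    # a block starts: reset the buffer
        f               { res = res $0 ORS }           # buffer every line of the block
        f && /<mqListener/ && index($0, q) { m = 1 }   # literal match on queue="NAME"
        /<\/service>/   { if (f && m) printf "%s", res; f = 0 }
    ' "$2"
}

# Usage, with the names from the question:
# QUEUENAME=ABC.getme2
# service_for_queue "$QUEUENAME" Sample.xml
```

Including the closing double quote in the search string keeps ABC.getme from also matching ABC.getme2.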

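Special thanks to @Janis for helping me here: "How to implement an awk range pattern to get an XML block when the input parameter is in the middle of the block".

Answer 4

OK, first and foremost: do not use grep. XML is not a format suited to regex-based parsing. Use an XML parser instead.

My favorite XML parser is actually a perl module called XML::Twig: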

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

my ($keyword, $filename) = @ARGV;

XML::Twig->new(
    'pretty_print'  => 'indented_a',
    'twig_handlers' => {
        'service[@name="'.$keyword.'"]' => sub { $_->print }
    }
)->parsefile($filename);

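Invoking it as myscript.pl GETME yourxml will print any matches. (Change pretty_print to whichever format you prefer.)

XML::Twig also comes bundled with several example tools, such as xml_grep, which can probably do much of what you want.

Using your sample XML above (slightly modified, because it is not valid as posted, and I have assumed that your source XML really is valid and this is a transcription error), the output is: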

<service
        backend="ABC"
        idleTime="300"
        max="10"
        min="1"
        name="GETME">
      <handlerContainer className="com.abc.xyz.wqere.abcqwere">
        <handler className="com.abc.xyz.qweqweqwe.werwerwerwer" />
      </handlerContainer>
      <mqListener
          copyMessageId="true"
          maxExpiry="500"
          minExpiry="4"
          queue="ABC.getme"
          suggExpiry="30"
      />
    </service>

Note: this format is the indented_a style of XML::Twig. Others are available. It at least partly illustrates why regex and line-based matching of XML is dangerous.
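
To make that danger concrete, here is a small sketch. The file names are placeholders created only for the demonstration. The same mqListener element is written once on a single line and once with its attributes on separate lines, as in the indented_a output above, and a line-based pattern is then counted in each file.

```shell
# Two equivalent serializations of one element; file names are illustrative only.
dir=$(mktemp -d)
printf '%s\n' '<mqListener queue="ABC.getme" suggExpiry="30"/>' > "$dir/oneline.xml"
printf '%s\n' '<mqListener' '    queue="ABC.getme"' '    suggExpiry="30"' '/>' > "$dir/indented.xml"

# Count the lines that contain the fixed string in each file.
# grep exits non-zero when nothing matches, hence the "|| true".
pattern='mqListener queue="ABC.getme"'
oneline_hits=$(grep -cF "$pattern" "$dir/oneline.xml" || true)
indented_hits=$(grep -cF "$pattern" "$dir/indented.xml" || true)

echo "single-line form: $oneline_hits matching line(s)"
echo "indented form:    $indented_hits matching line(s)"
rm -rf "$dir"
```

The single-line file yields one matching line and the indented file yields none, although an XML parser treats the two as the same element.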