In an XML-like text file, how to find the nth word-containing “

Question 1

$ awk -F"[<>]" '{for(i=2;i<=NF;i+=2){print ++j" - "$i}}' input.xml
1 - note
2 - to
3 - /to
4 - from
5 - /from
6 - heading
7 - /heading
8 - body
9 - /body
10 - /note
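
As a minimal sketch of the goal in the title, the one-liner above can be adapted to print only the nth tag instead of numbering all of them. The sample file is an assumption, reconstructed from the outputs shown in this thread, and nth_tag is a hypothetical helper name chosen for illustration.

```shell
# Hypothetical sample input, reconstructed from the outputs shown in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
EOF

# nth_tag N FILE: print the Nth tag (opening or closing) found in FILE.
nth_tag() {
    awk -F'[<>]' -v n="$1" '
        {
            # With < and > as separators, every even-numbered field is a tag name.
            for (i = 2; i <= NF; i += 2)
                if (++count == n) { print $i; exit }
        }
    ' "$2"
}

nth_tag 4 "$sample"
```

With the sample above this prints from. The approach treats every < and > as a delimiter, so a tag with attributes comes back with its attributes attached, and a literal > in text content would throw the count off.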

Question 2

Note: this answer was written before the user explained that the XML is not well-formed. I am leaving it here because it may be helpful to others.

XMLStarlet can generate the element structure of an XML document:

$ xml el file.xml
note
note/to
note/from
note/heading
note/body

This differs from your expected output, but it may be enough for what you want to achieve.

It can also convert XML to PYX, which puts start and end tags on separate lines:

$ xml pyx file.xml
(note
-\n
(to
-Tove
)to
-\n
(from
-Jani
)from
-\n
(heading
-Reminder
)heading
-\n
(body
-Don't forget me this weekend!
)body
-\n
)note

From this, it is easy to get the output you want:

$ xml pyx file.xml | sed -n -e 's/^(//p' -e 's/^)/\//p'| nl
     1  note
     2  to
     3  /to
     4  from
     5  /from
     6  heading
     7  /heading
     8  body
     9  /body
    10  /note

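The sed commands drop every line that does not start with ( or ), strip the leading ( from start tags, and replace the leading ) of end tags with /, as specified in the question. The nl utility then numbers the resulting lines.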
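
The sed step can be tried without XMLStarlet by feeding it the PYX text directly. The sketch below assumes the PYX lines shown earlier in this answer; pyx and tags are hypothetical helper names, and the final sed -n '4p' is an added illustration of picking out a single tag by position.

```shell
# PYX text copied from the "xml pyx" output in this answer,
# so XMLStarlet itself is not needed for this sketch.
pyx() {
cat <<'EOF'
(note
-\n
(to
-Tove
)to
-\n
(from
-Jani
)from
-\n
(heading
-Reminder
)heading
-\n
(body
-Don't forget me this weekend!
)body
-\n
)note
EOF
}

# Keep only tag lines: drop a leading "(" and turn a leading ")" into "/".
# -n suppresses default output, so lines matching neither pattern vanish.
tags() { sed -n -e 's/^(//p' -e 's/^)/\//p'; }

pyx | tags | nl            # numbered list of all tags
pyx | tags | sed -n '4p'   # only the 4th tag
```

The last command prints from, the fourth entry of the numbered list.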

XMLStarlet is sometimes installed as xmlstarlet rather than xml.
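
Since the command name varies, a script can check which one is present before calling it. This is only a sketch: find_xmlstarlet is a hypothetical helper name, and it assumes that a command called xml on the PATH is in fact XMLStarlet, which is not guaranteed on every system.

```shell
# find_xmlstarlet: print the name XMLStarlet is installed under, if any.
# Assumption: a command named "xml" on the PATH is XMLStarlet.
find_xmlstarlet() {
    for name in xmlstarlet xml; do
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1   # neither name was found
}

if xml_cmd=$(find_xmlstarlet); then
    echo "XMLStarlet available as: $xml_cmd"
else
    echo "XMLStarlet not found" >&2
fi
```

The resulting name can then be used in place of xml in the commands above, for example "$xml_cmd" el file.xml.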
