awk: isolate one code block, then iterate over multiple code blocks (if present)

Question 1

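I didn't see the expected output in your question, so I'm not sure, but you did say: "Can awk find the nth iteration of a "{" and return everything up to the next "}" character?" If that is what you want, this does it (using any awk, and assuming } and { can't appear anywhere else in your input):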

$ awk -v n=2 -v RS='}' 'NR==n{gsub(/.*\{\n|\n$/,""); print}' samp3.txt
first       "John"
address     "125 Main Street"
last    "Jacob"
age "30"
gender      "male"
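
To see why this works, it can help to look at how awk splits the input once RS is set to }. The following is only a sketch: it uses a shortened, hypothetical two-block sample in the same layout as samp3.txt, and the names `sample` and `report` are made up for illustration. Each record is the text up to the next }, and the gsub() strips everything through the opening "{" line plus the trailing newline. Whatever follows the final } (typically just a newline) also arrives as one more record.

```shell
# Hypothetical two-block sample in the same layout as samp3.txt
sample=$(mktemp)
cat > "$sample" <<'EOF'
//GROUP1
{
first       "John"
last    "Jones"
}

//GROUP4
{
first       "John"
last    "Jacob"
}
EOF

# For every }-terminated record, strip the header and report how many lines remain
report=$(awk -v RS='}' '{
    gsub(/.*\{\n|\n$/, "")          # drop everything through "{\n" and the final newline
    n = split($0, lines, "\n")      # count the lines left in this block
    printf "record %d: %d line(s)\n", NR, n
}' "$sample")
printf '%s\n' "$report"
rm -f "$sample"
```

NR==n in the one-liner above simply picks one of these records by its number.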

If you want to call it in a shell loop:

$ for i in {1..3}; do
    awk -v n="$i" -v RS='}' 'NR==n{gsub(/.*\{\n|\n$/,""); print}' samp3.txt
    echo "-----"
done
first       "John"
address     "124 Main Street"
last    "Jones"
special     "supervisor"
age "35"
gender      "male"
-----
first       "John"
address     "125 Main Street"
last    "Jacob"
age "30"
gender      "male"
-----
first       "John"
address     "523 Main Street"
last    "Jingle"
age "40"
gender      "male"
-----

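But there is almost certainly a better way to do whatever you're trying to do than calling awk multiple times in a loop. For example, call awk once to print each block terminated by }, then read that output into a shell array for further processing (readarray -d requires bash 4.4 or later):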

$ readarray -d '}' -t arr < <(awk 'BEGIN{RS=ORS="}"} {gsub(/.*\{\n|\n$/,"")} $0~/[^[:space:]]/' samp3.txt)
$ for i in "${arr[@]}"; do printf '%s\n' "$i"; echo "-----"; done
first       "John"
address     "124 Main Street"
last    "Jones"
special     "supervisor"
age "35"
gender      "male"
-----
first       "John"
address     "125 Main Street"
last    "Jacob"
age "30"
gender      "male"
-----
first       "John"
address     "523 Main Street"
last    "Jingle"
age "40"
gender      "male"
-----

But really, whatever you're doing in the shell loop should probably be done inside that one call to awk as well.
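
As a rough illustration of keeping the per-block work inside a single awk call, here is a minimal sketch. It assumes the processing is "report the value of the last field for every block", which is an invented task, not something stated in the question. The sample file and the names `sample` and `summary` are likewise hypothetical.

```shell
# Hypothetical two-block sample in the same layout as samp3.txt
sample=$(mktemp)
cat > "$sample" <<'EOF'
//GROUP1
{
first       "John"
last    "Jones"
}

//GROUP4
{
first       "John"
last    "Jacob"
}
EOF

# One awk invocation visits every block; no shell loop is needed
summary=$(awk -v RS='}' '
    NF {                                  # skip the whitespace-only text after the final }
        gsub(/.*\{\n|\n$/, "")            # keep only the lines between { and }
        n = split($0, lines, "\n")
        for (i = 1; i <= n; i++) {
            split(lines[i], f, " ")       # f[1] is the key, f[2] the start of the value
            if (f[1] == "last") {
                gsub(/"/, "", f[2])       # drop the surrounding quotes
                printf "block %d: %s\n", NR, f[2]
            }
        }
    }' "$sample")
printf '%s\n' "$summary"
rm -f "$sample"
```

Splitting each line on whitespace only works here because the value of last is a single word; a multi-word value such as the address would need different handling.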

Question 2

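My code makes assumptions that may not hold, which means it may fail in many cases. A more efficient solution probably exists.

Assumption 1: each GROUP block is separated from the next by a blank line.

Assumption 2: you want to perform one operation per block.

Assumption 3: the GROUP blocks are numbered incrementally (if they are not, you may end up with a lot of empty files).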

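for i in {1..5}; do
  # RS="" is paragraph mode: each blank-line-separated chunk is one record
  awk -v RS="" -v inc="GROUP$i" '$0 ~ inc' "$inputfile" |
    sed '/\/\|{\|}/d' > "output$i.txt"   # delete the //GROUP, { and } lines (\| is a GNU sed extension)
done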

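Your example has GROUP1 and GROUP4; I added a GROUP5 and wrote a for loop that counts from 1 to 5. Each number in the range serves as the key for finding the matching GROUP block. If there are more groups, increase the range accordingly.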
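
Instead of hard-coding the range, the group numbers could be read from the input itself, which would also avoid empty files for numbers that never occur. This is only a sketch under the assumption that every header line looks exactly like //GROUP followed by digits. The sample file and the name `group_ids` are hypothetical.

```shell
# Hypothetical sample with two group headers
sample=$(mktemp)
cat > "$sample" <<'EOF'
//GROUP1
{
first       "John"
}

//GROUP4
{
first       "Jacob"
}
EOF

# sub() strips the //GROUP prefix; lines that are then purely digits are printed
group_ids=$(awk 'sub(/^\/\/GROUP/, "") && /^[0-9]+$/' "$sample")

# Loop over only the numbers that actually occur in the file
for i in $group_ids; do
    echo "found GROUP$i"
done
rm -f "$sample"
```

The loop body here only echoes; the awk and sed pipeline from the loop above could be placed there instead.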

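awk is used inside the loop to extract each block. sed then cleans it up (awk could do all of this in one go, but I'm still learning), and each block is written to its own output file, named after the number of its GROUP block.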
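
For comparison, here is one way the whole job might look as a single awk call with no loop and no sed. It is a sketch rather than a drop-in replacement: it assumes each //GROUP header appears only once and that data lines never start with { or }. The directory `workdir`, the shortened input file, and the awk variables `dir`, `num` and `out` are all made up for this example.

```shell
# Work in a scratch directory with a shortened, hypothetical input file
workdir=$(mktemp -d)
cat > "$workdir/input.txt" <<'EOF'
//GROUP1
{
first       "John"
last    "Jones"
}

//GROUP4
{
first       "John"
last    "Jacob"
}
{
first       "John"
last    "Jingle"
}
EOF

awk -v dir="$workdir" '
    /^\/\/GROUP[0-9]+/ {                  # header line: switch to a new output file
        if (out != "") close(out)
        num = $0
        sub(/^\/\/GROUP/, "", num)        # remove the prefix
        sub(/[^0-9].*$/, "", num)         # remove anything after the digits
        out = dir "/output" num ".txt"
        next
    }
    /^[{}]/ || !NF { next }               # skip brace lines and blank lines
    out != "" { print > out }             # data line: write it to the current group file
' "$workdir/input.txt"

cat "$workdir/output4.txt"
```

Because the output file changes only when a header line is seen, this version does not depend on blank lines between groups, and it creates files only for groups that exist.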

Input file

//GROUP1
{
first       "John"
address     "124 Main Street"
last    "Jones"
special     "supervisor"
age "35"
gender      "male"
}

//GROUP4
{
first       "John"
address     "125 Main Street"
last    "Jacob"
age "30"
gender      "male"
}
{
first       "John"
address     "523 Main Street"
last    "Jingle"
age "40"
gender      "male"
}

//GROUP5
{
first       "Maria"
address     "188 John Street"
last    "Phones"
special     "Supervisors supervisor"
age "35"
gender      "Female"
}

Output

cat output1.txt
first       "John"
address     "124 Main Street"
last    "Jones"
special     "supervisor"
age "35"
gender      "male"

cat output4.txt
first       "John"
address     "125 Main Street"
last    "Jacob"
age "30"
gender      "male"
first       "John"
address     "523 Main Street"
last    "Jingle"
age "40"
gender      "male"

cat output5.txt
first       "Maria"
address     "188 John Street"
last    "Phones"
special     "Supervisors supervisor"
age "35"
gender      "Female"

Answer