Why does my script produce this error? awk: script.awk:19: syntax error

Answer 1

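Rather than trying to read the files sequentially and control the flow based on FNR/NR, why not use getline to read 2.txt, split each line on ';', and then build an output string (o below) that concatenates the unique components from each line? You could do something like: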

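awk '{
        # echo the current line of 1.txt without a trailing newline
        printf "%s", $0
    }
    /^BB/ {
        o = ""
        # read 2.txt line by line into tmp
        while (getline tmp < "2.txt") {
            n = split (tmp,arr,";")
            for (i=1; i<=n; i++)
                # append a component only if it is in neither the line nor o yet
                if(!match($0,arr[i]) && !match(o,arr[i]))
                    o=o arr[i]";"
        }
        printf "%s", o
    }
    {
        # finish the output line
        print ""
    }
' 1.txt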

Example Use/Output

With your example data in 1.txt and 2.txt (you misnamed 1.txt again), you would receive:

$ awk '{
>         printf "%s", $0
>     }
>     /^BB/ {
>         o = ""
>         while (getline tmp < "2.txt") {
>             n = split (tmp,arr,";")
>             for (i=1; i<=n; i++)
>                 if(!match($0,arr[i]) && !match(o,arr[i]))
>                     o=o arr[i]";"
>         }
>         printf "%s", o
>     }
>     {
>         print ""
>     }
> ' 1.txt
AA;00000;
BB;11111;KK;WW;55555;FF;ZZ;RR;YY;
GG;22222;

That looks like what you are after.
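
As a hedged aside, here is a slightly more defensive sketch of the same idea. It compares `(getline ...) > 0` so the loop also ends on a read error (getline returns -1 when the file cannot be opened), calls close() so a second BB line could reread the file, and uses index() so components are compared as literal text rather than as regular expressions. The scratch directory `dir`, the variable `other`, and the sample data are assumptions on my part (the data is reconstructed from the output shown above), so treat this as a sketch rather than a drop-in replacement.

```shell
# Hypothetical scratch directory and sample data (assumed, reconstructed from the output above)
dir=$(mktemp -d)
printf '%s\n' 'AA;00000;' 'BB;11111;' 'GG;22222;' > "$dir/1.txt"
printf '%s\n' 'KK;WW;55555;11111;' 'KK;FF;ZZ;11111;' 'KK;RR;YY;11111;' > "$dir/2.txt"

result=$(awk -v other="$dir/2.txt" '
{ printf "%s", $0 }
/^BB/ {
    o = ""
    # "> 0" ends the loop on end of file (0) and on a read error (-1)
    while ((getline tmp < other) > 0) {
        n = split(tmp, arr, ";")
        for (i = 1; i <= n; i++)
            # index() does a literal substring test; empty pieces are skipped explicitly
            if (arr[i] != "" && !index($0, arr[i]) && !index(o, arr[i]))
                o = o arr[i] ";"
    }
    close(other)    # rewind, so a later BB line would read the file from the start
    printf "%s", o
}
{ print "" }
' "$dir/1.txt")

printf '%s\n' "$result"
```

Note that index() is still a substring test, just like match(), so a component such as K would count as already present once KK is in the line.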

As a Script Taking the Two Filenames as Arguments

Windows should follow the same ARGV convention. Note that when running this as a script file, you do not put single quotes around the awk rules, e.g.

#!/usr/bin/awk -f 

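# stop as soon as awk moves on to the second file argument;
# that file is read with getline below instead
NR != FNR {
    exit
}
{
    printf "%s", $0
}
/^BB/ {
    o = ""
    # ARGV[2] is the second filename given on the command line
    while (getline tmp < ARGV[2]) {
        n = split (tmp,arr,";")
        for (i=1; i<=n; i++)
            # append a component only if it is in neither the line nor o yet
            if(!match($0,arr[i]) && !match(o,arr[i]))
                o=o arr[i]";"
    }
    printf "%s", o
}
{
    print ""
}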

(Note: you will need to change the /usr/bin/awk interpreter path to wherever awk lives on your system)

Usage is, e.g., ./test.awk 1.txt 2.txt
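
Another way to keep awk from treating the second filename as ordinary input is to blank its ARGV entry in a BEGIN rule, since awk skips ARGV elements that are empty strings. The sketch below does that instead of exiting when NR != FNR. It runs the script with `awk -f` so no interpreter path is needed, and it compares components with index() and an explicit empty-string check so the result does not depend on how a given awk handles an empty pattern. The scratch directory `dir`, the variable `other`, and the sample data are my assumptions, not something taken from the question.

```shell
# Hypothetical scratch directory and sample data (assumed, not from the post)
dir=$(mktemp -d)
printf '%s\n' 'AA;00000;' 'BB;11111;' 'GG;22222;' > "$dir/1.txt"
printf '%s\n' 'KK;WW;55555;11111;' 'KK;FF;ZZ;11111;' 'KK;RR;YY;11111;' > "$dir/2.txt"

cat > "$dir/test.awk" <<'EOF'
BEGIN {
    other = ARGV[2]    # remember the second filename
    ARGV[2] = ""       # empty ARGV entries are skipped, so only the first file is main input
}
{ printf "%s", $0 }
/^BB/ {
    o = ""
    while ((getline tmp < other) > 0) {
        n = split(tmp, arr, ";")
        for (i = 1; i <= n; i++)
            if (arr[i] != "" && !index($0, arr[i]) && !index(o, arr[i]))
                o = o arr[i] ";"
    }
    close(other)
    printf "%s", o
}
{ print "" }
EOF

result=$(awk -f "$dir/test.awk" "$dir/1.txt" "$dir/2.txt")
printf '%s\n' "$result"
```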

Let me know if this helps.

Answer 2

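Using the keys of an associative array is a handy way to deal with duplicates. This requires GNU awk for true multidimensional arrays (arrays of arrays).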

BEGIN { FS = OFS = ";" }
NR == FNR {
    for (i=1; i<NF-1; i++)
        f2[$(NF-1)][$i] = ++n
    next
}
FNR == 1 {
    # this joins all the 2nd-level indices
    # the order of them is undefined.
    for (x in f2) {
        s = ""
        for (y in f2[x])
            s = s y OFS
        a[x] = s
    }
}
$(NF - 1) in a { $NF = a[$(NF-1)] }
1

Then

gawk -f script.awk {2,1}.txt

produces

AA;00000;
BB;11111;55555;WW;KK;RR;YY;FF;ZZ;
GG;22222;
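
If gawk is not available, or the undefined order of the joined values is a problem, a roughly equivalent sketch can be written with the classic `(key, value)` compound subscripts that any POSIX awk supports. It appends each value the first time it is seen, so the output keeps first-seen order. The array names `seen` and `list`, the scratch directory `dir`, and the sample data are my assumptions; this is one possible variant, not a statement of how the script above must be written.

```shell
# Hypothetical scratch directory and sample data (assumed, matching the output above)
dir=$(mktemp -d)
printf '%s\n' 'AA;00000;' 'BB;11111;' 'GG;22222;' > "$dir/1.txt"
printf '%s\n' 'KK;WW;55555;11111;' 'KK;FF;ZZ;11111;' 'KK;RR;YY;11111;' > "$dir/2.txt"

result=$(awk '
BEGIN { FS = OFS = ";" }
NR == FNR {                       # first file on the command line (2.txt)
    key = $(NF-1)                 # last non-empty field is the join key
    for (i = 1; i < NF-1; i++)
        if (!((key, $i) in seen)) {
            seen[key, $i] = 1     # compound subscript, joined internally with SUBSEP
            list[key] = list[key] $i OFS
        }
    next
}
$(NF-1) in list { $NF = list[$(NF-1)] }   # fill the empty last field
1
' "$dir/2.txt" "$dir/1.txt")

printf '%s\n' "$result"
```

With this data the BB line should come out as BB;11111;KK;WW;55555;FF;ZZ;RR;YY; because the values are appended in the order they are first read.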

I'd need more evidence that it "doesn't work" with URLs:

$ cat 1.txt
AA;http://a.o/f/i.p?t=00000;
BB;http://a.o/f/i.p?t=11111;
GG;http://a.o/f/i.p?t=22222;

$ cat 2.txt
KK;WW;55555;http://a.o/f/i.p?t=11111;
KK;FF;ZZ;http://a.o/f/i.p?t=11111;
KK;RR;YY;http://a.o/f/i.p?t=11111;

$ gawk -f script.awk {2,1}.txt
AA;http://a.o/f/i.p?t=00000;
BB;http://a.o/f/i.p?t=11111;55555;WW;KK;RR;YY;FF;ZZ;
GG;http://a.o/f/i.p?t=22222;
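
One guess as to where URLs could cause trouble (an assumption on my part, not something confirmed in the question): any approach that hands a field to match() treats that field as a regular expression, so characters such as `.` and `?` in a URL become operators instead of literal text. The array-key lookup used in the script above compares plain strings, so it is not affected. The small sketch below contrasts match() with the literal index() on one of the sample URLs; the variable names `url` and `line` are only for illustration.

```shell
result=$(awk 'BEGIN {
    url  = "http://a.o/f/i.p?t=11111"
    line = "BB;" url ";"
    # match() compiles url as a regex: "p?" means an optional p, so the literal "?" never matches
    # index() looks for url as plain text
    printf "match=%d index=%d\n", match(line, url), index(line, url)
}')

printf '%s\n' "$result"
# → match=0 index=4
```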
