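I want to filter a text file with a single command so that the output is as follows:
- Print all lines in which every character is part of a consecutive repeat.
- Print all lines in which every character is part of a consecutive repeat, except for the last one or last two characters of the line.
- Print all lines in which every character is part of a consecutive repeat, except for the first one or first two characters of the line.
Example: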
11122323
1112266
44778
223334456
6778811
845511
3357788
The output should be:
1112266 >>>>> All repeated characters.
44778 >>>>> All repeated except the last character.
223334456 >>> All repeated except the last two characters.
6778811 >>>> All repeated except the first character.
845511 >>>> All repeated except the first two characters.
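Characters that are not part of a consecutive repeat are allowed, but only as the first or second character counting from the beginning or the end of the line. The first line (11122323) is excluded because it has 3 characters that are not consecutive repeats (the trailing 3, 2, 3).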
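To make the rule concrete, here is a small sketch that splits each line into runs of identical characters. The function name show_runs is hypothetical and only serves as an illustration. A run of length 1 is a character that is not a consecutive repeat.

```shell
# Hypothetical helper: print each input line followed by its runs
# of identical consecutive characters.
show_runs() {
  awk '
  {
    L = length($0)
    out = ""
    i = 1
    while (i <= L) {
      c = substr($0, i, 1)
      j = i
      # extend j to the last position of the current run
      while (j < L && substr($0, j + 1, 1) == c) j++
      out = out " " substr($0, i, j - i + 1)
      i = j + 1
    }
    print $0 ":" out
  }' "$@"
}

printf '%s\n' 11122323 44778 | show_runs
# 11122323: 111 22 3 2 3
# 44778: 44 77 8
```

In 11122323 the single-character run at position 6 lies outside the first two and last two positions, which is why that line should be rejected, while the lone 8 in 44778 is the last character and is therefore allowed.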
I tried the following command, but it also matches lines whose repeated characters are not consecutive.
awk '
{
  split("", N)                  # clear the N array
  P = 1                         # reset boolean P used for the print decision
  L = length
  for (i = 1; i <= L; i++)      # count each char; first/last two positions weigh double
    N[substr($0, i, 1)] += ((i < 3) || (i > L - 2)) ? 2 : 1
  for (n in N)
    if (N[n] < 2) {             # a char that is not duplicated: do not print
      P = 0
      break                     # and quit the for loop
    }
}
P                               # print if non-duplicate chars exist only at margins
' file
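
The command above counts how often each character occurs in the whole line, not whether the occurrences are adjacent, so 11122323 passes: 2 and 3 each occur more than once even though the trailing 3, 2, 3 are not consecutive repeats. A possible direction, as a minimal sketch and not a tested solution, is to check positions of runs instead of counts. The function name filter_runs is hypothetical, and the sketch assumes that exceptions may appear at both ends of the same line, which the description above does not state explicitly.

```shell
# Hypothetical sketch: print lines in which every run of length 1
# lies within the first two or the last two positions of the line.
filter_runs() {
  awk '
  {
    L = length($0)
    ok = 1
    i = 1
    while (i <= L) {
      c = substr($0, i, 1)
      j = i
      # extend j to the last position of the current run
      while (j < L && substr($0, j + 1, 1) == c) j++
      # a single character outside the margins disqualifies the line
      if (j == i && i > 2 && i < L - 1) { ok = 0; break }
      i = j + 1
    }
    if (ok && L > 0) print
  }' "$@"
}

printf '%s\n' 11122323 1112266 44778 223334456 6778811 845511 3357788 | filter_runs
# 1112266
# 44778
# 223334456
# 6778811
# 845511
```

Because the function passes its arguments on to awk, it can also be called as filter_runs file. Under the stated assumption, lines of four characters or fewer always pass, since every position is within two of one end.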