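I have a text file, and I want a single command that separates the output as follows:
- Print all lines in which every character is part of a run of consecutive repeated characters.
- Print all lines in which every character is consecutively repeated except the last one or two characters of the line.
- Print all lines in which every character is consecutively repeated except the first one or two characters of the line.
Example input:
11122323
1112266
44778
223334456
6778811
845511
3357788
The output should be: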
1112266 >>>>> All repeated characters.
44778 >>>>> All repeated except the last character.
223334456 >>> All repeated except the last two characters
6778811 >>>> All repeated except the first character.
845511 >>>> All repeated except the first two characters.
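Non-repeated characters are allowed, but only as the first one or two characters at the start of the line or the last one or two characters at the end. The first line is excluded because its 3s are not consecutively repeated.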
Answer 1
A slight adaptation of the recent answer to your similar question:
awk '
{
    split("", N)                    # clear the N array
    P = 1                           # reset boolean P used for the print decision
    L = length                      # length of the current line
    for (i = 1; i <= L; i++)        # count each char; chars in the first/last two positions count double
        N[substr($0, i, 1)] += ((i < 3) || (i > L - 2)) ? 2 : 1
    for (n in N)
        if (N[n] < 2) {             # a char counted fewer than twice:
            P = 0                   # clear the print decision
            break                   # and quit the for loop
        }
}
P                                   # print if non-duplicate chars exist only at the margins
' file
1112266
44778
223334456
6778811
845511
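The script above counts how often each character occurs rather than checking that the occurrences are adjacent, so a line such as 11122323 can still pass the count test. A minimal sketch of a run-based check follows. It assumes one value per line and that a lone (non-repeated) character is acceptable only within the first two or last two positions of the line. The function name `filter_runs` is hypothetical, chosen only for this illustration.

```shell
# Hypothetical helper: print lines whose characters all sit in runs of
# length >= 2, except for lone characters in the first/last two positions.
filter_runs() {
    awk '
    {
        L  = length($0)
        ok = 1                          # assume the line qualifies
        i  = 1
        while (i <= L) {
            c = substr($0, i, 1)
            j = i                       # extend j to the end of the run starting at i
            while (j < L && substr($0, j + 1, 1) == c)
                j++
            # a run of length 1 outside the two-character margins disqualifies the line
            if (j == i && i > 2 && i <= L - 2) {
                ok = 0
                break
            }
            i = j + 1                   # jump to the start of the next run
        }
    }
    ok                                  # print the line if it still qualifies
    ' "$@"
}
```

It can be called as `filter_runs file`, or fed from a pipe when no file name is given. Scanning run by run, instead of tallying characters, means that a character repeated elsewhere in the line does not compensate for an isolated occurrence in the middle.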