Our input looks like this:
2012-04-17 [GBPGBP]
2012-04-13 [GBP GBP]
2012-04-13 [GBP]
2012-04-11 [GBPGBP]
2012-04-11 [GBP GBP]
2012-04-10 [GBPGBP]
2012-04-06 [GBP GBP GBP]
2012-04-17 [GBPGBP]
2012-04-13 [GBP CDN]
2012-04-13 [GBP]
2012-04-11 [GBPCDN]
2012-04-11 [GBP DL DL]
2012-04-10 [PSGBP]
2012-04-06 [PS PS]
We would like to get output like this:
2012-04-17 [GBP]
2012-04-13 [GBP]
2012-04-13 [GBP]
2012-04-11 [GBP]
2012-04-11 [GBP]
2012-04-10 [GBP]
2012-04-06 [GBP]
2012-04-17 [GBP]
2012-04-13 [GBP CDN]
2012-04-13 [GBP]
2012-04-11 [GBPCDN]
2012-04-11 [GBP DL]
2012-04-10 [PSGBP]
2012-04-06 [PS]
Basically, we want to remove any duplicated string inside the brackets. Any suggestions?
Answer 1
sed -e ': a' -e 's/\(\[[^][]*\)\([A-Z][A-Z][A-Z]*\)\([^][]*\)\2/\1\2\3/' -e 't a'
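: a
sets a label named a at the start of the script.
s/\(wibble\)\(foo\)\(bar\)\2/\1\2\3/
replaces wibblefoobarfoo with wibblefoobar. In the actual command, \1 is the opening bracket plus anything before the repeated code, \2 is the code itself, \3 is whatever lies between the two copies, and the trailing \2 is the second copy, which gets dropped.
[A-Z][A-Z][A-Z]*
matches two or more uppercase letters.
t a
loops back to the label a if the preceding s command made a substitution, so the substitution is repeated until no duplicates remain.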
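For comparison, here is a minimal Python sketch of the same clean-up. It is not a port of the sed command: it splits the bracket contents into tokens instead, partly because the substitution above keeps whatever sits between the two copies (group 3), including a separating space. The function names `collapse_token` and `dedupe_brackets` are made up for illustration, and the sketch assumes that a token such as GBPGBP is one code of two or more capitals written twice, while GBPCDN and PSGBP are left alone, as in the desired output.

```python
import re

# A token made of one block of 2+ capitals repeated back to back, e.g. "GBPGBP".
REPEATED = re.compile(r"([A-Z]{2,}?)\1+")
# The bracketed part of a line, e.g. "[GBP DL DL]".
BRACKETED = re.compile(r"\[([^\]\[]*)\]")


def collapse_token(token):
    """Reduce "GBPGBP" to "GBP"; leave "GBPCDN" and "GBP" unchanged."""
    match = REPEATED.fullmatch(token)
    return match.group(1) if match else token


def dedupe_brackets(line):
    """Drop repeated codes inside each [...] group, keeping first-seen order."""
    def fix(match):
        seen = []
        for token in match.group(1).split():
            token = collapse_token(token)
            if token not in seen:
                seen.append(token)
        return "[" + " ".join(seen) + "]"
    return BRACKETED.sub(fix, line)


if __name__ == "__main__":
    samples = ["2012-04-17 [GBPGBP]", "2012-04-11 [GBP DL DL]", "2012-04-10 [PSGBP]"]
    for sample in samples:
        print(dedupe_brackets(sample))
```

Rebuilding the bracket from a de-duplicated token list means the separating spaces are normalised to single spaces, so no stray space is left behind after a removed duplicate.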