I have some text that looks like this:
<DIV>SOFTWARE V1.0.1.0.RDZCUAJ DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.2.0.VWZMXQE DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.3.0.GSVZQKE DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.4.0.UIUVAZD DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.5.0.ELBXBGB DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
I need to delete the 8 characters that come right before " DOWNLOAD</DIV>", so that the result looks like this:
<DIV>SOFTWARE V1.0.1.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.2.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.3.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.4.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.5.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
Is it possible to do this with sed or awk?
Thanks in advance for any help!
Answer 1
A very simple approach is the following:
$ sed 's,.........DOWNLOAD</DIV>, DOWNLOAD</DIV>,g' input.txt
<DIV>SOFTWARE V1.0.1.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.2.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.3.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.4.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.5.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
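This replaces the 9 characters before DOWNLOAD</DIV> (the 8 unwanted characters plus the space that follows them) with a single space followed by DOWNLOAD</DIV>.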
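Since the question also asks about awk, here is a minimal sketch of the same fixed-width idea using awk's sub(). The function name strip_suffix_awk is hypothetical, and the sample lines are taken from the question. It uses eight dots plus a literal space rather than nine dots, and avoids interval expressions such as {8} because some older awk implementations may not support them.

```shell
#!/bin/sh
# Hypothetical helper (not from the answers): strip the 8 characters
# that come right before " DOWNLOAD</DIV>" using awk's sub().
strip_suffix_awk() {
    # Eight dots match the 8 unwanted characters; the literal tail is
    # written back by the replacement string. The trailing 1 prints every line.
    awk '{ sub(/........ DOWNLOAD<\/DIV>/, " DOWNLOAD</DIV>") } 1' "$@"
}

# Sample input taken from the question.
printf '%s\n' \
    '<DIV>SOFTWARE V1.0.1.0.RDZCUAJ DOWNLOAD</DIV>' \
    '<DIV>...</DIV>' |
    strip_suffix_awk
```

Called with a file name, as in strip_suffix_awk input.txt, it reads that file instead of standard input.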
Answer 2
With a sed that supports -E to enable EREs (e.g. GNU sed and BSD/OSX sed):
sed -E 's:.{8}( DOWNLOAD</DIV>):\1:' file
Otherwise, with any POSIX sed:
sed 's:.\{8\}\( DOWNLOAD</DIV>\):\1:' file
For example:
$ sed -E 's:.{8}( DOWNLOAD</DIV>):\1:' file
<DIV>SOFTWARE V1.0.1.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.2.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.3.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.4.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.5.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
$ sed 's:.\{8\}\( DOWNLOAD</DIV>\):\1:' file
<DIV>SOFTWARE V1.0.1.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.2.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.3.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.4.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.5.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
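Both commands match " DOWNLOAD</DIV>" anywhere in a line. If the real file could contain that text in the middle of a line, one possible refinement is to anchor the match at the end of the line with $. The sketch below assumes the header lines end exactly with DOWNLOAD</DIV> (no trailing spaces or carriage returns). The function name strip_suffix_anchored and the second sample line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical variant of the POSIX command above: the trailing $ limits
# the substitution to lines that END with " DOWNLOAD</DIV>".
strip_suffix_anchored() {
    sed 's:.\{8\}\( DOWNLOAD</DIV>\)$:\1:' "$@"
}

# The first line comes from the question; the second is an invented line
# where the marker appears mid-line and is therefore left untouched.
printf '%s\n' \
    '<DIV>SOFTWARE V1.0.2.0.VWZMXQE DOWNLOAD</DIV>' \
    '<DIV>CLICK HERE DOWNLOAD</DIV> NOW</DIV>' |
    strip_suffix_anchored
```

Without the $, the second line would also lose the 8 characters before its marker, which may or may not be what is wanted.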
Answer 3
You could try this:
sed 's#SOFTWARE \(.*\)\.[A-Z]\{7\} DOWNLOAD#SOFTWARE \1 DOWNLOAD#' file
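This answer shows no output, so here is a small sketch that runs the same command on a line from the question. Unlike the fixed-width versions, this pattern only fires when the unwanted part is a literal dot followed by exactly 7 uppercase letters. The wrapper name strip_suffix_strict and the second sample line (with digits instead of letters) are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the command from this answer.
strip_suffix_strict() {
    # \(.*\) keeps the version number; \.[A-Z]\{7\} is the part that is dropped.
    sed 's#SOFTWARE \(.*\)\.[A-Z]\{7\} DOWNLOAD#SOFTWARE \1 DOWNLOAD#' "$@"
}

# The first line is from the question and gets its suffix removed.
# The second line is invented: its suffix is 7 digits, so it does not
# match [A-Z]\{7\} and is printed unchanged.
printf '%s\n' \
    '<DIV>SOFTWARE V1.0.3.0.GSVZQKE DOWNLOAD</DIV>' \
    '<DIV>SOFTWARE V1.0.3.0.1234567 DOWNLOAD</DIV>' |
    strip_suffix_strict
```

The stricter pattern is a trade-off: it is safer against accidental matches, but it would silently skip any line whose suffix does not follow the dot-plus-7-letters shape.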
Answer 4
Using Raku (née Perl6)
~$ raku -pe 's/ <(. ** 8)> <?before " DOWNLOAD</DIV>" $$ > //;' download.txt
<DIV>SOFTWARE V1.0.1.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.2.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.3.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.4.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
<DIV>SOFTWARE V1.0.5.0 DOWNLOAD</DIV>
<DIV>...</DIV>
<DIV>...</DIV>
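The code above combines the -pe autoprinting flags with the s/// substitution operator. In the left half of s///, a zero-width lookahead assertion, <?before ... >, finds the " DOWNLOAD</DIV>" marker at the end of the line ($$), and the 8 characters before it are captured (and deleted) by the <(. ** 8)> code, since only the part between the <( and )> capture markers counts as the match that gets replaced.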
HTH.