Replacing many special characters with escaped characters using a table or script in sed?

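If you want to replace special characters using sed, there are several ways to do it. The problem is that I have to replace many (100+) special characters with their escaped forms, across many files.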

So what is needed is: (thanks to Peter)

^^ to escape a single ^
^| to escape |
\& to escape &
\/ to escape /
\\ to escape \

Suppose there are 100+ strings like these examples, in many files:

sed.exe -i "s/{\*)(//123/
sed -i "s/\\/123/g;" 1.txt
sed.exe -i "s/{\*)(//123/
sed -i "s/\\/123/g;" 1.txt
.....
.....

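These strings contain many special characters that need escaping (and we have 100+ strings).
Escaping them by hand is a very long job. So I need to create something like a table or script for sed that can be called from the command prompt to escape the special characters and then replace them with my words.
How can I do that?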

Answer 1

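Note that ^^ for ^, ^| for |, and ^& for & are not requirements of sed. The ^ escape character is required by the CMD shell. If your text is exposed neither to the command line nor to a command argument in a .cmd/.bat command script, then you only need to consider sed's escape character, which is a backslash \. These are two quite separate scopes (they can overlap, so it is best to keep everything within sed's scope, as shown below).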
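To illustrate the point about keeping everything in sed's scope, here is a minimal sketch for a POSIX-style shell (the file names `sample.txt` and `demo.sed` are made up for the demo). Because the substitution lives in a script file that is read with `sed -f`, no shell ever parses it, so only sed's own backslash escape is needed.

```shell
# Hypothetical demo files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'a^b|c&d' > "$tmp/sample.txt"

# The sed command is stored in a script file, so no shell parses it.
# Only sed's own escape (\^) is needed, because ^ starts the pattern;
# | and & are ordinary characters in a basic regular expression.
printf '%s\n' 's/\^b|c&d/X/' > "$tmp/demo.sed"

result=$(sed -f "$tmp/demo.sed" "$tmp/sample.txt")
printf '%s\n' "$result"
```

The same `demo.sed` file could be passed to `sed.exe -f` from a .cmd script without adding any ^ escapes, since CMD only sees the file name.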

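Here is a sed script that will replace any number of find-strings that you specify with their complementary replacement-strings. The general format of the strings is a cross between a sed substitution command (s/abc/xyz/p) and a tabular format. You can "stretch" the middle delimiter so that you can line things up.
You can use a FIXED string pattern (F/...) or a normal sed-style regular expression pattern (s/...). You can tweak sed -n and each /p flag (in table.txt) as you need.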
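As a small illustration of how `sed -n` and the `p` flag interact, here is a sketch using a made-up two-line input file (`demo-input.txt` is an assumption, not one of the answer's files). With `-n` and `p`, only the lines where a substitution happened are printed; without them, every line is printed.

```shell
# Hypothetical two-line input in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'abc abc' 'some non-matching text' > "$tmp/demo-input.txt"

# -n suppresses automatic printing; the p flag prints lines that were changed.
only_changed=$(sed -n 's/abc/ABC/gp' "$tmp/demo-input.txt")

# Without -n and p, sed prints every line, changed or not.
all_lines=$(sed 's/abc/ABC/g' "$tmp/demo-input.txt")

printf '%s\n' "--- with -n and p:" "$only_changed"
printf '%s\n' "--- without:" "$all_lines"
```

Which variant you want depends on whether the output should be a report of the changed lines or a full copy of the file.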

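You need 3 files for a minimal run (the 4th file is derived dynamically from table.txt):

  1. The main script      table-to-regex.sed
  2. The table file       table.txt
  3. The target file      file-to-change.txt
  4. The derived script   table-derrived.sed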

To run one table against one target file:

sed -nf table-to-regex.sed  table.txt > table-derrived.sed
# Here, check `table-derrived.sed` for errors as described in the example *table.txt*.  

sed -nf table-derrived.sed  file-to-change.txt
# Redirect *sed's* output via `>` or `>>` as need be, or use `sed -i -nf` 

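If you want to run table.txt against many files, just put the above snippet into a simple loop to suit your requirements. I can do it in bash, but someone who knows the Windows CMD shell better than I do would be better suited to setting that up.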
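A loop of that kind might look like the following bash sketch. It is only an illustration under stated assumptions: the derived script here is a one-line stand-in rather than real output of table-to-regex.sed, the target files `a.txt` and `b.txt` are invented, and `-n` and the `p` flag are deliberately left out so that unchanged lines are kept when each file is rewritten.

```shell
# Hypothetical setup: a stand-in derived script and two target files.
tmp=$(mktemp -d)
printf '%s\n' 's/abc/ABC/g' > "$tmp/table-derrived.sed"
printf '%s\n' 'abc one' 'keep me' > "$tmp/a.txt"
printf '%s\n' 'two abc' > "$tmp/b.txt"

# Apply the derived script to every .txt file, replacing each file
# only if sed succeeded (write to a temporary copy, then rename).
for f in "$tmp"/*.txt; do
    sed -f "$tmp/table-derrived.sed" "$f" > "$f.new" && mv "$f.new" "$f"
done

cat "$tmp/a.txt" "$tmp/b.txt"
```

Writing to a temporary copy and renaming avoids relying on `sed -i`, whose syntax differs between sed implementations.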


Here is the script: table-to-regex.sed

s/[[:space:]]*$//  # remove trailing whitespace

/^$\|^[[:space:]]*#/{p; b}  # empty and sed-style comment lines: print and branch
                            # printing keeps line numbers; for referencing errors

/^\([Fs]\)\(.\)\(.*\2\)\{4\}/{  # too many delims ERROR
      s/^/# error + # /p        # print a flagged/commented error
      b }                       # branch

/^\([Fs]\)\(.\)\(.*\2\)\{3\}/{                  # this may be a long-form 2nd delimiter
   /^\([Fs]\)\(.\)\(.*\2[[:space:]]*\2.*\2\)/{  # is long-form 2nd delimiter OK?
      s/^\([Fs]\)\(.\)\(.*\)\2[[:space:]]*\2\(.*\)\2\(.*\)/\1\2\n\3\n\4\n\5/
      t OK                                      # branch on true to :OK
   }; s/^/# error L # /p                        # print a flagged/commented error
      b }                                       # branch: long-form 2nd delimiter ERROR

/^\([Fs]\)\(.\)\(.*\2\)\{2\}/{     # this may be short-form delimiters
   /^\([Fs]\)\(.\)\(.*\2.*\2\)/{   # is short-form delimiters OK?
      s/^\([Fs]\)\(.\)\(.*\)\2\(.*\)\2\(.*\)/\1\2\n\3\n\4\n\5/
      t OK                         # branch on true to :OK  
   }; s/^/# error S # /p           # print a flagged/commented error
      b }                          # branch: short-form delimiters ERROR

{ s/^/# error - # /p        # print a flagged/commented error
  b }                       # branch: too few delimiters ERROR

:OK     # delimiters are okay
#============================
h   # copy the pattern-space to the hold space

# NOTE: /^s/ lines are considered to contain regex patterns, not FIXED strings.
/^s/{    s/^s\(.\)\n/s\1/   # shrink long-form delimiter to short-form
     :s; s/^s\(.\)\([^\n]*\)\n/s\1\2\1/; t s  # branch on true to :s 
      p; b }                                  # print and branch

# The following code handles FIXED-string /^F/ lines

s/^F.\n\([^\n]*\)\n.*/\1/  # isolate the literal find-string in the pattern-space
s/[]\/$*.^|[]/\\&/g        # convert the literal find-string into a regex of itself
H                          # append \n + find-regex to the hold-space

g   # Copy the modified hold-space back into the pattern-space

s/^F.\n[^\n]*\n\([^\n]*\)\n.*/\1/  # isolate the literal repl-string in the pattern-space
s/[\/&]/\\&/g                      # convert the literal repl-string into a regex of itself
H                                  # append \n + repl-regex to the hold-space

g   # Copy the modified hold-space back into the pattern-space

# Rearrange pattern-space into a / delimited command: s/find/repl/...      
s/^\(F.\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)$/s\/\5\/\6\/\4/

p   # Print the modified find-and-replace regular expression line
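The core idea behind the 'F' (FIXED-string) handling can be tried in isolation. The following is a simplified standalone sketch, not the script's exact lines: the sample strings are invented, a comma is used as the `s` delimiter (an assumption, chosen so that no delimiter character has to appear inside the bracket expressions), and the backslash is also escaped on the replacement side.

```shell
# Literal find and replacement strings (both hypothetical).
find='$1.50 (a*b)'
repl='a/b & c'

# Escape characters that are special in a basic regex, plus / and \,
# so that the find-string becomes a regex that matches only itself.
find_re=$(printf '%s\n' "$find" | sed 's,[]\\/$*.^[],\\&,g')

# Escape the characters that are special in the replacement: \ / &
repl_re=$(printf '%s\n' "$repl" | sed 's,[\\/&],\\&,g')

# Assemble an ordinary /-delimited substitution and apply it.
cmd="s/$find_re/$repl_re/g"
printf '%s\n' "$cmd"
out=$(printf '%s\n' 'price: $1.50 (a*b) today' | sed "$cmd")
printf '%s\n' "$out"
```

The first line printed is the generated substitution command, which is the kind of line the derived script is meant to contain for an 'F' entry.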

Here is an example table file, which includes a description of how it works: table.txt

# The script expects an input table file, which can contain 
#   comment, blank, and substitution lines. The text you are
#   now reading is part of an input table file.

# Comment lines begin with optional whitespace followed by #

# Each substitution line must start with: 's' or 'F'
#  's' lines are treated as normal `sed` substitution regular expressions
#  'F' lines are considered to contain `FIXED` (literal) string expressions 
# The 's' or 'F' must be followed by the 1st of 3 delimiters   
#   which must not appear elsewhere on the same line.
# A pre-test is performed to ensure conformity. Lines with 
#   too many or too few delimiters, or no 's' or 'F', are flagged   
#   with the text '# error ? #', which effectively comments them out.
#   '?' can be: '-' too few, '+' too many, 'L' long-form, 'S' short-form
#   Here is an example of a long-form error, as it appears in the output. 

# error L # s/example/(7+3)/2=5/

# 1st delimiter, eg '/' must be a single character.
# 2nd (middle) delimiter has two possible forms:
#   Either it is exactly the same as the 1st delimiter: '/' (short-form)
#   or it has a double-form for column alignment: '/      /' (long-form)
#   The long-form can have any amount of whitespace between the 2 '/'s
# 3rd delimiter must be the same as the 1st delimiter.

# After the 3rd delimiter, you can put any of sed's 
#    substitution commands, eg. 'g'

# With one condition, a trailing '#' comment to 's' and 'F' lines is
#    valid. The condition is that no delimiter character can be in the 
#    comment (delimiters must not appear elsewhere on the same line)

# For 's' type lines, it is implied that *you* have included all the 
#    necessary sed-escape characters!  The script does not add any 
#    sed-escape characters for 's' type lines. It will, however, 
#    convert a long-form middle-delimiter into a short-form delimiter.   

# For 'F' type lines, it is implied that both strings (find and replace) 
#    are FIXED/literal-strings. The script does add the  necessary 
#    sed-escape characters for 'F' type lines. It will also 
#    convert a long-form middle-delimiter into a short-form delimiter.   

# The result is a sed-script which contains one sed-substitution 
#    statement per line; it is just a modified version of your 
#    's' and 'F' strings "table" file.

# Note that the 1st delimiter is *always* in column 2.

# Here are some sample 's' and 'F' lines, with comments:
#

F/abc/ABC/gp               #-> These 3 are the same for 's' and 'F', 
s/abc/ABC/gp               #-> as no characters need to be escaped,  
s/abc/         /ABC/gp     #-> and the 2nd delimiter shrinks to one  

F/^F=Fixed/    /\1okay/p   # \1 is okay here, It is a FIXED literal
s|^s=sed regex||\1FAIL|p   # \1 will FAIL: back-reference not defined!

F|\\\\|////|               # this line == next line 
F|\\\\|        |////|p     # this line == previous line  
s|\\\\|        |////|p     # this line is different; 's' vs 'F'

F_Hello! ^.&`//\\*$/['{'$";"`_    _Ciao!_   # literal find / replace    

Here is an example input file whose text you want to change: file-to-change.txt

abc abc
^F=Fixed
   s=sed regex
\\\\ \\\\ \\\\ \\\\
Hello! ^.&`//\\*$/['{'$";"`
some non-matching text
