How do I delete everything between two markers in a file?

Answer 1

With perl:

perl -0777 -pe 's/\Q\{{[}\E.*?\Q{]}\}\E//gs'

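Note that the whole input is loaded into memory before it is processed.

\Qsomething\E makes something be treated as a literal string rather than as a regular expression.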

To modify a regular file in place, add the -i option:

perl -0777 -i -pe 's/\Q\{{[}\E.*?\Q{]}\}\E//gs' file.txt

With GNU awk or mawk:

awk -v 'RS=\\\\\\{\\{\\[}|\\{\\]}\\\\}' -v ORS= NR%2

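There, we define the record separator as either of the start or end markers (only gawk and mawk support RS being a regexp here). But we need to escape the characters that are regexp operators (backslash, {, [), and escape the backslash once more because it is special in arguments to -v (where it is used for things like \n, \b...), hence the many backslashes.

Then all we need to do is print every other record. NR%2 is 1 (true) for every odd-numbered record.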
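The every-other-record idea can be tried in isolation with a simplified sketch. As an assumption made purely for portability, a single # stands in for both markers here, so this runs on any POSIX awk and avoids the backslash escaping discussed above; every_other is a hypothetical helper name.

```shell
# every_other is a hypothetical helper name. A single '#' stands in for both
# markers (an assumption made for portability), so records alternate between
# text to keep (odd NR) and text to drop (even NR).
every_other() {
    awk -v RS='#' -v ORS= 'NR%2'
}

printf 'keep1#drop1#keep2#drop2#keep3' | every_other
```

Records 1, 3 and 5 are printed back to back because ORS is empty; records 2 and 4, which sat between a pair of separators, are dropped.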

For both solutions, we assume that the markers are matched and that the sections are not nested.

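To modify the file in place with recent versions of GNU awk, add the -i /usr/share/awk/inplace.awk option [1].

[1] Do not use -i inplace, as gawk first tries to load the inplace extension (as inplace or inplace.awk) from the current working directory, where someone could have planted malware. The path of the inplace extension supplied with gawk may vary from system to system; see the output of gawk 'BEGIN{print ENVIRON["AWKPATH"]}'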

Answer 2

sed   -e:t -e'y/\n/ /;/\\{{\[}/!b'               \
      -e:N -e'/\\{{\[.*{\]}\\}/!N'               \
           -e's/\(\\{{\[}\).*\n/\1/;tN'          \
           -e'y/ /\n/;s/\\{{\[}/& /;ts'          \
      -e:s -e's/\(\[} [^ ]*\)\({\]}\\}\)/\1 \2/' \
      -ets -e's/..... [^ ]* .....//;s/ //g;bt'   \
<<""
#Bla Bla {]}\} bla bla \{{[} more bla bla
#even more bla bla bla bla. \{{[} 
#
#A lot of stuff might be here.
#hashes are for stupid syntax color only
#Bla bla {]}\} finally {]}\} done.
#
#Nonetheless, the \{{[} show {]}\} goes \{{[} show {]}\} on.

#Bla Bla {]}\} bla bla  finally {]}\} done.
#
#Nonetheless, the  goes  on.

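Here is a much better way, though. It does far fewer substitutions, and the ones it does cover a couple of characters at a time rather than .* every time. In fact, the only time .* is used is to clear pattern space of the intervening text once the first occurring START is definitely paired with the first following END. The rest of the time sed just Deletes as much as it needs to in order to reach the next occurring delimiter. Don taught me that.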

sed -etD -e:t -e'/\\{{\[}/!b'  \
    -e's//\n /;h;D'       -e:D \
    -e'/^}/{H;x;s/\n.*\n.//;}' \
    -ett    -e's/{\]}\\}/\n}/' \
    -e'/\n/!{$!N;s//& /;}' -eD \
<<""
#Bla Bla {]}\} bla bla \{{[} more bla bla
#even more bla bla bla bla. \{{[} 
#
#A lot of stuff might be here.
#hashes are for stupid syntax color only
#Bla bla {]}\} finally {]}\} done.
#
#Nonetheless, the \{{[} show {]}\} goes \{{[} show {]}\} on.

#Bla Bla {]}\} bla bla  finally {]}\} done.
#
#Nonetheless, the  goes  on.

The \n newline escapes on the RHS may need to be replaced with literal backslash-escaped newlines, though.
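To illustrate what a literal backslash-escaped newline looks like in a replacement, here is a minimal sketch unrelated to the markers themselves; split_on_comma and the sample input are hypothetical.

```shell
# split_on_comma is a hypothetical helper. The replacement contains a
# backslash followed by a real newline, which POSIX sed accepts, instead of \n.
split_on_comma() {
    sed 's/,/\
/'
}

printf 'a,b\n' | split_on_comma
```

The same form is what the generic script uses for its S and E markers, which is why it does not depend on \n being understood in the replacement.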

Here is a more generic version:

#!/usr/bin/sed -f
####replace everything between START and END
   #branch to :Kil if a successful substitution
   #has already occurred. this can only happen
   #if pattern space has been Deleted earlier
    t Kil
   #set a Ret :label so we can come back here
   #when we've cleared a START -> END occurrence
   #and check for another if need be
    :Ret
   #if no START, don't
    /START/!b
   #sigh. there is one. get to work. replace it
   #with a newline followed by an S and save
   #a copy then Delete up to our S marker.
    s||\
S|
    h;D
   #set the :Kil label. we'll come back here from now
   #on until we've definitely got END at the head of
   #pattern space.
    :Kil
   #do we? 
    /^E/{
       #if so, we'll append it to our earlier save
       #and slice out everything between the two newlines
       #we've managed to insert at just the right points        
        H;x
        s|\nS.*\nE||
    }
   #if we did just clear START -> END we should
   #branch back to :Ret and look for another START
    t Ret
   #pattern space didnt start w/ END, but is there even
   #one at all? if so replace it w/ a newline followed
   #by an E so we'll recognize it at the next :Kil
    s|END|\
E|
   #if that last was successful we'll have a newline
   #but if not it means we need to get the next line
   #if the last line we've got unmatched pairs and are
   #currently in a delete cycle anyway, but maybe we
   #should print up to our START marker in that case?
    /\n/!{
       #i guess so. now that i'm thinking about it
       #we'll swap into hold space, and Print it
        ${  x;P;d
        }
       #get next input line and add S after the delimiting
       #newline because we're still in START state. Delete
       #will handle everything up to our marker before we
       #branch back to :Kil at the top of the script
        N
        s||&S|
    }
   #now Delete will slice everything from head of pattern space
   #to the first occurring newline and loop back to top of script.
   #because we've definitely made successful substitutions if we
   #have a newline at all we'll test true and branch to :Kil 
   #to go again until we've definitely got ^E
    D

...and without the comments...

#!/usr/bin/sed -f
    t Kil
    :Ret
    /START/!b
    s||\
S|
    h;D
    :Kil
    /^E/{
        H;x
        s|\nS.*\nE||
    }
    t Ret
    s|END|\
E|
    /\n/!{
        ${  x;P;d
        }
        N
        s||&S|
    }
    D

I copied the commented version to my clipboard and ran:

{ xsel; echo; } >se.sed
chmod +x se.sed
./se.sed <se.sed

#!/usr/bin/sed -f
####replace everything between
   #branch to :Kil if a successful substitution
   #has already occurred. this can only happen
   #if pattern space has been Deleted earlier
    t Kil
   #set a Ret :label so we can come back here
   #when we've cleared a  occurrence
   #and check for another if need be
    :Ret
   #if no  at the head of
   #pattern space.
    :Kil
   #do we?
    /^E/{
       #if so, we'll append it to our earlier save
       #and slice out everything between the two newlines
       #we've managed to insert at just the right points
        H;x
        s|\nS.*\nE||
    }
   #if we did just clear  we should
   #branch back to :Ret and look for another , but is there even
   #one at all? if so replace it w/ a newline followed
   #by an E so we'll recognize it at the next :Kil
    s|END|\
E|
   #if that last was successful we'll have a newline
   #but if not it means we need to get the next line
   #if the last line we've got unmatched pairs and are
   #currently in a delete cycle anyway, but maybe we
   #should print up to our

Answer 3

If your file is test.txt, you can use:

sed ':a;N;$!ba;s/\n/ /g' test.txt|sed 's/\\{{\[}.*{\]}\\}//'

The first sed replaces every newline with a space, and the second removes the text between the markers.
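One thing worth checking before relying on this is how it behaves with more than one marked section. The sketch below wraps the two sed commands above in a function that reads standard input instead of test.txt; join_and_strip is a hypothetical name, and the ;-separated label syntax assumes GNU sed.

```shell
# join_and_strip is a hypothetical helper name wrapping the two sed commands above.
join_and_strip() {
    sed ':a;N;$!ba;s/\n/ /g' | sed 's/\\{{\[}.*{\]}\\}//'
}

# Two marked sections: the greedy .* spans from the first start marker to the
# last end marker, so the text between the sections ("b") is lost as well.
printf '%s\n' 'a \{{[} x' 'y {]}\} b \{{[} z {]}\} c' | join_and_strip
```

With a single marked section the result matches what the other answers produce, apart from the newlines having become spaces.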

I don't know whether you need a more general solution.
