Sed (or other) script to replace characters inside a capture group

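I am trying to convert Pandoc markdown to Confluence wiki markup, and I am using markdown2confluence to do most of the work. This works well except when I am writing about CSS and FreeMarker, which use { and } in code, while Confluence uses {{ and }} to mark the start and end of a code block. So I need to match a pattern enclosed in {{...}}.

If I knew more Ruby I would probably fix it there, but I am an old-school Unix person, so I thought of awk or sed.

So far I have got: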

   sed 's/{{\([^}}]*\)}}/{{"\1"}}/g' tmp.wkd

which takes this input:

First we need a way to select a state (or group of states) CSS uses what
is called a selector to choose which elements to apply to, we have been
using one up until now without noticing, it is the {{*}} at the beginning
of our CSS. This is a special selector that means select everything. So
the rule that follows it (the bit between {{{}} and {{}}} apply to every
polygon on the map. But CSS allows us to insert a filter instead by
using {{[...]}} instead of {{*}}.

and produces:

First we need a way to select a state (or group of states) CSS uses what
is called a selector to choose which elements to apply to, we have been
using one up until now without noticing, it is the {{"*"}} at the beginning
of our CSS. This is a special selector that means select everything. So
the rule that follows it (the bit between {{"{"}} and {{""}}} apply to every
polygon on the map. But CSS allows us to insert a filter instead by
using {{"[...]"}} instead of {{"*"}}.

But what I need is:

First we need a way to select a state (or group of states) CSS uses what
is called a selector to choose which elements to apply to, we have been
using one up until now without noticing, it is the {{*}} at the beginning
of our CSS. This is a special selector that means select everything. So
the rule that follows it (the bit between {{\{}} and {{\}}} apply to every
polygon on the map. But CSS allows us to insert a filter instead by
using {{[...]}} instead of {{*}}.

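I also need to handle {{${type.name}}}, which should become {{$\{type.name\}}}.

There are two problems:

  1. I need to replace { with \{ rather than wrap it in quotes, so I need to modify \1 in some way.
  2. No matter how I try to end the pattern match, the nasty-looking {{}}} (which should come out as {{\}}}) does not come out right.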
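
The behaviour described above can be reproduced on single tokens. This is a minimal sketch, assuming a POSIX shell and sed; the function name `quote_attempt` is hypothetical and only wraps the command shown earlier so that it reads standard input instead of tmp.wkd.

```shell
# Hypothetical wrapper around the attempt shown earlier (reads stdin).
quote_attempt() {
    sed 's/{{\([^}}]*\)}}/{{"\1"}}/g'
}

# Feed the three awkward tokens, one per line.
printf '%s\n' '{{*}}' '{{{}}' '{{}}}' | quote_attempt
# {{"*"}}
# {{"{"}}
# {{""}}}
```

Repeating a character inside a bracket expression adds nothing, so [^}}] behaves exactly like [^}]. For {{}}} the group therefore matches the empty string, the first two closing braces end the match, and the third is left over, which is where {{""}}} comes from.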

Answer 1

The following sed command seems to work:

   sed 's/{{\([^*[a-z][^}]*\)}}/{{\\\1}}/g;s/{{\\${\([^}]*\)}}}/{{$\\{\1\\}}}/g'

Explanation:

  1. {{\([^*[a-z][^}]*\)}} finds {{stuff}}, unless stuff starts with *, [ or a lowercase letter.
  2. It is replaced with {{\stuff}}.
  3. Then {{\\${\([^}]*\)}}} finds {{\${junk}}}.
  4. It is replaced with {{$\{junk\}}}.
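
The steps above can be checked against the tokens from the question. This is a sketch rather than a complete test suite; the function name `escape_by_first_char` is hypothetical and simply wraps the command, reading standard input.

```shell
# Hypothetical wrapper around the first command in this answer.
escape_by_first_char() {
    sed 's/{{\([^*[a-z][^}]*\)}}/{{\\\1}}/g;s/{{\\${\([^}]*\)}}}/{{$\\{\1\\}}}/g'
}

# One token per line: two that must stay as-is, three that need escaping.
printf '%s\n' '{{*}}' '{{[...]}}' '{{{}}' '{{}}}' '{{${type.name}}}' | escape_by_first_char
# {{*}}
# {{[...]}}
# {{\{}}
# {{\}}}
# {{$\{type.name\}}}
```

Because the first expression decides by the first character only, a span that starts with any other character also receives a backslash: {{42}}, for example, becomes {{\42}}.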

Edit: after the OP's clarification, an alternative solution could be:

   sed 's/\({{[^}]*\){\([^}]*}}\)/\1\\{\2/g;s/\({{[^}]*\)}}}/\1\\}}}/g'

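As we all know, sed cannot do recursive parsing, but this should work for most simple cases.

Explanation:

  1. \({{[^}]*\){\([^}]*}}\) finds {{foo{bar}}, where foo and bar do not contain }.
  2. It is replaced with {{foo\{bar}}. (Note that {{xxx{yyy}}} is handled correctly.)
  3. Then \({{[^}]*\)}}} finds {{baz}}}, where baz does not contain }.
  4. It is replaced with {{baz\}}}.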

foo, bar and baz can be empty, so for example {{}}} is converted to {{\}}} as required.
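
A quick way to confirm this on the tokens from the question is sketched below. The function name `escape_inner_braces` is hypothetical; it wraps the second command unchanged and reads standard input.

```shell
# Hypothetical wrapper around the second command in this answer.
escape_inner_braces() {
    sed 's/\({{[^}]*\){\([^}]*}}\)/\1\\{\2/g;s/\({{[^}]*\)}}}/\1\\}}}/g'
}

# One token per line, including the empty foo/bar/baz cases.
printf '%s\n' '{{*}}' '{{{}}' '{{}}}' '{{${type.name}}}' '{{xxx{yyy}}}' | escape_inner_braces
# {{*}}
# {{\{}}
# {{\}}}
# {{$\{type.name\}}}
# {{xxx\{yyy\}}}
```

One of the non-simple cases is a span with more than one opening brace inside it. The first expression escapes only the last one, so {{a{b{c}} comes out as {{a{b\{c}}.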
