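How can I non-greedily remove code blocks that begin with /** START */ and end with /** END */ using sed, given that the blocks span multiple lines and blank lines may occur inside a START-END block?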
START marker is a single-line comment
Input:
class MyClass {
keepField;
/** START */
deleteField;
/** END */
construct() {
/** START */
this.deleteField = 'delete';
/** END */
this.keepField = 'keep';
/** START */
this.deleteFunc();
/** END */
}
/** START */
deleteFunc() {
this.keepField = 'delete';
if (true) {
console.debug('Line before if statement is empty.');
}
} /** END */
}
Output:
class MyClass {
keepField;
construct() {
this.keepField = 'keep';
}
}
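I have tried the following, as described in the "Multiline techniques" section of the sed manual:
sed '/./{H;$!d} ; x ; s/START.*END//' MyClass.js
However, this command is greedy across blocks when several START-END blocks follow each other with no blank line in between (as in constructor), and it does not cope with blank code lines inside a START-END block (as in the function deleteFunc).
Any idea how to solve this using sed or any other command-line tool (e.g. awk)?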
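The greediness is easy to reproduce on a single line. The following is a minimal sketch (the sample text and the function name greedy_demo are made up for illustration) showing that `.*` in sed always takes the longest match, so the text between two separate blocks is lost as well:

```shell
# '.*' is greedy: the match runs from the first START to the LAST END,
# so "keep2", which sits between the two blocks, is deleted too.
greedy_demo() {
    printf '%s\n' 'keep1 START drop1 END keep2 START drop2 END keep3' |
        sed 's/START.*END//'
}
greedy_demo
```

The desired result would keep keep2; sed's POSIX regular expressions have no non-greedy quantifier, which is why a different approach is needed.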
START marker is a block comment
Input:
class MyClass {
/**
* same code as above only this time the START block is
* multiline like below.
*/
/**
* START
*/
deleteFunc() {
this.keepField = 'delete';
if (true) {
console.debug('Line before if statement is empty.');
}
} /** END */
}
The output should again be:
class MyClass {
keepField;
construct() {
this.keepField = 'keep';
}
}
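Answer 1
sed is an excellent tool for simple s/old/new/ operations. For anything else, just use awk for clarity, efficiency, robustness, portability, maintainability, etc. For example, using any POSIX awk: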
$ cat tst.awk
{ rec = rec $0 ORS }
END {
while ( match(rec,/\/\*\*[[:space:]*]*END[[:space:]*]*\*\//) ) {
toEnd = substr(rec,1,RSTART+RLENGTH-1)
sub(/(\n[[:blank:]]*)?\/\*\*[[:space:]*]*START[[:space:]*]*\*\/.*/,"",toEnd)
printf "%s", toEnd
rec = substr(rec,RSTART+RLENGTH)
}
printf "%s", rec
}
$ awk -f tst.awk file
class MyClass {
keepField;
construct() {
this.keepField = 'keep';
}
}
class MyClass {
/**
* same code as above only this time the START block is
* multiline like below.
*/
}
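If you don't have a POSIX awk, change every [:space:] to " \t\n" and every [:blank:] to " \t" (the first character of each string is a literal blank character), and then it will work in any awk.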
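As a sketch of that substitution, the script could look as follows when wrapped in a shell function. The function name strip_blocks is an assumption made for illustration; the awk code is the script above with only the character classes replaced:

```shell
# Same logic as tst.awk, with [:space:] written as " \t\n" and
# [:blank:] written as " \t" so that no POSIX character class is needed.
strip_blocks() {
    awk '
    { rec = rec $0 ORS }        # slurp the whole input into one string
    END {
        # Find the next END marker, then cut from the START marker before it.
        while ( match(rec, /\/\*\*[ \t\n*]*END[ \t\n*]*\*\//) ) {
            toEnd = substr(rec, 1, RSTART+RLENGTH-1)
            sub(/(\n[ \t]*)?\/\*\*[ \t\n*]*START[ \t\n*]*\*\/.*/, "", toEnd)
            printf "%s", toEnd
            rec = substr(rec, RSTART+RLENGTH)
        }
        printf "%s", rec        # whatever follows the last END marker
    }' "$@"
}
```

Call it as `strip_blocks file`, or pipe the input into it.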
The above was run on this input file:
$ cat file
class MyClass {
keepField;
/** START */
deleteField;
/** END */
construct() {
/** START */
this.deleteField = 'delete';
/** END */
this.keepField = 'keep';
/** START */
this.deleteFunc();
/** END */
}
/** START */
deleteFunc() {
this.keepField = 'delete';
if (true) {
console.debug('Line before if statement is empty.');
}
} /** END */
}
class MyClass {
/**
* same code as above only this time the START block is
* multiline like below.
*/
/**
* START
*/
deleteFunc() {
this.keepField = 'delete';
if (true) {
console.debug('Line before if statement is empty.');
}
} /** END */
}
But also consider this pathological case, where the whole input is on one line:
$ cat file
class MyClass { keepField; /** START */ deleteField; /** END */ construct() { /** START */ this.deleteField = 'delete'; /** END */ this.keepField = 'keep'; /** START */ this.deleteFunc(); /** END */ } /** START */ deleteFunc() { this.keepField = 'delete'; if (true) { console.debug('Line before if statement is empty.'); } } /** END */ }
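and note that the above script handles it correctly (as I imagine it will handle many other unstated cases too, unless your start/end strings can appear inside literal strings or inside other comments - you can't handle such cases with pattern matching as we're doing here):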
$ awk -f tst.awk file
class MyClass { keepField; construct() { this.keepField = 'keep'; } }
Answer 2
Using GNU sed:
$ sed -Ez 's~ +/?\*+ START( \*)?([^*]*\*+)([^\n]*\n[^*]*\*+)? END[^\n]*\n~~g' input_file
class MyClass {
keepField;
construct() {
this.keepField = 'keep';
}
}
class MyClass {
/**
* same code as above only this time the START block is
* multiline like below.
*/
/**
}