Non-greedy deletion of multi-line blocks with sed

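Given that the code blocks are multi-line and empty lines can occur within a START-END block, how can I non-greedily delete code blocks that start with /** START */ and end with /** END */ using sed?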

START marker is a single-line comment

Solution

Input:

class MyClass {
    keepField;
    /** START */
    deleteField;
    /** END */

    construct() {
        /** START */
        this.deleteField = 'delete';
        /** END */
        this.keepField = 'keep';
        /** START */
        this.deleteFunc();
        /** END */
    }
    
    /** START */
    deleteFunc() {
        this.keepField = 'delete';

        if (true) {
            console.debug('Line before if statement is empty.');
        }
    } /** END */
}

Output:

class MyClass {
    keepField;

    construct() {
        this.keepField = 'keep';
    }
    
}

I have tried the following, taken from the sed manual > Multiline techniques section:

sed '/./{H;$!d} ; x ; s/START.*END//' MyClass.js

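However, the above command is greedy when several START-END blocks follow each other without a blank line in between (as in the constructor), and it does not handle blank code lines inside a START-END block (as in the function deleteFunc).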
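The greediness can be reproduced in isolation. The following is a minimal sketch with made-up input (the function name greedy_demo and the sample text are only for illustration). POSIX regular expressions, as used by sed, have no non-greedy quantifier, so .* always runs to the last END in the pattern space.

```shell
# Two START-END pairs on one line: .* runs from the first START to the
# last END, so the text between the two blocks ("keep") is deleted too.
greedy_demo() {
    printf '%s\n' 'a START x END keep START y END b' | sed 's/START.*END//'
}

greedy_demo   # prints "a  b"; a non-greedy match would leave "a  keep  b"
```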

Any idea how to solve the above using sed or any other command-line tool (e.g. awk)?

START marker is a block comment

Solution

Input:

class MyClass {
    /**
     * same code as above only this time the START block is 
     * multiline like below.
     */

    /**
     * START
     */
    deleteFunc() {
        this.keepField = 'delete';

        if (true) {
            console.debug('Line before if statement is empty.');
        }
    } /** END */
}

The output should again be:

class MyClass {
    keepField;

    construct() {
        this.keepField = 'keep';
    }

}

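Answer 1

sed is an excellent tool for simple s/old/new/ operations. For anything else, just use awk for clarity, efficiency, robustness, portability, maintainability, etc. For example, using any POSIX awk: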

$ cat tst.awk
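# Slurp the whole input into one string so that matches can span lines.
{ rec = rec $0 ORS }
END {
    # Find the leftmost END comment; cutting there keeps the deletion non-greedy.
    while ( match(rec,/\/\*\*[[:space:]*]*END[[:space:]*]*\*\//) ) {
        toEnd = substr(rec,1,RSTART+RLENGTH-1)
        # Drop everything from the first START comment (plus the newline and
        # indentation before it) up to and including that END comment.
        sub(/(\n[[:blank:]]*)?\/\*\*[[:space:]*]*START[[:space:]*]*\*\/.*/,"",toEnd)
        printf "%s", toEnd
        rec = substr(rec,RSTART+RLENGTH)   # continue after this END comment
    }
    printf "%s", rec                       # text after the last END comment
}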

$ awk -f tst.awk file
class MyClass {
    keepField;

    construct() {
        this.keepField = 'keep';
    }

}

class MyClass {
    /**
     * same code as above only this time the START block is
     * multiline like below.
     */

}

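If you don't have a POSIX awk, change every [:space:] to " \t\n" and every [:blank:] to " \t" (the first character of each string is a literal blank character), and then it will work in any awk.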
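As a sketch of that substitution, the script with the bracket classes spelled out might look like the following. The shell function name strip_blocks_portable and the sample input are hypothetical. The awk logic is otherwise the same as tst.awk.

```shell
# Same logic as tst.awk, with [:space:] written as " \t\n" and
# [:blank:] written as " \t" inside the bracket expressions.
strip_blocks_portable() {
    awk '
    { rec = rec $0 ORS }
    END {
        while ( match(rec, /\/\*\*[ \t\n*]*END[ \t\n*]*\*\//) ) {
            toEnd = substr(rec, 1, RSTART+RLENGTH-1)
            sub(/(\n[ \t]*)?\/\*\*[ \t\n*]*START[ \t\n*]*\*\/.*/, "", toEnd)
            printf "%s", toEnd
            rec = substr(rec, RSTART+RLENGTH)
        }
        printf "%s", rec
    }'
}

# Usage: reads stdin, writes the filtered text to stdout.
printf 'a;\n/** START */\nb;\n/** END */\nc;\n' | strip_blocks_portable
# prints "a;" and "c;" on separate lines
```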

The above was run on this input file:

$ cat file
class MyClass {
    keepField;
    /** START */
    deleteField;
    /** END */

    construct() {
        /** START */
        this.deleteField = 'delete';
        /** END */
        this.keepField = 'keep';
        /** START */
        this.deleteFunc();
        /** END */
    }

    /** START */
    deleteFunc() {
        this.keepField = 'delete';

        if (true) {
            console.debug('Line before if statement is empty.');
        }
    } /** END */
}

class MyClass {
    /**
     * same code as above only this time the START block is
     * multiline like below.
     */

    /**
     * START
     */
    deleteFunc() {
        this.keepField = 'delete';

        if (true) {
            console.debug('Line before if statement is empty.');
        }
    } /** END */
}

But also consider this pathological case, where the whole input is on one line:

$ cat file
class MyClass { keepField; /** START */ deleteField; /** END */ construct() { /** START */ this.deleteField = 'delete'; /** END */ this.keepField = 'keep'; /** START */ this.deleteFunc(); /** END */ } /** START */ deleteFunc() { this.keepField = 'delete'; if (true) { console.debug('Line before if statement is empty.'); } } /** END */ }

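and note that the script above handles it correctly (as I'd imagine it will many other unstated cases, unless your start/end strings can appear inside literal strings or inside comments themselves; you can't handle such cases with pattern matching, which is what we're doing here):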

$ awk -f tst.awk file
class MyClass { keepField;  construct() {  this.keepField = 'keep';  }  }
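To illustrate the caveat about literal strings, here is a small sketch with an invented input line in which the START marker sits inside a quoted string. The function name strip_blocks and the sample line are hypothetical. The script's logic is inlined with the bracket classes spelled out so that the sketch does not depend on POSIX character-class support.

```shell
# The awk logic from tst.awk, inlined so this sketch is self-contained.
strip_blocks() {
    awk '
    { rec = rec $0 ORS }
    END {
        while ( match(rec, /\/\*\*[ \t\n*]*END[ \t\n*]*\*\//) ) {
            toEnd = substr(rec, 1, RSTART+RLENGTH-1)
            sub(/(\n[ \t]*)?\/\*\*[ \t\n*]*START[ \t\n*]*\*\/.*/, "", toEnd)
            printf "%s", toEnd
            rec = substr(rec, RSTART+RLENGTH)
        }
        printf "%s", rec
    }'
}

# The START marker here is part of a string literal, not a real marker,
# but pattern matching cannot tell the difference.
printf '%s\n' "log('/** START */'); keep(); /** END */ done();" | strip_blocks
# prints: log(' done();
```

The string literal is cut in half and keep() is lost. Telling such a marker apart from a real one would need a tool that actually parses the language.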

Answer 2

Using GNU sed

$ sed -Ez 's~ +/?\*+ START( \*)?([^*]*\*+)([^\n]*\n[^*]*\*+)? END[^\n]*\n~~g' input_file
class MyClass {
    keepField;

    construct() {
        this.keepField = 'keep';
    }
    
}
class MyClass {
    /**
     * same code as above only this time the START block is 
     * multiline like below.
     */

    /**
}
