Non-greedy deletion of multi-line blocks using sed

Answer 1

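sed is an excellent tool for simple s/old/new/ substitutions. For anything else, just use awk for clarity, efficiency, robustness, portability, maintainability, and so on. For example, using any POSIX awk: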

$ cat tst.awk
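# Slurp the whole input into one string so a block can span lines.
{ rec = rec $0 ORS }
END {
    # Find the first END marker that is still left in rec.
    while ( match(rec,/\/\*\*[[:space:]*]*END[[:space:]*]*\*\//) ) {
        # Everything up to and including that END marker.
        toEnd = substr(rec,1,RSTART+RLENGTH-1)
        # Drop from the START marker (and any newline + indent before it) through END.
        sub(/(\n[[:blank:]]*)?\/\*\*[[:space:]*]*START[[:space:]*]*\*\/.*/,"",toEnd)
        printf "%s", toEnd
        # Carry on with the text after the END marker.
        rec = substr(rec,RSTART+RLENGTH)
    }
    # Whatever follows the last END marker is kept unchanged.
    printf "%s", rec
}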

$ awk -f tst.awk file
class MyClass {
    keepField;

    construct() {
        this.keepField = 'keep';
    }

}

class MyClass {
    /**
     * same code as above only this time the START block is
     * multiline like below.
     */

}

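If you don't have a POSIX awk, change every [:space:] to " \t\n" and every [:blank:] to " \t" (the first character of each of those strings is a literal space character), and then it will work in any awk.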
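As a sketch of that substitution, the same script with the character classes spelled out might look like the following. The file names tst_portable.awk and sample.txt are made up for this illustration, and the tiny sample input is mine, not the input used elsewhere in this answer.

```shell
# Hypothetical file name: same logic as tst.awk, but with [:space:] and
# [:blank:] replaced by explicit " \t\n" and " \t" inside the brackets.
cat > tst_portable.awk <<'EOF'
{ rec = rec $0 ORS }
END {
    while ( match(rec,/\/\*\*[ \t\n*]*END[ \t\n*]*\*\//) ) {
        toEnd = substr(rec,1,RSTART+RLENGTH-1)
        sub(/(\n[ \t]*)?\/\*\*[ \t\n*]*START[ \t\n*]*\*\/.*/,"",toEnd)
        printf "%s", toEnd
        rec = substr(rec,RSTART+RLENGTH)
    }
    printf "%s", rec
}
EOF

# Made-up sample: one START/END block between two lines to keep.
printf 'keep1;\n/** START */\ndrop;\n/** END */\nkeep2;\n' > sample.txt

# Prints "keep1;" and "keep2;" on separate lines.
awk -f tst_portable.awk sample.txt
```

The only change from the POSIX version is inside the bracket expressions, so the behaviour should be the same wherever both forms are supported.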

The tst.awk script above was run on this input file:

$ cat file
class MyClass {
    keepField;
    /** START */
    deleteField;
    /** END */

    construct() {
        /** START */
        this.deleteField = 'delete';
        /** END */
        this.keepField = 'keep';
        /** START */
        this.deleteFunc();
        /** END */
    }

    /** START */
    deleteFunc() {
        this.keepField = 'delete';

        if (true) {
            console.debug('Line before if statement is empty.');
        }
    } /** END */
}

class MyClass {
    /**
     * same code as above only this time the START block is
     * multiline like below.
     */

    /**
     * START
     */
    deleteFunc() {
        this.keepField = 'delete';

        if (true) {
            console.debug('Line before if statement is empty.');
        }
    } /** END */
}

But also consider this pathological case, where the whole input is on one line:

$ cat file
class MyClass { keepField; /** START */ deleteField; /** END */ construct() { /** START */ this.deleteField = 'delete'; /** END */ this.keepField = 'keep'; /** START */ this.deleteFunc(); /** END */ } /** START */ deleteFunc() { this.keepField = 'delete'; if (true) { console.debug('Line before if statement is empty.'); } } /** END */ }

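Note that the script above handles it correctly, as I'd imagine it also handles many other unstated cases. The exception is when your start/end strings can appear inside string literals or inside other comments; such cases can't be handled with pattern matching of the kind we're doing here: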

$ awk -f tst.awk file
class MyClass { keepField;  construct() {  this.keepField = 'keep';  }  }
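To make that limitation concrete, here is a small sketch with a START marker sitting inside a string literal. It uses a reduced version of the loop in which the markers are matched with fixed spacing, a simplification I'm assuming for brevity; the function name strip_blocks and the sample line are likewise made up.

```shell
# Hypothetical helper: a reduced form of the loop in tst.awk that only
# recognises the exact spellings "/** START */" and "/** END */".
strip_blocks() {
    awk '
    { rec = rec $0 ORS }
    END {
        while ( match(rec, /\/\*\* END \*\//) ) {
            toEnd = substr(rec, 1, RSTART+RLENGTH-1)
            sub(/\/\*\* START \*\/.*/, "", toEnd)
            printf "%s", toEnd
            rec = substr(rec, RSTART+RLENGTH)
        }
        printf "%s", rec
    }'
}

# The START marker here is part of a string literal, not a real marker,
# yet everything from it through the END marker is removed.
printf 'msg = "/** START */"; keep(); /** END */ tail;\n' | strip_blocks
# prints: msg = " tail;
```

The string literal is cut in half and keep() is lost, because the regexps have no idea that the first marker was inside quotes. Telling the two apart would take a parser for the language, not pattern matching.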

Answer 2

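Using GNU sed (-E enables extended regular expressions; -z makes sed treat the input as NUL-separated, so a file without NUL bytes is processed as a single string). Note that for the multi-line START comment this expression does not match the opening /** line, so that line is left behind in the output: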

$ sed -Ez 's~ +/?\*+ START( \*)?([^*]*\*+)([^\n]*\n[^*]*\*+)? END[^\n]*\n~~g' input_file

class MyClass {
    keepField;

    construct() {
        this.keepField = 'keep';
    }
    
}

class MyClass {
    /**
     * same code as above only this time the START block is 
     * multiline like below.
     */

    /**
}
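To see why -z matters for this approach, here is a minimal sketch on made-up input. It assumes GNU sed, since -z is a GNU extension and other sed implementations may not accept it.

```shell
# Without -z, sed works one line at a time, so a pattern containing \n
# never matches and the input comes back unchanged (a, b, c).
printf 'a\nb\nc\n' | sed -E 's/a\nb/X/'

# With -z (GNU sed), the whole input is a single record, so the newline
# between "a" and "b" can be matched: this prints X, then c.
printf 'a\nb\nc\n' | sed -Ez 's/a\nb/X/'
```

The trade-off is that the whole file is held in memory as one string, which is the same thing the awk script does when it collects the input into rec.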

