Removing zero-width parts from PS1 in pure bash


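I am building a function that prints a right-aligned string (i.e. at the right-hand side of the terminal). For this I need two things:

  1. The string I will print, including all the zero-width parts that do the decorating (colours etc.).
  2. The string I will print, excluding all zero-width parts (or, more precisely, its length).

I need [1] for obvious reasons, and I need [2] to know how far to the right I should offset the text: terminal width - [2]

For this I chose the same format as PS1/PS2: zero-width parts are placed between \[ and \]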
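To make the arithmetic concrete, here is a minimal sketch. The helper name right_offset and the fixed width of 80 columns are assumptions for illustration only; the real function reads the width from tput cols. Note that ${#var} counts characters, which matches the number of columns only as long as the visible text contains no double-width characters.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the original functions): print how many
# columns the cursor must move right so the visible text ends at the last column.
#   $1: terminal width in columns
#   $2: the text WITHOUT any zero-width parts (item [2] above)
right_offset() {
    local -i terminal_width="${1}"
    local visible_text="${2}"
    printf '%d\n' "$(( terminal_width - ${#visible_text} ))"
}

# 'my coloured thing' is 17 characters long, so on an 80-column terminal
# the offset is 80 - 17.
right_offset 80 'my coloured thing'
```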

I have written a function that uses non-greedy regular expressions in perl, and it works as expected:

# This needs zero-width parts surrounded by \[ and \] just like PS1/PS2
function _print_right_adjusted_perl() {
        local escaped_line printed_line nonzero_line forward

        #line with all zero-width parts escaped by \[ and \]
        escaped_line="${1}"

        # [1]: Only the \[ and \] markers removed; this will be the thing that is actually printed.
        printed_line="$(perl -pe 's|\\\[(.*?)\\\]|\1|g' <<<"${escaped_line}")"

        # [2]: all zero-width parts removed, including the markers \[ and \].
        nonzero_line="$(perl -pe 's|\\\[.*?\\\]||g' <<<"${escaped_line}")"


        # "carriage return" (literally) returns cursor to the first column of this row
        printf "$(tput cr)"

        # tput cuf N: move cursor forward N times
        forward="$(( "$(tput cols)" - "${#nonzero_line}" ))"
        printf "$(tput cuf "${forward}")"

        # print the actual text
        printf "${printed_line}"
}

_print_right_adjusted_perl "\[$(tput setaf 7)\]my coloured thing"

Then I started writing a pure-bash version, but it still has problems.

The function below has been superseded by my answer.

# This needs zero-width parts surrounded by \[ and \] just like PS1/PS2
function _print_right_adjusted_old() {

        local printed_string="" has_length='true' first='true' added_part=''
        local -i printed_length=0 forward

        # split input string at backslashes. eg. turn:
        #    "normal \[zero-width\]colour string"
        # into:
        #    ( "normal" "[zero-width" "]colour string" )
        # keep the changed IFS local so it does not leak into the calling shell
        local IFS='\'
        for part in ${1}; do
                # check what the first character is
                case "${part:0:1}" in
                        '[')
                             # start of a zero-width section
                             has_length='false';
                             # remove the marker
                             part="${part#'['}"
                             ;;
                        ']')
                             # end of a zero-width section
                             has_length='true';
                             # remove the marker
                             part="${part#']'}"
                             ;;
                        # not '\[' or '\]', re-add '\' except for the first segment
                        *) [[ "${first}" != 'true' ]] && part="\\${part}" ;;
                esac
                first='false'

                printed_string+="${part}"

                if [[ "${has_length}" == 'true' ]]; then
                        printed_length+="${#part}"
                fi
        done

        # cr: "carriage return" (literally) returns cursor to the first column of current row
        # cuf N: cursor-forward: move cursor N spaces forward (to the right)
        forward="$(( "$(tput cols)" - "${printed_length}" ))"
        printf "$(tput cr)$(tput cuf "${forward}")"

        # print the actual string
        printf "${printed_string}"
}

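I have since learned that ${variable@P} expands a variable the same way PS1 is expanded, so the last line can be replaced with printf "${1@P}", and the line printed_string+="${part}" can be removed.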
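A minimal sketch of what the @P transformation does to such a string, assuming bash 4.4 or later. The helper name expand_like_ps1 is made up for this example. Stripping the \001 and \002 bytes is a precaution of mine: as far as I know an interactive shell turns \[ and \] into those readline markers instead of dropping them, while a script drops them outright.

```bash
#!/usr/bin/env bash
# Hypothetical helper: expand a PS1-style string the way the prompt would be
# expanded and print the result without a trailing newline.
expand_like_ps1() {
    local escaped_string="${1-}" expanded

    # @P decodes prompt escapes such as \e and handles the \[ \] markers
    expanded="${escaped_string@P}"

    # assumption: remove readline's ignore markers in case the shell emitted them
    expanded="${expanded//$'\001'/}"
    expanded="${expanded//$'\002'/}"

    printf '%s' "${expanded}"
}

# single quotes keep the backslashes literal until @P decodes them
expand_like_ps1 '\[\e[1m\]bold\[\e[0m\] plain'
printf '\n'
```

Keep in mind that @P also performs the usual prompt expansions (command substitution among them), so it should only be applied to strings you control.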

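Answer 1

I have since rewritten this script twice.

The first version works fine but is slow, because it loops once for every 1 or 2 characters: somewhere between length/2 and length iterations.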

function _print_right_adjusted_2step() {
        local escaped_string="${1:?'need PS1-style string as 1st argument'}"

        local has_width='y'
        local -i length=0 terminal_width="$(tput cols)"

        # loop over string character by character and check 2 characters starting at ${i}
        for (( i=0; i<"${#escaped_string}"; i++ )); do case "${escaped_string:${i}:2}" in

                # when first character is a backslash, check for \[ or \] combination
                '\[') ((++i)); has_width='' ; ;;
                '\]') ((++i)); has_width='y'; ;;

                # when 2nd character is a backslash we go forward 1 character only so the backslash is first on the next loop iteration
                ?'\')          [[ -n "${has_width}" ]] && length+=1; ;;

                # when 2nd character is not a backslash we can skip ahead
                ?*)   ((++i)); [[ -n "${has_width}" ]] && length+=2; ;;
        esac; done

        local offset="$(( "${terminal_width}" - "${length}" ))"
        printf "$(tput cr)$(tput cuf "${offset}")${escaped_string@P}"
}

The second version is much faster because it only loops a few times: (count of \) + 1
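The iteration count can be checked in isolation with a small sketch. The helper name count_segments is hypothetical; it only counts the fields that word splitting produces when IFS is a backslash. The function body is a subshell, so neither the IFS change nor set -f leaks out. One caveat I am aware of: a backslash at the very end of the string does not produce an extra empty field.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the number of segments a string is split into
# when it is word-split on backslashes.
count_segments() (
    IFS='\'
    set -f          # no pathname expansion on the unquoted expansion below
    set -- ${1-}    # split the string into positional parameters
    printf '%d\n' "$#"
)

# two backslashes -> three segments: "normal ", "[zero-width", "]colour string"
count_segments 'normal \[zero-width\]colour string'
```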

function _print_right_adjusted_new() {
        local escaped_string="${1:?'need PS1-style string as 1st argument'}"

        local prefix='' has_width='y' terminal_width="$(tput cols)" nonzero_string=''
        local -i length=0

        local IFS='\'
        for segment in ${escaped_string}; do
                #  with the 1st segment, prefix is ''  (empty string)
                # after the 1st segment, prefix is '\' (a backslash)
                case "${prefix}${segment:0:1}" in
                        # when first character is a backslash, check for \[ or \] combination
                        '\[') has_width='' ;              ;;
                        # remove the soon-to-be-added ']' from the length
                        '\]') has_width='y'; ((length--)) ;;
                        # add the missing '\' to the length, but only outside a zero-width section
                        '\'?) [[ -n "${has_width}" ]] && ((length++)) ;;
                esac

                if [[ -n "${has_width}" ]]; then
                    # use the line below if you need the actual content
                    # nonzero_string+="${segment}"
                    length+="${#segment}"
                fi
                prefix='\'
        done

        local offset="$(( "${terminal_width}" - "${length}" ))"
        printf "$(tput cr)$(tput cuf "${offset}")${escaped_string@P}"
}
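To check the length calculation without a terminal, the same loop can be wrapped in a helper that only prints the computed length. This is a sketch under a few assumptions: the name visible_length is made up, the set -f guard is my addition (an unquoted expansion is also subject to pathname expansion, so a visible * could otherwise be replaced by file names), and a backslash inside a zero-width section is deliberately not counted.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the visible length of a PS1-style string,
# i.e. its length without the \[ ... \] sections and their markers.
visible_length() {
    local escaped_string="${1-}" prefix='' has_width='y' segment had_noglob=''
    local -i length=0
    local IFS='\'

    # remember whether globbing was already disabled, then disable it
    [[ $- == *f* ]] && had_noglob='y'
    set -f

    for segment in ${escaped_string}; do
        case "${prefix}${segment:0:1}" in
            # start of a zero-width section
            '\[') has_width='' ;;
            # end of a zero-width section: the ']' counted below is not visible
            '\]') has_width='y'; length=$(( length - 1 )) ;;
            # any other backslash was eaten by the split: count it if visible
            '\'?) [[ -n "${has_width}" ]] && length+=1 ;;
        esac

        [[ -n "${has_width}" ]] && length+="${#segment}"
        prefix='\'
    done

    [[ -z "${had_noglob}" ]] && set +f
    printf '%d\n' "${length}"
}

# "normal" + "underlined" + " end" are visible; the escape sequences are not
visible_length $'normal\\[\e[4m\\]underlined\\[\e[0m\\] end'
```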

Then I tested each of them 10,000 times, without the printing overhead, which is the same for every version:

#!/bin/bash
set -euo pipefail

function _print_right_adjusted_2step() {
    local print_start_time="${EPOCHREALTIME/./}"

    local has_width='y' escaped_string="${1:?'need PS1-style string as 1st argument'}"

    local -i length=0 terminal_width=0 offset=0

    # loop over string character by character and include the next character in the check ( 2 characters starting at ${i} )
    for (( i=0; i<"${#escaped_string}"; i++ )); do case "${escaped_string:${i}:2}" in

        # when first character is a backslash, check for \[ or \] combination
        '\[') ((++i)); has_width='' ; ;;
        '\]') ((++i)); has_width='y'; ;;

        # when 2nd character is a backslash we go forward 1 character only so the backslash is first on the next loop iteration
        ?'\')          [[ -n "${has_width}" ]] && length+=1; ;;

        # when 2nd character is not a backslash we can skip ahead
        ?*)   ((++i)); [[ -n "${has_width}" ]] && length+=2; ;;
    esac; done

    #offset="$(( "${terminal_width}" - "${length}" ))"
    #printf "$(tput cr)$(tput cuf "${offset}")${escaped_string@P}"

    local print_end_time="${EPOCHREALTIME/./}"
    printf >&3 "$(( (print_end_time - print_start_time) ))"
    return 0
}

function _print_right_adjusted_new() {
    local print_start_time="${EPOCHREALTIME/./}"

    local escaped_string="${1:?'need PS1-style string as 1st argument'}"
    local prefix='' has_width='y'
    local -i length=0

    local IFS='\'
    for segment in ${escaped_string}; do

        # prefix is '' for the 1st segment and '\' for every later segment
        case "${prefix}${segment:0:1}" in
            # when first character is a backslash, check for \[ or \] combination
            '\[') has_width='' ;              ;;
            # remove the soon-to-be-added ']' from the length
            '\]') has_width='y'; ((length--)) ;;
            # add the missing '\' to the length
            '\'?)                ((length++)) ;;
        esac

        [[ -n "${has_width}" ]] && length+="${#segment}"

        prefix='\'
    done

    #local offset="$(( "${terminal_width}" - "${length}" ))"
    #printf "$(tput cr)$(tput cuf "${offset}")${escaped_string@P}"

    local print_end_time="${EPOCHREALTIME/./}"
    printf >&3 "$(( (print_end_time - print_start_time) ))"
    return 0
}

function _print_right_adjusted_perl() {
    local print_start_time="${EPOCHREALTIME/./}"

    # line with all zero-width parts escaped by \[ and \]
    local escaped_string="${1:?'need PS1-style string as 1st argument'}"
    local nonzero_string offset terminal_width=0

    # line with all zero-width parts removed, including the markers \[ and \]
    nonzero_string="$(perl -pe 's|\\\[.*?\\\]||g' <<<"${escaped_string}")"

    #terminal_width="$(tput cols)"
    #offset="$(( "${terminal_width}" - "${#nonzero_string}" ))"

    # tput cr: "carriage return", literally.
    # tput cuf N: move cursor forward N times
    #printf "$(tput cr)$(tput cuf "${offset}" )${escaped_string@P}"

    local print_end_time="${EPOCHREALTIME/./}"
    printf >&3 "$(( (print_end_time - print_start_time) ))"
    return 0
}


# https://stackoverflow.com/a/56151840
function sort_with_header() {
    sed -u '1q'; sort "${@}"
}


input="normal\[$(tput smul)\]underlined\[$(tput rmul)$(tput bold)$(tput setaf 5)\]color stuff\[$(tput sgr0)\]normal"

printf "starting test\n"
{
    printf >&3 "name\tms\n" 
    for f in $(seq 1 10000); do
        #printf '.'
        printf >&3 "2step\t"
        _print_right_adjusted_2step "${input}"
        printf >&3 "\nperl\t"
        _print_right_adjusted_perl "${input}"
        printf >&3 "\nnew\t"
        _print_right_adjusted_new "${input}"
        printf >&3 "\n"
    done
}  3> timing.txt
printf "\ntesting is done!\n"

<timing.txt datamash -H --sort --group name median 2 mode 2 mean 2 pstdev 2 | sort_with_header -h -k2 | column -t | tee stats.txt

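which outputs the following. Note that the columns are labelled ms, but ${EPOCHREALTIME/./} yields microseconds, so the values are really microseconds: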

GroupBy(name)  median(ms)  mode(ms)  mean(ms)   pstdev(ms)
new            58          57        60.9599    8.6375165406499
2step          237         236       245.2142   31.085467961091
perl           1003        997       1011.1479  41.703832265033
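The timings rely on EPOCHREALTIME, which (in bash 5.0 and later) holds the seconds since the epoch with six decimal places, so deleting the decimal separator leaves a count of microseconds. A minimal sketch of the same measurement as a reusable wrapper follows. The helper name elapsed_us is hypothetical, and the [.,] pattern is my addition for locales that use a comma as the decimal separator, which I believe can happen.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command and print how long it took in microseconds.
elapsed_us() {
    # every expansion of EPOCHREALTIME yields the current time
    local start="${EPOCHREALTIME/[.,]/}"

    "$@" >/dev/null

    local end="${EPOCHREALTIME/[.,]/}"
    printf '%d\n' "$(( end - start ))"
}

# time a short sleep; the result should be a little over 10000 microseconds
elapsed_us sleep 0.01
```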

I call the new version a success :-)
