Bash: trying to make a "sed-replace" function that is robust to arbitrary character input (expansion, substitution, etc.)

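I am trying to make a "sed-replace" function that takes arbitrary input as arguments, but it is not working well. Let me first illustrate the problem by showing the (simplified) input file: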

$ cat /tmp/makefileTest
#(TEST CASE 1) bla bla line 1, relatively simple:
CFLAGS += -Wunused # this is a comment

#(TEST CASE 2) bla bla line 4, uses some expansion
cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere

#(TEST CASE 3) bla bla line 7, here is a complicated line ending in weird characters:
cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^

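I want to apply some custom changes to this input file every time I run "git pull". I have a pull script that checks out a clean copy, and a second script that reapplies the necessary modifications on top of the latest version. The method below works for test cases 1 and 2 shown above, but it involves a lot of manual work, which is why I call it the "tedious method". I take the input line, write the modified version, and the sed call should then perform the replacement: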

$ cat /tmp/testFunctionTedious.sh 
#!/usr/bin/env bash

# The old, tedious method, first defining input-file with test-cases:
inFile='/tmp/makefileTest'    

#    ----==== TEST-CASE 1 FROM THE INPUT FILE ====----
charsFoundFromGrep=$(grep -in 'CFLAGS += -Wunused # this is a comment' "$inFile" | wc -c)
if [ "$charsFoundFromGrep" = "0" ]; then
    echo "Custom makefile modification (CFLAGS += -Wunused # this is a comment) NOT found, doing nothing!"
elif [ "$charsFoundFromGrep" = "41" ]; then
    echo "Custom makefile modification (CFLAGS += -Wunused # this is a comment) found and will be applied..."
    sed -i 's/CFLAGS += -Wunused # this is a comment/CFLAGS += -Wall # here I changed something/g' "$inFile"
else
    echo "ERROR: Unhandled custom makefile modification (CFLAGS += -Wunused # this is a comment), please fix..."
    exit 1
fi

#    ----==== TEST-CASE 2 FROM THE INPUT FILE ====----
# Notice below that I need to escape $(OBJ_DIR) and $(EXE_NAME), not to
#  mention the two forward slashes in the "sed"-line, it's definitely not just "plug-and-play":
charsFoundFromGrep=$(grep -in 'cp $(OBJ_DIR)/$(EXE_NAME)' "$inFile" | wc -c)
if [ "$charsFoundFromGrep" = "0" ]; then
    echo "Custom makefile modification (cp \$(OBJ_DIR)/\$(EXE_NAME)) NOT found, doing nothing!"
elif [ "$charsFoundFromGrep" = "43" ]; then
    echo "Custom makefile modification (cp \$(OBJ_DIR)/\$(EXE_NAME)) found and will be applied..."
    sed -i 's/cp \$(OBJ_DIR)\/\$(EXE_NAME)/cp \$(OBJ_DIR)\/\$(EXE_NAME_NEW)/g' "$inFile"
else
    echo "ERROR: Unhandled custom makefile modification (cp \$(OBJ_DIR)/\$(EXE_NAME)), please fix..."
    exit 1
fi

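I am trying to learn a better/smarter way of doing this, and to learn about bash variable expansion/substitution and special-character handling. To be more efficient I tried to create the following script, but things got too complicated: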

$ cat /tmp/testFunction.sh 
#!/usr/bin/env bash

# The method I struggle with and ask for help with, first defining input-file with test-cases
inFile='/tmp/makefileTest'

# *** Defining a sedReplace-function below ***
#   First arg: Search (input) string
#   Second arg: Replacement (output) string
#   Third arg: Expected number of characters from 'grep -in "$1" "$inFile" | wc -c'.
#      This is just to ensure the line I'm going to run sed on didn't change; otherwise
#      output an error involving the input string (hence the string comparison that
#      relates argument 3 to the grep result for argument 1, the input string).
sedReplace(){
    # sed -i 's/$1/$2/g' "$inFile"
    charsFoundFromGrep=$(grep -in "$1" "$inFile" | wc -c)
    if [ "$3" == "$charsFoundFromGrep" ]; then
        # Getting the line below right is REALLY difficult for me!
        execLine="sed -i 's/$1/$2/g' \"$inFile\""
        # Printing the line, so I can see it before executing the line:
        echo "$execLine"
        # Executing the line if ok (disabled as it doesn't work at the moment):
        #$($execLine)
    else
        echo "ERROR: Unhandled custom makefile modification (expected: $1)), please fix..."
        exit 1
    fi
}

# And below the function is used (1st arg is input, 2nd arg is sed-
#   output and 3rd arg is grep comparison word count):

#    ----==== TEST-CASE 1 FROM THE INPUT FILE ====----
sedReplace 'CFLAGS += -Wunused # this is a comment' 'CFLAGS += -Wall # here I changed something' 41

#    ----==== TEST-CASE 2 FROM THE INPUT FILE ====----
#sedReplace 'cp $(OBJ_DIR)/$(EXE_NAME)' 'cp $(OBJ_DIR)/$(EXE_NAME_NEW)' 43

#    ----==== TEST-CASE 3 FROM THE INPUT FILE ====----
# Once the above 2 cases work, here's the last test-case to try the sedReplace function on (the hardest, I imagine):
# And here grep doesn't work, due to the special characters
#sedReplace 'cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^' 'cd $(SOME_UTIL_BIN); ./someOTHERcommand $(BUILD_DIRECTORY_SOMETHING_ELSE)/$(OBJ_DIR)/\$\^'

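You will easily see that the last script does not work. I have searched Google for similar problems but could not find any. I don't know how to finish my sed function, which is why I am asking for help. Anyone qualified and interested should be able to run the scripts and the input file exactly as shown here, and I look forward to seeing whether someone can solve this.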

Answer 1

Here is a modified version of the script that works only for the first test case:

#!/usr/bin/env bash

inFile='/tmp/makefileTest'

sedReplace(){
    charsFoundFromGrep="$(grep -in "$1" "$inFile" | wc -c)"

    if [ "$3" == "$charsFoundFromGrep" ]; then
        # 1. The single quotes inside double quotes are treated as regular characters
        # 2. During the assignment, the variables $1, $2 and $inFile will be expanded
        # 3. The variable $execLine will have the following value:
        #    sed -i 's/CFLAGS += -Wunused # this is a comment/CFLAGS += -Wall # here I changed something/g' '/tmp/makefileTest'
        execLine="sed -i 's/$1/$2/g' '$inFile'"

        # We need 'eval' to convert the variable to a command in this case,
        # because the value of the variable contains spaces, quotes, slashes, etc.
        eval "$execLine"
    else
        echo "ERROR: Unhandled custom makefile modification (expected: $1)), please fix..."
        exit 1
    fi
}

sedReplace 'CFLAGS += -Wunused # this is a comment' 'CFLAGS += -Wall # here I changed something' '41'

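The example above uses the eval command; its usage, pros and cons were recently discussed in the last section of this answer and the related comments. It is best to avoid eval where possible, so my next suggestion is: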

#!/usr/bin/env bash

sedReplace(){
    # 1. Note we do not need to store the output of the command substitution $()
    #    into a variable in order to use it within a test condition.
    # 2. Bash's double square brackets test [[ is used here, so
    #    we do not need to quote the variable before the operator.
    #    If the expression after the operator were not quoted, its (expanded) value
    #    would be treated as a glob pattern. Quoted, it is compared as a plain string.
    if [[ $3 == "$(grep -in "$1" "$inFile" | wc -c)" ]]
    then
        # 1. Note the double quotes here.
        # 2. The sed's /g flag is removed, because, IMO, we don't need it in this case at all.
        sed -i "s/$1/$2/" "$inFile"

    else
        echo "ERROR: Unhandled custom makefile modification (expected: $1)), please fix..."
        exit 1
    fi
}

# Double quotes are used here in case the value comes from a variable in later versions
inFile="/tmp/makefileTest"

sedReplace 'CFLAGS += -Wunused # this is a comment' 'CFLAGS += -Wall # here I changed something' '41'
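
As a side note on why the `$($execLine)` attempt in the question cannot work and how to avoid eval in general, here is a minimal sketch. The names `count_args` and `run_sed_replace` are hypothetical helpers introduced only for this illustration. When a command is stored in a string, quotes inside it remain literal characters after expansion, so arguments are split on whitespace. A bash array keeps each argument intact.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# A command stored in a string: the embedded quotes are NOT re-parsed by the shell.
cmd_string="count_args 'a b' c"
$cmd_string            # prints 3, because 'a and b' become separate words

# A command stored in an array: every element stays exactly one argument.
cmd_array=(count_args 'a b' c)
"${cmd_array[@]}"      # prints 2

# The same idea applied to sed. run_sed_replace is a hypothetical wrapper that
# writes the result to stdout (no -i), taking search, replacement and file.
run_sed_replace() {
    local -a cmd=(sed "s/$1/$2/" "$3")
    "${cmd[@]}"
}
```

This only removes the need for eval. The search and replacement strings are still interpreted by sed, so it is subject to the same special-character limits as the script above.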

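The version without eval still works only for the first test case. For the remaining test cases we need grep -F, which matches fixed strings (reference). In addition, before passing the search string/pattern to sed, we need to replace some of its characters (there may be a more elegant solution, but I could not find one). The third thing we need to do is change sed's delimiter from / to some character that is not used in the strings; in the example below, : is used.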

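In addition, I pass the name of the input file as a positional parameter, and assign the positional parameters to local variables for readability.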

Here is the final solution (uncomment -i to make the actual changes):

#!/usr/bin/env bash
sedReplace() {
    local the_searched_string="$1" the_replacement="$2" the_lenght="$3" the_file="$4"

    if [[ $the_lenght == "$(grep -inF "$the_searched_string" "$the_file" | wc -c)" ]]
    then
        the_searched_string="$(sed -r 's/(\^|\$|\\)/./g' <<< "$the_searched_string")" # replace some special characters within the searched string by any char '.'
        sed "s:$the_searched_string:$the_replacement:" "$the_file" #-i
    else
        echo "ERROR: Unhandled custom makefile modification (expected: ${the_searched_string})..."
        exit 1
    fi
}

inFile="/tmp/makefileTest" 

# Test all cases:
echo -e '\n\n# --- Test case 1 -----'
the_string='CFLAGS += -Wunused # this is a comment'
sedReplace "$the_string" \
           'CFLAGS += -Wall # something is changed' \
           "$(wc -c <<< '2:'"$the_string")" \
           "$inFile"
echo -e '\n\n# --- Test case 2 -----'
the_string='cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '5:'"$the_string")" \
            "$inFile"
echo -e '\n\n# --- Test case 3 -----'
the_string='cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile"

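Replacing ^, $ and \ with '.' makes the pattern slightly looser than the original string, and characters such as '.', '*' and '[' are still interpreted by sed. One possible alternative is to escape the special characters instead. The following is only a sketch under some assumptions: the strings are single-line, sed uses basic regular expressions (no -r/-E), and / is the delimiter of the final command. The function names `escape_sed_pattern`, `escape_sed_replacement` and `literal_replace` are hypothetical and not part of the answer above.

```bash
#!/usr/bin/env bash
# Escape every character that is special in a basic regular expression,
# plus the '/' delimiter, by prefixing it with a backslash.
# The helper sed command uses '|' as its own delimiter so that '/' can
# appear inside the bracket expression.
escape_sed_pattern() {
    printf '%s\n' "$1" | sed 's|[][\\/.*^$]|\\&|g'
}

# In the replacement part only backslash, '/' (the delimiter) and '&' are special.
escape_sed_replacement() {
    printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

# Replace the first literal occurrence of $1 with $2 on each line of file $3.
# The result goes to stdout; add -i to sed only after checking the output.
literal_replace() {
    local search replacement
    search="$(escape_sed_pattern "$1")"
    replacement="$(escape_sed_replacement "$2")"
    sed "s/$search/$replacement/" "$3"
}

# Demo on a temporary copy of the third test-case line.
demo_file="$(mktemp)"
printf '%s\n' 'cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^' > "$demo_file"
literal_replace './somecommand' './someOTHERcommand' "$demo_file"
rm -f "$demo_file"
```

With this approach the delimiter no longer has to be chosen per string, because any / inside the strings is escaped.
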
Depending on your needs, you could use sha256sum (or another checksum tool) instead of wc -c for a stricter check of the line:

#!/usr/bin/env bash
sedReplace() {
    local the_searched_string="$1" the_replacement="$2" the_lenght="$3" the_file="$4"

    if [[ $the_lenght == "$(grep -inF "$the_searched_string" "$the_file" | sha256sum)" ]]
    then
        the_searched_string="$(sed -r 's/(\^|\$|\\)/./g' <<< "$the_searched_string")" # replace some special characters within the searched string by any char '.'
        sed "s:$the_searched_string:$the_replacement:" "$the_file" #-i
    else
        echo "ERROR: Unhandled custom makefile modification (expected: ${the_searched_string})..."
        exit 1
    fi
}

inFile="/tmp/makefileTest"

# Test all cases:
echo -e '\n\n# --- Test case 1 -----'
the_string='CFLAGS += -Wunused # this is a comment'; the_line='2'
sedReplace "$the_string" \
           'CFLAGS += -Wall # something is changed' \
           "$(sha256sum <<< "${the_line}:${the_string}")" \
           "$inFile"
echo -e '\n\n# --- Test case 2 -----'
the_string='cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'; the_line='5'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(sha256sum <<< "${the_line}:${the_string}")" \
            "$inFile"
echo -e '\n\n# --- Test case 3 -----'
the_string='cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^'; the_line='8'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(sha256sum <<< "${the_line}:${the_string}")" \
           "$inFile"

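Both the character count and the checksum encode the expected line number together with the line content. If only the position and uniqueness of the match matter, a simpler check is conceivable. The sketch below is an assumption-laden variant, not the method of the answer: `line_matches_at` is a hypothetical helper that succeeds only when the fixed string occurs on exactly one line and that line has the expected number. Unlike the scripts above it does not use -i (case-insensitive matching), and it does not verify the rest of the line.

```bash
#!/usr/bin/env bash
# line_matches_at SEARCH EXPECTED_LINE FILE
# Returns 0 only if SEARCH (a fixed string) is found on exactly one line
# of FILE and that line number equals EXPECTED_LINE.
line_matches_at() {
    local search="$1" expected_line="$2" file="$3" found
    # -n prefixes each match with "N:", -F disables regex interpretation,
    # and -- protects search strings that start with a dash.
    found="$(grep -nF -- "$search" "$file" | cut -d: -f1)"
    # Several matches yield several line numbers, which never equal one number.
    [[ $found == "$expected_line" ]]
}

# Demo on a temporary file.
demo_file="$(mktemp)"
printf '%s\n' '# a comment' 'CFLAGS += -Wunused # this is a comment' > "$demo_file"
if line_matches_at 'CFLAGS += -Wunused' 2 "$demo_file"; then
    echo 'match on expected line'
fi
rm -f "$demo_file"
```
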
Update: because the searched strings are quite complex, here is an example of how to compute the delimiter dynamically (note the 4th test case):

#!/usr/bin/env bash
sedReplace() {
    local the_searched_string="$1" the_replacement="$2" the_lenght="$3" the_file="$4" d="$5"

    if [[ $the_lenght == "$(grep -inF "$the_searched_string" "$the_file" | wc -c)" ]]
    then
        the_searched_string="$(sed -r 's/(\^|\$|\\)/./g' <<< "$the_searched_string")" # replace some special characters within the searched string by any char '.'
        the_expression="s${d}${the_searched_string}${d}${the_replacement}${d}"
        #echo "$the_expression"
        sed "$the_expression" "$the_file" #-i
    else
        echo "ERROR: Unhandled custom makefile modification (expected: ${the_searched_string})..."
        exit 1
    fi
}

get_delimiter() {
    unset delimiter

    for d in '/' ':' '#' '_' '|' '@'
    do
        if ! grep -qoF "$d" <<< "$the_string"
        then
            delimiter="$d"
            break
        fi
    done

    if [[ -z $delimiter ]]
    then
        echo 'There is no appropriate delimiter for the string:'
        echo "$the_string"
        exit 1
    fi
}

inFile="/tmp/makefileTest"

# Test all cases:
echo -e '\n\n# --- Test case 1 -----'
the_string='CFLAGS += -Wunused # this is a comment'
get_delimiter
sedReplace "$the_string" \
           'CFLAGS += -Wall # something is changed' \
           "$(wc -c <<< '2:'"$the_string")" \
           "$inFile" "$delimiter"

echo -e '\n\n# --- Test case 2 -----'
the_string='cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'
get_delimiter
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '5:'"$the_string")" \
            "$inFile" "$delimiter"

echo -e '\n\n# --- Test case 3 -----'
the_string='cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^'
get_delimiter
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile" "$delimiter"

echo -e '\n\n# --- Test case 4 -----'
the_string='/:#_|@'
get_delimiter
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile" "$delimiter"

Here is another version of the above:

#!/usr/bin/env bash
sedReplace() {
    local d the_searched_string="$1" the_replacement="$2" the_lenght="$3" the_file="$4"
    # the content of this function could be placed here, thus we will have only one function
    get_delimiter "$the_searched_string"

    if [[ $the_lenght == "$(grep -inF "$the_searched_string" "$the_file" | wc -c)" ]]
    then
        the_searched_string="$(sed -r 's/(\^|\$|\\)/./g' <<< "$the_searched_string")"
        the_expression="s${d}${the_searched_string}${d}${the_replacement}${d}"
        sed "$the_expression" "$the_file" #-i
    else
        echo "ERROR: Unhandled custom makefile modification (expected: ${the_searched_string})..."
        exit 1
    fi
}

get_delimiter() {
    # define an array of possible delimiters, it could be defined outside the function
    delimiters=('/' ':' '#' '_' '|' '@' '%')

    for delimiter in "${delimiters[@]}"
    do
        if ! grep -qoF "$delimiter" <<< "$1"
        then
            d="$delimiter"
            break
        fi
    done

    if [[ -z $d ]]
    then
        echo "ERROR: There is no appropriate delimiter for the string: ${1}"
        exit 1
    fi
}

inFile="/tmp/makefileTest"

# Test all cases:
echo -e '\n\n# --- Test case 1 -----'
the_string='CFLAGS += -Wunused # this is a comment'
sedReplace "$the_string" \
           'CFLAGS += -Wall # something is changed' \
           "$(wc -c <<< '2:'"$the_string")" \
           "$inFile"

echo -e '\n\n# --- Test case 2 -----'
the_string='cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '5:'"$the_string")" \
            "$inFile"

echo -e '\n\n# --- Test case 3 -----'
the_string='cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile"

echo -e '\n\n# --- Test case 4 -----'
the_string='/:#_|@%'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile"

Answer 2

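This does not necessarily answer the question directly, but you may want to look at other tools that attempt a simplified sed; one example is sd.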
