Bash: split a string at a delimiter, limited by a character count

I am trying to split a long, space-delimited text with bash, without success. The commands below split by character count, but not at the delimiter.

echo "The quick fox jumped over the lazy dog" | fold -w 10
echo "The quick fox jumped over the lazy dog" | sed -e 's/.\{9\}/&\n/g'

It would be great if this could be used for some interactive bash output shown to the user.

Input syntax:

format_text 10 "The quick fox jumped over the lazy dog"

Output:

The quick 
fox jumped 
over the 
lazy dog

You will have noticed that, without the spacing rule, the third line would cut off the letter "l" of "lazy".

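Update: the results so far are good, but the working slicer has a problem I cannot solve myself: it does not break the line before a word runs past the limit.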

#!/bin/bash

printHeader () {
    declare -i line_length=$3
    
    # Upper and lower fences: repeat the fence character line_length times.
    # The parenthesized print works under both Python 2 and Python 3.
    local upper_command="print(\"$1\" *"
    local upper_fence="$(python -c "$upper_command $line_length)")"
    
    local lower_command="print(\"$2\" *"
    local lower_fence="$(python -c "$lower_command $line_length)")"
    
    # Slice words by some character counter
    local regex_counter="s/(.{$line_length}) /\1\n/g"
    
    # Complete line with dots and a pipe (not used yet; this is the missing final marker)
    local res="$line_length - length"
    local repeat_pattern='$(repeat res \".\"; echo)'
    local fill_command="{res=($res); printf \"%s%s|\n\", \$0, $repeat_pattern}"

    echo "$upper_fence"

    sed -r "$regex_counter" <<< "$4"

    echo "$lower_fence"
}

printHeader "#" "#" 10 "The quick fox jumped over the lazy dog"

Current output, without the final marker:

##########
The quick fox
jumped over
the lazy dog
##########

Answer 1

sed -r 's/([^ .]+ [^ .]+) /\1\n/g' <<< "The quick fox jumped over the lazy dog"
The quick
fox jumped
over the
lazy dog

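The bracket expression [^ .]+ matches one or more (+) characters that are neither a space nor a literal dot: the leading ^ negates the set, and inside brackets . is not a wildcard. The capture group ([^ .]+ [^ .]+) therefore matches the pattern "string string", two words separated by one space. The full regex ends with one extra space after the capture group (it could be moved inside the group to preserve it).

With sed's substitute command s, we replace the matched pattern with the content of the first capture group, \1, followed by a newline, \n, in place of that space. The g flag repeats the substitution through to the end of each line. The -r option enables extended regular expressions.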
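
The same idea can be generalized to any number of words per line. The following is only a sketch: the function name words_per_line is hypothetical, it assumes GNU sed (for -r and for \n in the replacement), it assumes the count is at least 1, and it uses [^ ]+ so that words containing dots are matched too.

```bash
#!/bin/bash
# words_per_line COUNT TEXT  (hypothetical helper, not part of the original answer)
# Puts COUNT space-separated words on each output line.
words_per_line() {
    local -i n=$1
    # (n-1) repetitions of "word + space", then one more word, form the capture group;
    # the space that follows the group is replaced by a newline.
    sed -r "s/(([^ ]+ ){$((n - 1))}[^ ]+) /\\1\\n/g" <<< "$2"
}

words_per_line 3 "The quick fox jumped over the lazy dog"
```

With a count of 2 this behaves like the command above, except that dots are allowed inside words.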


Update: here is the actual answer:

sed -r 's/(.{8}) /\1\n/g' <<< "How do we know it is going to match the pre-defined number of characters?"
How do we
know it is
going to
match the
pre-defined
number of
characters?

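In this example the regex matches exactly 8 characters (spaces included) followed by a space. Because a match may start partway through a line, every output line except possibly the last is at least 8 characters long, so 8 acts as a minimum here, not a maximum. We can check the actual length of the output lines as follows: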

sed -r 's/(.{8}) /\1\n/g' <<< "How do we know it is going to match the pre-defined number of characters?" \
    | awk '{print length}'
9
10
8
9
11
9
11
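
Since the lengths above show that the regex enforces a minimum rather than a maximum, a greedy word wrap is one way to get the hard upper limit the question asks for. This is a minimal sketch in pure bash, not the method of the original answer. It assumes a single line of input, and a word longer than the limit is still printed whole on its own line. The name format_text is taken from the question's input syntax.

```bash
#!/bin/bash
# format_text WIDTH TEXT
# Greedy wrap: start a new line whenever adding the next word
# would push the current line past WIDTH characters.
format_text() {
    local -i width=$1
    local line="" word
    local -a words
    # Split the text on whitespace into an array (only the first input line is read)
    read -r -a words <<< "$2"
    for word in "${words[@]}"; do
        if [[ -z $line ]]; then
            line=$word
        elif (( ${#line} + 1 + ${#word} <= width )); then
            line+=" $word"
        else
            printf '%s\n' "$line"
            line=$word
        fi
    done
    if [[ -n $line ]]; then
        printf '%s\n' "$line"
    fi
}

format_text 10 "The quick fox jumped over the lazy dog"
```

For the sentence from the question and a width of 10 this prints the four lines requested there, without trailing spaces.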

With the help of the answers to a question about printing a character multiple times with printf in awk, we can then get the desired result.

sed -r 's/(.{8}) /\1\n/g' <<< "How do we know it is going to match the pre-defined number of characters?" \
    | awk '{rest=(12 - length); printf "%s%s|\n", $0, substr(".........", 1, rest)}'
How do we...|
know it is..|
going to....|
match the...|
pre-defined.|
number of...|
characters?.|
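
The substr trick above can only pad as far as the literal dot string is long (9 dots here). As a sketch under that observation, the filler can instead be built in a loop so that any width works. The function name pad_lines is hypothetical; lines already longer than the width are printed unpadded.

```bash
#!/bin/bash
# pad_lines WIDTH  (hypothetical helper)
# Reads lines on stdin, pads each with dots up to WIDTH, then appends a pipe.
pad_lines() {
    awk -v width="$1" '{
        fill = ""
        # Add one dot per missing column
        for (i = length($0); i < width; i++) fill = fill "."
        printf "%s%s|\n", $0, fill
    }'
}

printf 'How do we\nknow it is\npre-defined\n' | pad_lines 12
```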

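If you want to split inside words, remove the trailing space from the regex above: /(.{8})/. Below is an example in which every line is at most 10 characters long; the second sed command trims one space from the start and from the end of each new line.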

sed -r 's/(.{10})/\1\n/g' <<< "How do we know it is going to match the pre-defined number of characters?" \
    | sed -r 's/(^ | $)//g' \
    | awk '{rest=(10 - length); printf "%s%s|\n", $0, substr(".........", 1, rest)}'
How do we.|
know it is|
going to..|
match the.|
pre-define|
d number o|
f characte|
rs?.......|
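
For comparison, the hard split and the trimming step can also be sketched without sed, using bash substring expansion. The function name hard_split is hypothetical and the width is assumed to be a positive integer. Like the second sed command above, it removes at most one space from each end of a chunk.

```bash
#!/bin/bash
# hard_split WIDTH TEXT  (hypothetical helper)
# Cuts TEXT into chunks of WIDTH characters, ignoring word boundaries.
hard_split() {
    local -i width=$1 i
    local text=$2 chunk
    for (( i = 0; i < ${#text}; i += width )); do
        chunk=${text:i:width}
        chunk=${chunk# }   # drop one leading space, if any
        chunk=${chunk% }   # drop one trailing space, if any
        printf '%s\n' "$chunk"
    done
}

hard_split 10 "How do we know it is going to match the pre-defined number of characters?"
```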
