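I am trying to split a long, space-separated text with bash, without success. The commands below break the text at a fixed number of characters, not at the delimiter.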
echo "The quick fox jumped over the lazy dog" | fold -w 10
echo "The quick fox jumped over the lazy dog" | sed -e 's/.\{9\}/&\n/g'
It would be great if this could be used for some interactive bash use.
Input syntax:
format_text 10 "The quick fox jumped over the lazy dog"
Output:
The quick
fox jumped
over the
lazy dog
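Notice that without the spacing rule, the third line would cut the letter "l" off "lazy".
Update: the results so far are good, but the working slicer has an issue I cannot solve myself: it does not break the line before a word exceeds the limit.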
#!/bin/bash
printHeader () {
    declare -i line_length=$3
    # Upper and lower fences (print(...) works in both Python 2 and 3)
    local upper_command="print(\"$1\" * $line_length)"
    local upper_fence="$(python -c "$upper_command")"
    local lower_command="print(\"$2\" * $line_length)"
    local lower_fence="$(python -c "$lower_command")"
    # Slice words by some character counter
    local regex_counter="s/(.{$line_length}) /\1\n/g"
    # Complete line with dots and a pipe (work in progress, not used below)
    local res="$line_length - length"
    local repeat_pattern='$(repeat res \".\"; echo)'
    local fill_command="{res=($res); printf \"%s%s|\n\", $0, $repeat_pattern}"
    echo "$upper_fence"
    sed -r "$regex_counter" <<< "$4"
    echo "$lower_fence"
}
printHeader "#" "#" 10 "The quick fox jumped over the lazy dog"
Current output, without the trailing markers:
##########
The quick fox
jumped over
the lazy dog
##########
Answer 1
sed -r 's/([^ .]+ [^ .]+) /\1\n/g' <<< "The quick fox jumped over the lazy dog"
The quick
fox jumped
over the
lazy dog
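The bracket expression [^ .]+ matches one or more (+) characters that are not (^) a space or a dot; inside brackets, . is a literal dot rather than a wildcard. The capture group ([^ .]+ [^ .]+) therefore matches the pattern "string string". The full regex has one extra space after the group; because it sits outside the group, it is dropped from the output (include it in the capture group to keep it).
With sed's substitute command (s), we replace each match with the contents of the first capture group (\1) followed by a newline (\n) in place of that space. The g flag repeats the substitution to the end of each line, and the -r option enables extended regular expressions.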
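The two-words-per-line idea can be generalized to any number of words per line. What follows is only a sketch: words_per_line is a hypothetical helper name, it assumes GNU sed (for -r and for \n in the replacement), and it treats every run of non-space characters as a word, so dots are not special here.

```shell
# Hypothetical helper (not from the original answer): put N words on each line.
# Assumes GNU sed, which accepts -r and \n in the replacement text.
words_per_line() {
    local n="$1" text="$2"
    # ([^ ]+ ){N-1}[^ ]+ matches N space-separated words; the space that
    # follows the outer group is replaced by a newline.
    printf '%s\n' "$text" | sed -r "s/(([^ ]+ ){$((n - 1))}[^ ]+) /\1\n/g"
}

words_per_line 3 "The quick fox jumped over the lazy dog"
# The quick fox
# jumped over the
# lazy dog
```

Grouping by word count says nothing about line width, so this only suits input whose words are of similar length.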
Update: this is the actual answer:
sed -r 's/(.{8}) /\1\n/g' <<< "How do we know it is going to match the pre-defined number of characters?"
How do we
know it is
going to
match the
pre-defined
number of
characters?
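In this example, the capture group matches exactly 8 characters (spaces included) that are followed by a space, so every line except possibly the last is at least 8 characters long and can run past 8 until the current word ends. We can check the actual lengths of the output lines as follows: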
sed -r 's/(.{8}) /\1\n/g' <<< "How do we know it is going to match the pre-defined number of characters?" \
| awk '{print length}'
9
10
8
9
11
9
11
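To see only the lines that overshoot a given limit, the same length check can be turned into a filter. This is a sketch under assumptions: check_widths is a hypothetical helper name and the report format is arbitrary.

```shell
# Hypothetical helper: report every input line longer than MAX characters.
check_widths() {
    local max="$1"
    awk -v max="$max" 'length($0) > max {
        printf "line %d: %d chars: %s\n", NR, length($0), $0
    }'
}

printf '%s\n' "How do we know it is going to match the pre-defined number of characters?" \
    | sed -r 's/(.{8}) /\1\n/g' \
    | check_widths 10
# line 5: 11 chars: pre-defined
# line 7: 11 chars: characters?
```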
With the help of the answers to the question "How to print a character multiple times with printf? [awk]", we can achieve the expected result:
sed -r 's/(.{8}) /\1\n/g' <<< "How do we know it is going to match the pre-defined number of characters?" \
| awk '{rest=(12 - length); printf "%s%s|\n", $0, substr(".........", 1, rest)}'
How do we...|
know it is..|
going to....|
match the...|
pre-defined.|
number of...|
characters?.|
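The substr(".........", 1, rest) trick can supply at most 9 fill characters, because the literal holds 9 dots. A minimal sketch that builds the padding in a loop instead, so any width works; pad_lines and its optional fill argument are hypothetical names, not part of the original answer.

```shell
# Hypothetical helper: right-pad each input line with FILL up to WIDTH,
# then append "|". Lines already at or over WIDTH get no padding.
pad_lines() {
    local width="$1" fill="${2:-.}"
    awk -v width="$width" -v fill="$fill" '{
        pad = ""
        for (i = length($0); i < width; i++) pad = pad fill
        print $0 pad "|"
    }'
}

printf '%s\n' "How do we know it is going to match the pre-defined number of characters?" \
    | sed -r 's/(.{8}) /\1\n/g' \
    | pad_lines 12
```

With a width of 12 this should reproduce the padded output shown above.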
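If you want to split words, remove the trailing space from the regex above, leaving /(.{8})/. Below is an example where each line is at most 10 characters long; the second sed command trims a space from the start and end of each new line.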
sed -r 's/(.{10})/\1\n/g' <<< "How do we know it is going to match the pre-defined number of characters?" \
| sed -r 's/(^ | $)//g' \
| awk '{rest=(10 - length); printf "%s%s|\n", $0, substr(".........", 1, rest)}'
How do we.|
know it is|
going to..|
match the.|
pre-define|
d number o|
f characte|
rs?.......|
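Neither pipeline above both keeps words whole and stays within the limit: the first can overshoot, the second cuts words. A minimal sketch of a greedy word wrap that breaks before the limit is exceeded, using the format_text name and argument order from the question. It is an illustration, not the answer's method, and it makes two assumptions: runs of blanks collapse to a single space (awk field splitting), and a word longer than the width is left unsplit on its own line.

```shell
# Sketch of the format_text interface from the question: greedy word wrap.
format_text() {
    local width="$1" text="$2"
    printf '%s\n' "$text" | awk -v width="$width" '{
        line = ""
        for (i = 1; i <= NF; i++) {
            if (line == "") {
                line = $i                  # first word on the line
            } else if (length(line) + 1 + length($i) <= width) {
                line = line " " $i         # the word still fits
            } else {
                print line                 # flush and start a new line
                line = $i
            }
        }
        if (line != "") print line
    }'
}

format_text 10 "The quick fox jumped over the lazy dog"
# The quick
# fox jumped
# over the
# lazy dog
```

The output can be piped into the awk padding command shown earlier to add the dots and the closing pipe.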