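I would like to know how to remove the line breaks within paragraphs in this book, and in other texts I read on a Kindle. The desired result is that each block separated by blank lines becomes one continuous line of text. I got it done for this book with a convoluted series of vim substitute commands, but I would rather find a better way to do it in the future.
My hope is to get a vim, perl, sed, or awk script for this, but I am open to whatever you come up with.
A solution has been found, but here is a sample input and output for anyone who finds this through Google later.
Input with line breaks: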
Letter 1
_To Mrs. Saville, England._
St. Petersburgh, Dec. 11th, 17—.
You will rejoice to hear that no disaster has accompanied the
commencement of an enterprise which you have regarded with such evil
forebodings. I arrived here yesterday, and my first task is to assure
my dear sister of my welfare and increasing confidence in the success
of my undertaking.
I am already far north of London, and as I walk in the streets of
Petersburgh, I feel a cold northern breeze play upon my cheeks, which
braces my nerves and fills me with delight. Do you understand this
feeling? This breeze, which has travelled from the regions towards
which I am advancing, gives me a foretaste of those icy climes.
Inspirited by this wind of promise, my daydreams become more fervent
and vivid. I try in vain to be persuaded that the pole is the seat of
frost and desolation; it ever presents itself to my imagination as the
region of beauty and delight. There, Margaret, the sun is for ever
visible, its broad disk just skirting the horizon and diffusing a...
Output with no line breaks within paragraphs:
_To Mrs. Saville, England._
St. Petersburgh, Dec. 11th, 17--.
You will rejoice to hear that no disaster has accompanied the commencement of an enterprise which you have regarded with such evil forebodings. I arrived here yesterday; and my first task is to assure my dear sister of my welfare, and increasing confidence in the success of my undertaking.
I am already far north of London; and as I walk in the streets of Petersburgh, I feel a cold northern breeze play upon my cheeks, which braces my nerves, and fills me with delight. Do you understand this feeling? This breeze, which has travelled from the regions towards which I am advancing, gives me a foretaste of those icy climes. Inspirited by this wind of promise, my day dreams become more fervent and vivid. I try in vain to be persuaded that the pole is the seat of frost and desolation; it ever presents itself to my imagination as the region of beauty and delight. There, Margaret, the sun is for ever visible; its broad disk just skirting the horizon, and diffusing a...
For the curious, here are the vim commands I originally used:
ggVG:norm A<space> -- adds a space to the end of each line
:%s/\v^\s*$/<++> -- swaps all blank lines with a unique temporary string
ggVGgJ -- joins all lines without adding a space
:%s/<++>/\r\r/g -- replaces all occurrences of my unique string with two newline characters
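Answer 1
If the paragraphs are already separated by two or more newlines, and you just want to remove the newlines inside each paragraph (or, better, replace them with spaces), then: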
perl -00 -lpe 's/\n/ /g' pg42324.txt > pg42324-new.txt
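-00 tells Perl to read and process the input one paragraph at a time (a paragraph boundary is two or more newlines).
-l turns on Perl's automatic line-ending handling (or, in this case, paragraph-ending handling).
-p makes perl behave like sed, i.e. read the input and print it after the script has made any modifications.
-e tells perl that the next argument is the script to run.
See man perlrun for more details on these options.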
Or, to edit the file in place (keeping a backup of the original with a .bak extension):
perl -i.bak -00 -lpe 's/\n/ /g' pg42324.txt
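If any lines within a paragraph have leading or trailing spaces, you may want to collapse runs of spaces into a single space by adding ; s/ +/ /g to the perl script: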
perl -00 -lpe 's/\n/ /g; s/ +/ /g'
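In my opinion, though, you would be better off treating the whole file as markdown (perhaps even adding markdown formatting for bold, italics, chapter headings, and so on) and using pandoc or something similar to convert it from markdown to epub. After all, Markdown is just plain text with optional formatting characters. For example: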
pandoc pg42324.txt -o pg42324.epub
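The minimal edit is just to open the file in vim (or any other editor) and make sure there is a blank line between each pair of paragraphs.
By the way, "Creating an ebook with pandoc" is a short but good general introduction to creating .epub books from text or Markdown files.
Or, better yet, just download the .epub or .mobi version of the book instead of the plain-text version. Project Gutenberg offers its books in several formats.
Its page for Mary Shelley's Frankenstein has download links for the various formats.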
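Answer 2
Note that awk offers a so-called "paragraph mode", enabled by setting RS to the empty string, which can come in handy for this case.
GNU awk's RT automatic variable captures the actual record separator found between paragraphs, which keeps things neat and compact: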
gawk '{$1=$1; print $0 RT}' RS= ORS= pg42324.txt
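RS is set to empty to enable paragraph mode.
ORS is set to empty so that the separator is printed only explicitly, via the RT variable.
Or, as a more formally correct equivalent, set RS and ORS through the dedicated -v option, since arguments placed after the script are normally reserved for input file names or arguments to the script itself: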
gawk -v RS='' -v ORS='' '{$1=$1; print $0 RT}' pg42324.txt
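RT is specific to GNU awk. Where only a POSIX awk is available, a rough approximation (an assumption of this sketch, not part of the answer above) is to keep paragraph mode but hard-code the output separator as one blank line. Unlike the RT version, this turns every run of blank lines into exactly one and prints a blank line after the last paragraph. The sample input and the variable name joined are made up.

```shell
# Two paragraphs separated by two blank lines (made-up sample text).
joined=$(printf 'first line\nsecond line\n\n\nthird line\nfourth line\n' |
  awk 'BEGIN { RS = ""; ORS = "\n\n" } { $1 = $1; print }')

# $1 = $1 rebuilds each record with single spaces between fields; in
# paragraph mode newlines also act as field separators, so lines get joined.
printf '%s\n' "$joined"
```

As with the gawk version, $1 = $1 also squeezes any other runs of whitespace inside the paragraph into single spaces.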
Answer 3
If you want to normalize the newlines/line breaks:
wget https://www.gutenberg.org/cache/epub/42324/pg42324.txt
dos2unix pg42324.txt
perl -0777 -pe 's/\n{3,}/\n\n/g' pg42324.txt | less
If you want to edit in place:
perl -0777 -i -pe 's/\n{2,}/\n\n/g' pg42324.txt
Answer 4
Using any awk:
$ cat tst.awk
NF { buf=buf $0 OFS; next }
{ prtBuf(); print }
END { prtBuf() }
function prtBuf() {
    sub(OFS"$",ORS,buf)
    printf "%s", buf
    buf = ""
}
$ awk -f tst.awk letter
_To Mrs. Saville, England._
St. Petersburgh, Dec. 11th, 17—.
You will rejoice to hear that no disaster has accompanied the commencement of an enterprise which you have regarded with such evil forebodings. I arrived here yesterday, and my first task is to assure my dear sister of my welfare and increasing confidence in the success of my undertaking.
I am already far north of London, and as I walk in the streets of Petersburgh, I feel a cold northern breeze play upon my cheeks, which braces my nerves and fills me with delight. Do you understand this feeling? This breeze, which has travelled from the regions towards which I am advancing, gives me a foretaste of those icy climes. Inspirited by this wind of promise, my daydreams become more fervent and vivid. I try in vain to be persuaded that the pole is the seat of frost and desolation; it ever presents itself to my imagination as the region of beauty and delight. There, Margaret, the sun is for ever visible, its broad disk just skirting the horizon and diffusing a...