Split a file into separate Excel cells based on a pattern

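I have a large file, a few hundred lines long. The file is divided into sections by a specific delimiter line, "###". The delimiter appears after a varying number of lines. I need the content before each "###" to go into one cell of an Excel file, and the content after it to go into a separate cell of the same file. I am familiar with split and awk, but I can't come up with a command line that does what I described. Any ideas?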

Answer 1

Create an executable script named test.awk (a shell script that wraps awk) with the following contents:

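#!/bin/sh
# Join the lines of each "###"-delimited section into one quoted CSV field.
awk '
  BEGIN { R = "" }
  # A delimiter line closes the current section: emit it as one quoted field.
  /^###/ {
    sub(/\n$/, "", R)        # drop the trailing newline of the section
    print "\"" R "\""
    R = ""
    next
  }
  {
    gsub(/"/, "\"\"")        # CSV escaping: double any embedded quotes
    R = R $0 "\n"
  }
  # Emit a final section that is not followed by a delimiter line.
  END {
    if (R != "") {
      sub(/\n$/, "", R)
      print "\"" R "\""
    }
  }
' "$@"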

Then run:

./test.awk longfile.txt > longfile.csv

Open longfile.csv in LibreOffice Calc. Each section appears in its own cell, one per row.

longfile.txt:

dkdkdkdk
qsdfqlsdf
qsdfjqlsdf
######
qdfqj
qsdfmlkjqsd
qsiapriopazeiru
wqsdfqesr
######
rurururururururuur
rururururururururu
ururururururururur
######
iiiiiiiiiiii
iiiiiiiiiiii
iiiiiiiiiiii
iiiiiiiiiiii
iiiiiiiiiiii
iiiiiiiiiiii
######
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
######

longfile.csv:

"dkdkdkdk
qsdfqlsdf
qsdfjqlsdf"
"qdfqj
qsdfmlkjqsd
qsiapriopazeiru
wqsdfqesr"
"rurururururururuur
rururururururururu
ururururururururur"
"iiiiiiiiiiii
iiiiiiiiiiii
iiiiiiiiiiii
iiiiiiiiiiii
iiiiiiiiiiii
iiiiiiiiiiii"
"uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu
uuuuuuuuuuu"
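
The sample above contains no double quotes or commas, so it does not exercise the quoting rule. As a rough sketch, the same awk logic can be tried on a tiny inline input that has both, plus a last section with no closing delimiter. The function name sections_to_csv and the sample text are made up for illustration and are not part of the original answer; the END block is an assumption about how a trailing section should be handled.

```shell
#!/bin/sh
# sections_to_csv: hypothetical helper name, used only for this illustration.
# Reads stdin and writes one quoted CSV field per "###"-delimited section.
sections_to_csv() {
  awk '
    /^###/ { sub(/\n$/, "", R); print "\"" R "\""; R = ""; next }
    { gsub(/"/, "\"\""); R = R $0 "\n" }
    END { if (R != "") { sub(/\n$/, "", R); print "\"" R "\"" } }
  '
}

# Made-up input: a quote and a comma in the first section,
# and a last section without a trailing delimiter line.
printf '%s\n' 'say "hi", then stop' 'second line' '###' 'last section' |
  sections_to_csv
```

Embedded quotes come out doubled and the comma stays inside the quoted field, so a CSV reader should still treat each section as a single cell.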
