Split a file into n files using csplit (or a similar tool)

Answer 1

You could use awk. It is not exactly what you asked for, but it may do the job.

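Idea: write at least n lines to a part file, then keep writing until the next line matching the pattern appears, and start a new part file at that line.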

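Drawbacks:

If the blocks are large and the line threshold is reached just after the start of a block, the rest of that block still goes into the current file, so some files can end up much larger than others.
The original file is not deleted (i.e. twice the disk space is needed).
As written, the matching line must be exactly ABC (no tolerance for other words on the same line; this can be adjusted).
It works by setting a number of lines rather than the desired number of output files (estimate it from the line count of the input file).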

awk script

BEGIN {
    outfile = "part_" (++i)
    j = 0
}
{
    j++
    # Block size is set to at least 10 lines in this example.
    # Once the line threshold is reached, wait for the next keyword line,
    # then close the current part file, increment the part file counter,
    # and reset the line counter.
    if (j >= 10 && $0 == "ABC") {
        close(outfile)
        outfile = "part_" (++i)
        j = 0
    }
    print > outfile
}

Run it with

awk -f script.awk input.txt
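
Two of the drawbacks above (the exact match on ABC and the fixed line threshold) can be relaxed. The following is only a sketch under assumptions, not a drop-in replacement: the sample input, the variable names `parts`, `total` and `min`, and the choice of `/^ABC/` as the pattern are all made up for illustration. It estimates the threshold from the desired number of files and passes it to awk with `-v`.

```shell
#!/bin/sh
# Hypothetical sample input: 6 blocks, each an "ABC" header plus 4 data lines.
for b in 1 2 3 4 5 6; do
    echo "ABC"
    for l in 1 2 3 4; do
        echo "block $b line $l"
    done
done > input.txt

parts=3                                   # desired number of output files (assumption)
total=$(wc -l < input.txt)                # total number of lines in the input
min=$(( (total + parts - 1) / parts ))    # minimum lines per part, rounded up

awk -v min="$min" '
BEGIN { outfile = "part_" (++i); j = 0 }
{
    j++
    # /^ABC/ matches any line that starts with ABC, so trailing text is
    # tolerated, unlike the exact comparison $0 == "ABC".
    if (j >= min && /^ABC/) {
        close(outfile)                    # release the finished part file
        outfile = "part_" (++i)
        j = 0
    }
    print > outfile
}' input.txt

wc -l part_*
```

Because a new part only starts at the next matching line after the threshold, the number of files actually produced can differ from `parts` on real data. With the evenly sized sample blocks used here it happens to come out as three parts.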
