Merging columns from 200+ large files into one table

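I have 200+ large files, each with a single column and 76M rows. I want to start a newfile.txt and put the columns side by side (line 1 of file 1 next to line 1 of file 2, and so on through line 1 of file 200), then repeat this for every row. I am struggling with this. Any suggestions?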

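I tried the answers by Gilles and Glenn (here and here), but I could not work out how to loop and repeatedly append tab-separated columns to the output newfile.txt. I can only use methods that do not hold the files in memory (the final file should be 120GB+).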

Thanks

Answer 1

With patience, and assuming 200 files named "file{1..200}", each with 76,000,000 lines:

# Build each output row by pulling line number $line from every file in turn.
# Every sed call rescans its file from the top, so this is very slow.
for ((line=1; line <= 76000000; line++))
do
  row=$(sed -n "${line}p" "file1")
  for ((i=2; i <= 200; i++))
  do
    # Append a tab, then the same line from the next file.
    row+=$'\t'$(sed -n "${line}p" "file$i")
  done
  printf '%s\n' "$row"
done > newfile.txt
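
As a possible alternative to the loop above, the standard paste utility may be a better fit: it reads one line from each input file per output row and joins them with tabs, so it should not need to hold whole files in memory. This is a minimal sketch, not part of the original answer; the three tiny demo files and the temporary directory are hypothetical stand-ins for the real file1 to file200.

```shell
# Hypothetical demo data: three small stand-ins for the real file1..file200.
workdir=$(mktemp -d)
printf 'a1\na2\n' > "$workdir/file1"
printf 'b1\nb2\n' > "$workdir/file2"
printf 'c1\nc2\n' > "$workdir/file3"

# paste walks all inputs in parallel, one line at a time,
# and writes the lines side by side separated by tabs (its default delimiter).
paste "$workdir/file1" "$workdir/file2" "$workdir/file3" > "$workdir/newfile.txt"

# Show the merged table.
cat "$workdir/newfile.txt"
```

For the real data, something like `paste file{1..200} > newfile.txt` in bash would be the equivalent call, assuming the files follow that naming and the per-process open-file limit (see `ulimit -n`) allows 200 files to be open at once.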
