Merging files into a single JSON file

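I'm a bit lost on how to create JSON from several files. I have several files, for example CLT.txt, LYO.txt, ... Each one contains lines with a date and a value: dd/mm/yyyy hh:mm xxxxx

A new line is appended to each file every 5 minutes.

For example, CLT.txt: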

01/01/2020 00:00 45
01/01/2020 00:05 457
01/01/2020 00:10 458
01/01/2020 00:15 402
01/01/2020 00:20 585
...
02/01/2020 00:00 57
02/01/2020 00:05 86
02/01/2020 00:10 45
02/01/2020 00:15 402
02/01/2020 00:20 104
...

LYO.txt

01/01/2020 00:00 70
01/01/2020 00:05 221
01/01/2020 00:10 315
01/01/2020 00:15 57
01/01/2020 00:20 420
...
02/01/2020 00:00 50
02/01/2020 00:05 92
02/01/2020 00:10 32
02/01/2020 00:15 125
02/01/2020 00:20 10
...

I have about 15 files like these.

In the end, I want to generate a JSON file every 5 minutes in the following format:

{
  "CLT": {
    "01/01/2020": {
      "00:00": 45,
      "00:05": 457,
      "00:10": 458,
      "00:15": 402,
      "00:20": 585
...
    },
    "02/01/2020": {
      "00:00": 57,
      "00:05": 86,
      "00:10": 45,
      "00:15": 402,
      "00:20": 104
...
    }
  },
  "LYO": {
    "01/01/2020": {
      "00:00": 70,
      "00:05": 221,
      "00:10": 315,
      "00:15": 57,
      "00:20": 420
...
    },
    "02/01/2020": {
      "00:00": 50,
      "00:05": 92,
      "00:10": 32,
      "00:15": 125,
      "00:20": 10
...
    }
  }
}

If you have a simple idea, I'm interested. Everything runs on an Ubuntu LTS 19 machine.

Thanks

Answer 1

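Don't use a "bash script" for this. Pick a language with native support for nested structures. With that support, it is easy to load each file into a dictionary shaped exactly like the output you want, and then dump the whole thing as JSON in one go.

For example: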

#!/usr/bin/env python3
import glob
import json

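data = {}
# Each three-character file name (CLT.txt, LYO.txt, ...) becomes a top-level key.
for txtfile in glob.glob("???.txt"):
    code = txtfile.split("/")[-1].split(".")[0]
    data[code] = {}
    with open(txtfile, "r") as fh:
        for line in fh:
            if not line.strip():
                continue  # skip blank lines
            date, time, count = line.split()
            # Group values by date, then by time of day.
            data[code].setdefault(date, {})
            data[code][date][time] = int(count)

# Emit the whole nested dictionary as JSON in one go.
print(json.dumps(data))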

Or:

#!/usr/bin/env ruby
require 'json'

data = {}
Dir["???.txt"].each do |txtfile|
    code = File.basename(txtfile, ".txt")
    data[code] = {}
    # File.foreach closes the file once iteration finishes.
    File.foreach(txtfile) do |line|
        date, time, count = line.strip.split
        data[code][date] ||= {}
        data[code][date][time] = count.to_i
    end
end

puts JSON.generate(data)

Or:

#!/usr/bin/env perl
use strict;
use warnings;
use File::Basename;
use JSON;

my $data = {};
for my $txtfile (glob("???.txt")) {
    my ($code) = basename($txtfile, ".txt");
    if (open(my $fh, "<", $txtfile)) {
        while (my $line = <$fh>) {
            my ($date, $time, $count) = ($line =~ /^(\S+) (\S+) (\S+)/);
            $data->{$code}->{$date}->{$time} = int $count;
        }
        close($fh);
    } else {
        die "cannot open $txtfile: $!";
    }
}

print JSON->new->encode($data);
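
The Python version above can be made slightly more defensive. The following is only a sketch under a few assumptions that are not in the original answer: the helper name `parse_lines` is made up for illustration, any line that does not have exactly three whitespace-separated fields (a blank line, for instance) is silently skipped, and `indent=2` is used so the output resembles the pretty-printed layout shown in the question.

```python
import json


def parse_lines(lines):
    """Build {date: {time: value}} from 'dd/mm/yyyy hh:mm value' lines."""
    by_date = {}
    for line in lines:
        fields = line.split()
        # Assumption: anything that is not exactly three fields is skipped.
        if len(fields) != 3:
            continue
        date, time, count = fields
        by_date.setdefault(date, {})[time] = int(count)
    return by_date


if __name__ == "__main__":
    # Sample rows taken from the CLT.txt excerpt in the question.
    sample = [
        "01/01/2020 00:00 45",
        "",
        "01/01/2020 00:05 457",
        "02/01/2020 00:00 57",
    ]
    print(json.dumps({"CLT": parse_lines(sample)}, indent=2))
```

To use it on real files, pass an open file handle in place of the sample list, since iterating over a file yields its lines.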

Answer 2

awk is a good tool for this. I'm no expert in it myself, but I came up with the following:

#!/bin/sh -e

while true; do
    echo "{" > new.json
    # put other file names here
    for tla in CLT LYO; do
        [ "$tla" = CLT ] || printf ",\n" >> new.json
        printf "\t\"%s\": {" "$tla" >> new.json
        awk '
BEGIN {
    date="other"
}
{
    curd=$1
    if(curd != date) {
        if(date != "other") {
            printf "\n\t\t},"
        }
        printf "\n\t\t\"%s\": {\n", curd
        date = curd
    } else {
        printf ",\n"
    }
    # %d emits the value as a bare JSON number, matching the requested format
    printf "\t\t\t\"%s\": %d", $2, $3
}
' "$tla.txt" >> new.json
        printf "\n\t\t}\n\t}" >> new.json
    done
    printf "\n}\n" >> new.json
    mv -f new.json output.json
    # repeat approximately every 5 minutes
    sleep 300
done

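Of course, the loop with `sleep` only makes sense if the script is run interactively. Otherwise, remove the outer loop and the sleep, and call the script from cron instead (this variant is strongly recommended if possible). If you prefer, you can also drop the redirections from the script and redirect all of its output when you invoke it; which is better really depends on the context in which you use it.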
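
The script above builds `new.json` first and then runs `mv -f new.json output.json`, so a reader should never see a half-written file. A minimal Python sketch of the same write-then-rename idea might look like this. It is not part of the original answer: the function name `write_json_atomically` is hypothetical, and it assumes the temporary file can be created in the same directory as the target.

```python
import json
import os
import tempfile


def write_json_atomically(data, path):
    """Write data as JSON to a temporary file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives next to the target so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(data, fh, indent=2)
        # os.replace overwrites an existing destination, like `mv -f`.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.unlink(tmp_path)
        raise
```

Calling `write_json_atomically(data, "output.json")` from a script that cron starts every 5 minutes would then replace the loop and `sleep`. Note that `tempfile.mkstemp` creates the file readable only by its owner, so permissions may need adjusting with `os.chmod` if another user has to read the result.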
