Using jq, extract fields and subfields from a list of objects, grouping the paired subfields for saving to CSV

Given this data:

[
  {
    "c": "A",
    "e": "B",
    "score": 0.99,
    "v": [
      {
        "context": "asdf",
        "score": 0.98,
        "url": "..."
      },
      {
        "context": "bcdfd",
        "score": 0.97,
        "url": "..."
      }
    ]
  },
  { 
    ...
  }
]

(note the outer list)

I'm looking to extract:

A, B, 0.99, asdf, 0.98, bcdfd, 0.97

So far, the best I could come up with is

jq -r '.[] | [.c, .e, .score, .v[].context, .v[].score ] | @csv' 

which produces

A, B, 0.99, asdf, bcdfd, 0.98, 0.97

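I understand that .v[].context, .v[].score just emits each set of values in turn (all the contexts, then all the scores) rather than interleaving them.

What magic am I missing?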

Answer 1

I think you want to run the filter .context,.score on each element of v:

$ jq -r '.[] | [.c, .e, .score, (.v[] | .context,.score)] | @csv' file.json
"A","B",0.99,"asdf",0.98,"bcdfd",0.97

This is equivalent to using the built-in map function, except that the results are not assembled back into an array.
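
As a rough illustration of that equivalence, the sketch below runs both forms on a single v array and wraps the streaming form in [...] so the two results can be compared. The variable names (v, stream, mapped) are only for this example, and the url keys are left out for brevity.

```shell
# Sample input: one "v" array shaped like the one in the question (url omitted).
v='[{"context":"asdf","score":0.98},{"context":"bcdfd","score":0.97}]'

# Streaming form: .[] emits each element, and .context,.score emits two
# values per element. The outer [...] collects the stream into an array.
stream=$(printf '%s' "$v" | jq -c '[.[] | .context, .score]')

# map form: jq defines map(f) as [.[] | f], so it builds the same array.
mapped=$(printf '%s' "$v" | jq -c 'map(.context, .score)')

echo "$stream"   # ["asdf",0.98,"bcdfd",0.97]
echo "$mapped"   # ["asdf",0.98,"bcdfd",0.97]
```

In the answer's command the stream is emitted directly inside the outer array constructor, so no intermediate array is needed.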

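Answer 2

The following creates a JSON-encoded CSV record for each top-level array element, then extracts and decodes them. For each top-level element, the values from the sub-array are merged in by flattening the array.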

jq -r 'map([ .c,.e,.score, (.v|map([.context, .score])) ] | flatten | @csv)[]' file
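
To see what flatten contributes, it may help to look at one record before and after that step. This is a minimal sketch on a single object shaped like the question's data (url omitted); the variable names rec, nested and flat are only for this example.

```shell
# One record shaped like the question's data (the "url" key is omitted).
rec='{"c":"A","e":"B","score":0.99,"v":[{"context":"asdf","score":0.98},{"context":"bcdfd","score":0.97}]}'

# Before flatten: the context/score pairs are still nested inside the row.
nested=$(printf '%s' "$rec" | jq -c '[.c, .e, .score, (.v | map([.context, .score]))]')

# After flatten: a single flat array of scalars, the shape @csv expects.
flat=$(printf '%s' "$rec" | jq -c '[.c, .e, .score, (.v | map([.context, .score]))] | flatten')

echo "$nested"   # ["A","B",0.99,[["asdf",0.98],["bcdfd",0.97]]]
echo "$flat"     # ["A","B",0.99,"asdf",0.98,"bcdfd",0.97]
```

Passing the nested form straight to @csv would fail, because @csv only accepts rows whose elements are scalars.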

Given a test document equivalent to the following:

[
   {
      "c": "A",
      "e": "B",
      "score": 0.99,
      "v": [
         { "context": "asdf", "score": 0.98, "url": "..." },
         { "context": "bcdfd", "score": 0.97, "url": "..." }
      ]
   },
   {
      "c": "A",
      "e": "B",
      "score": 0.99,
      "v": [
         { "context": "asdf", "score": 0.98, "url": "..." },
         { "context": "asdf", "score": 0.98, "url": "..." },
         { "context": "bcdfd", "score": 0.97, "url": "..." }
      ]
   },
   {
      "c": "A",
      "e": "B",
      "score": 0.99,
      "v": [
         { "context": "asdf", "score": 0.98, "url": "..." },
         { "context": "asdf", "score": 0.98, "url": "..." },
         { "context": "asdf", "score": 0.98, "url": "..." },
         { "context": "bcdfd", "score": 0.97, "url": "..." }
      ]
   }
]

...we get

"A","B",0.99,"asdf",0.98,"bcdfd",0.97
"A","B",0.99,"asdf",0.98,"asdf",0.98,"bcdfd",0.97
"A","B",0.99,"asdf",0.98,"asdf",0.98,"asdf",0.98,"bcdfd",0.97

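One could also reorder the operations so that @csv appears once, at the end of the pipeline, where it receives the set of flattened arrays (rather than being applied to each array inside map):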

jq -r 'map([ .c,.e,.score, (.v|map([.context, .score])) ] | flatten)[]|@csv' file
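
A quick way to convince yourself that the two orderings agree is to run both on the same input and compare the results. The sketch below does that on a small sample; the second record (with an empty v) is a made-up edge case, not taken from the question.

```shell
# Two records: the first mirrors the question, the second is a hypothetical
# edge case with an empty "v" array.
data='[{"c":"A","e":"B","score":0.99,"v":[{"context":"asdf","score":0.98},{"context":"bcdfd","score":0.97}]},{"c":"C","e":"D","score":0.5,"v":[]}]'

# @csv applied inside map, then the resulting strings are extracted with [].
inside=$(printf '%s' "$data" | jq -r 'map([ .c,.e,.score, (.v|map([.context, .score])) ] | flatten | @csv)[]')

# Arrays extracted with [] first, then @csv applied at the end of the pipeline.
outside=$(printf '%s' "$data" | jq -r 'map([ .c,.e,.score, (.v|map([.context, .score])) ] | flatten)[] | @csv')

echo "$inside"
# "A","B",0.99,"asdf",0.98,"bcdfd",0.97
# "C","D",0.5

# Succeeds (exit status 0) when both orderings produce identical output.
[ "$inside" = "$outside" ] && echo "same output"
```

Note that a record with an empty v simply yields a shorter row, since flatten drops the empty sub-array.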
