我有一个 JSON 输出,需要在 Linux 中从中提取一些参数。
这是 JSON 输出:
{
items:[
{
provider_name:"ucp-ipg",
subject_name:"rtm-instrumentation",
dataset_name:"rtm-instrumentation-dataset-hour-sliced",
dataset_key:[
2018-03-06T06:00:00Z,
"000394e3-a9eb-40b6-9463-fbd588d20ba4"
],
record_count:21,
state:"complete",
version:0,
etag:"a221df62",
creation_timestamp:2018-03-06T06:10:46.294-00:00,
created_by:"AAA",
modification_timestamp:2018-03-06T06:10:46.294-00:00,
modified_by:"AAA"
},
{
provider_name:"ucp-ipg",
subject_name:"rtm-instrumentation",
dataset_name:"rtm-instrumentation-dataset-hour-sliced",
dataset_key:[
2018-03-06T06:00:00Z,
"00097722-b02f-4938-bd4b-d935047c3837"
],
record_count:17,
state:"complete",
version:0,
etag:"aa4dbc25",
creation_timestamp:2018-03-06T06:12:23.293-00:00,
created_by:"AAA",
modification_timestamp:2018-03-06T06:12:23.293-00:00,
modified_by:"AAA"
}
我想要的输出
dataset_key:[
2018-03-06T06:00:00Z,
"00097722-b02f-4938-bd4b-d935047c3837"
]
我已经尝试过以下但不起作用:
file.txt | python -mjson.tool | grep 'dataset_key'
答案1
假设 JSON 文档格式正确且完整,例如
{
"items": [
{
"provider_name": "ucp-ipg",
"subject_name": "rtm-instrumentation",
"dataset_name": "rtm-instrumentation-dataset-hour-sliced",
"dataset_key": [
"2018-03-06T06:00:00Z",
"000394e3-a9eb-40b6-9463-fbd588d20ba4"
],
"record_count": 21,
"state": "complete",
"version": 0,
"etag": "a221df62",
"creation_timestamp": "2018-03-06T06:10:46.294-00:00",
"created_by": "AAA",
"modification_timestamp": "2018-03-06T06:10:46.294-00:00",
"modified_by": "AAA"
},
{
"provider_name": "ucp-ipg",
"subject_name": "rtm-instrumentation",
"dataset_name": "rtm-instrumentation-dataset-hour-sliced",
"dataset_key": [
"2018-03-06T06:00:00Z",
"00097722-b02f-4938-bd4b-d935047c3837"
],
"record_count": 17,
"state": "complete",
"version": 0,
"etag": "aa4dbc25",
"creation_timestamp": "2018-03-06T06:12:23.293-00:00",
"created_by": "AAA",
"modification_timestamp": "2018-03-06T06:12:23.293-00:00",
"modified_by": "AAA"
}
]
}
第二个item
元素的dataset_key
数组可以是jq
:
$ jq '.items[1].dataset_key' file.json
[
"2018-03-06T06:00:00Z",
"00097722-b02f-4938-bd4b-d935047c3837"
]
更改[1]
为[-1]
以获取dataset_key
来自最后的 item
元素。
获取数组元素的原始数据:
$ jq -r '.items[1].dataset_key[]' file.json
2018-03-06T06:00:00Z
00097722-b02f-4938-bd4b-d935047c3837
答案2
如果您无法使 json 有效,例如对于您无法控制的 API 的输出,这将返回您想要的输出:
perl -0777 -ne '/(dataset_key:\[[^\]]*\])/ && print "$1\n"' file.txt
注意:如果该项目包含
]
.
答案3
从 json 中提取信息的简单方法是jtc
(假设你的json是固定的):
bash $ jtc -w '<dataset_key>l+0' -r your.json
[ "2018-03-06T06:00:00Z", "000394e3-a9eb-40b6-9463-fbd588d20ba4" ]
[ "2018-03-06T06:00:00Z", "00097722-b02f-4938-bd4b-d935047c3837" ]
bash $