I have thousands of JSON files that look like this:
File 1 (key1:value_list1)
{"2mac:acg":["1-248","3-245","3-246","4-245","4-246","5-245","5-246","6-243","6-245","6-246","6-247","6-296","7-245","7-295","7-296","8-236","8-239","8-240","8-294","8-295","8-296","9-235","9-236","9-239","9-294","10-293","10-294","10-295","11-15","11-16","11-293","11-294","12-16","12-290","12-291","12-292","12-293","12-294","13-25","13-26","13-27","13-28","13-290","13-292","13-293","14-24","14-25","14-26","14-27","14-290","15-24","15-25","16-24","16-25","16-233","16-234","16-235","17-22","17-23","17-24","17-25","17-59","17-233","17-234","17-235","18-22","18-23","18-24","18-25","18-43","18-213","18-214","18-215","18-229","18-230","18-232","18-233","18-234","19-42","19-43"]}
File 2 (key2:value_list2)
{"4qld:aaa":["3-245","3-246","4-245","4-246","5-245","5-246","6-243","6-245","6-246","6-247","6-296","7-245","7-295","7-296","8-236","8-239","8-240","8-294","8-295","8-296","9-235","9-236","9-239","9-294","10-293","10-294","10-295","11-15","11-16","11-293","11-294","12-16","12-290","12-291","12-292","12-293","12-294","13-25","13-26","13-27","13-28","13-290","13-292","13-293","14-24","14-25","14-26","14-27","14-290","15-24","15-25","16-24","16-25","16-233","16-234","16-235","17-22","17-23","17-24","17-25","17-59","17-233","17-234","17-235","18-22","18-23","18-24","18-25","18-43","18-213","18-214","18-215","18-229","18-230","18-232","18-233","18-234","19-42","19-43","19-55"]}
File 3 (key3:value_list3)
{"6k8h:c":["1-248","2-134","3-245","3-246","4-245","4-246","5-245","5-246","6-243","6-245","6-246","6-247","6-296","7-245","7-295","7-296","8-236","8-239","8-240","8-294","8-295","8-296","9-235","9-236","9-239","9-294","10-293","10-294","10-295","11-15","11-16","11-293","11-294","12-16","12-290","12-291","12-292","12-293","12-294","13-25","13-26","13-27","13-28","13-290","13-292","13-293","14-24","14-25","14-26","14-27","14-290","15-24","15-25","16-24","16-25","16-233","16-234","16-235","17-22","17-23","17-24","17-25","17-59","17-233","17-234","17-235","18-22","18-23","18-24","18-25","18-43","18-213","18-214","18-215","18-229","18-230","18-232","18-233","18-234","19-42","19-43"]}
I want to merge these files into a single file, which should look like this:
{"2mac:acg":["1-248","3-245","3-246","4-245","4-246","5-245","5-246","6-243","6-245","6-246","6-247","6-296","7-245","7-295","7-296","8-236","8-239","8-240","8-294","8-295","8-296","9-235","9-236","9-239","9-294","10-293","10-294","10-295","11-15","11-16","11-293","11-294","12-16","12-290","12-291","12-292","12-293","12-294","13-25","13-26","13-27","13-28","13-290","13-292","13-293","14-24","14-25","14-26","14-27","14-290","15-24","15-25","16-24","16-25","16-233","16-234","16-235","17-22","17-23","17-24","17-25","17-59","17-233","17-234","17-235","18-22","18-23","18-24","18-25","18-43","18-213","18-214","18-215","18-229","18-230","18-232","18-233","18-234","19-42","19-43"], "4qld:aaa":["3-245","3-246","4-245","4-246","5-245","5-246","6-243","6-245","6-246","6-247","6-296","7-245","7-295","7-296","8-236","8-239","8-240","8-294","8-295","8-296","9-235","9-236","9-239","9-294","10-293","10-294","10-295","11-15","11-16","11-293","11-294","12-16","12-290","12-291","12-292","12-293","12-294","13-25","13-26","13-27","13-28","13-290","13-292","13-293","14-24","14-25","14-26","14-27","14-290","15-24","15-25","16-24","16-25","16-233","16-234","16-235","17-22","17-23","17-24","17-25","17-59","17-233","17-234","17-235","18-22","18-23","18-24","18-25","18-43","18-213","18-214","18-215","18-229","18-230","18-232","18-233","18-234","19-42","19-43","19-55"], 
"6k8h:c":["1-248","2-134","3-245","3-246","4-245","4-246","5-245","5-246","6-243","6-245","6-246","6-247","6-296","7-245","7-295","7-296","8-236","8-239","8-240","8-294","8-295","8-296","9-235","9-236","9-239","9-294","10-293","10-294","10-295","11-15","11-16","11-293","11-294","12-16","12-290","12-291","12-292","12-293","12-294","13-25","13-26","13-27","13-28","13-290","13-292","13-293","14-24","14-25","14-26","14-27","14-290","15-24","15-25","16-24","16-25","16-233","16-234","16-235","17-22","17-23","17-24","17-25","17-59","17-233","17-234","17-235","18-22","18-23","18-24","18-25","18-43","18-213","18-214","18-215","18-229","18-230","18-232","18-233","18-234","19-42","19-43"]}
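The merged result should have the form {key1:value_list_1, key2:value_list2, key3:value_list3,...,key_last:value_list_last}.
Thanks to @thanasisp, I used jq to join them with jq -s 'add' file1 file2 file3. This works well when joining hundreds of files. With thousands of files, however, it fails with the error message: Argument list too long! So I would like to know how to solve this problem and whether there are other ways to handle it. Thanks! PS: The server has enough memory.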
Answer 1
jq -c -s add file*
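This reads all files matching the pattern file* into jq. The -s (--slurp) option causes a single array to be created from all the input files. Each element of this large array is the object from one of the files. The array elements are then combined with add to form a single object.
The -c option makes jq produce "compact" output.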
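To see what -s and add each contribute, here is a minimal sketch that feeds jq two made-up one-key objects in place of real files. The keys k1 and k2 and their values are placeholders, not data from the question.

```shell
# Two tiny objects stand in for two input files (hypothetical data).
input='{"k1":["1-2"]}
{"k2":["3-4"]}'

# -s alone: all inputs are wrapped in one array.
slurped=$(printf '%s\n' "$input" | jq -c -s .)
echo "$slurped"    # [{"k1":["1-2"]},{"k2":["3-4"]}]

# -s with add: the array elements are merged into one object.
merged=$(printf '%s\n' "$input" | jq -c -s add)
echo "$merged"     # {"k1":["1-2"],"k2":["3-4"]}
```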
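If there are too many files, the shell will fail to execute the command because the maximum allowed length of a command line is exceeded.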
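The limit in question is the system's cap on the combined size of the arguments and environment handed to a new process, which getconf ARG_MAX reports. As a rough check, it can be compared with the space the file names alone would occupy. This sketch assumes the files match ./*.json in the current directory, which is an assumption made for illustration, and it ignores the environment and per-argument overhead.

```shell
# Upper bound (in bytes) on arguments plus environment for a new process.
limit=$(getconf ARG_MAX)

# Bytes the file names would occupy on a command line, counting one
# terminating NUL per name. printf is a shell builtin, so this line
# is not itself subject to the limit.
needed=$(printf '%s\0' ./*.json | wc -c)

echo "limit: $limit bytes, file names need roughly: $needed bytes"
```

If the second number approaches the first, passing all names to one jq invocation is likely to fail.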
If that happens, you can use find to create a stream of JSON objects for the jq command to process.
find . -name '*.json' -type f -exec cat {} + | jq -c -s add >final
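This uses cat, run from find, to create a stream of JSON objects from the input files (any regular file in or below the current directory whose name ends in .json).
The jq command collects these into an array and then, as before, combines them into a single object. The final result is written to the file final.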
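The find pipeline can be tried safely on a scratch directory first. The sketch below uses three made-up files, one of them in a subdirectory to show that find descends into it. The -S option is added here only so that the key order does not depend on the order in which find visits the files.

```shell
# Build a scratch directory with three small JSON files (made-up data).
dir=$(mktemp -d)
printf '%s\n' '{"k1":["1-2"]}' > "$dir/a.json"
printf '%s\n' '{"k2":["3-4"]}' > "$dir/b.json"
mkdir "$dir/sub"
printf '%s\n' '{"k3":["5-6"]}' > "$dir/sub/c.json"

# find batches the file names into as many cat invocations as needed,
# so no single command line grows too long.
merged=$(find "$dir" -name '*.json' -type f -exec cat {} + | jq -c -S -s add)
echo "$merged"    # {"k1":["1-2"],"k2":["3-4"],"k3":["5-6"]}

rm -rf "$dir"
```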
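Note that if there are conflicts between keys (the same key occurring in two or more files), the last key found, with its value, overwrites the earlier one.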
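The overwrite behaviour is easy to confirm with two made-up objects that share a key. The second command is one possible way to list keys that occur more than once before merging. It is a sketch built from standard jq filters, not something the answer itself proposes.

```shell
# Two objects that share the key "a" (hypothetical data).
input='{"a":["old"],"b":["1-2"]}
{"a":["new"]}'

# The value from the later object wins.
merged=$(printf '%s\n' "$input" | jq -c -s add)
echo "$merged"    # {"a":["new"],"b":["1-2"]}

# List every key that appears in more than one input object.
dups=$(printf '%s\n' "$input" |
  jq -c -s 'map(keys[]) | group_by(.) | map(select(length > 1)[0])')
echo "$dups"      # ["a"]
```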
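Answer 2
It sounds like you don't even need jq: just replace the trailing } with , in all files except the last one, and remove the leading { in all files except the first one.
In zsh: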
autoload zargs
files=( *.json(Nn) ) # here sorted numerically or:
files=( ${(f)"$(<file.list)"} ) # to read the list one per line from
# a file.list
case $#files in
(1) cat -- $files;;
(<2->)
sed -- 's/}$/,/' $files[1]
zargs -r -- $files[2,-2] -- sed -- 's/^{//; s/}$/,/'
sed -- 's/^{//' $files[-1]
esac > result.json
(If you want the result on a single line, insert |paste -sd '\0' - before the >.)
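For reference, paste -sd '\0' - joins all input lines using an empty delimiter. A small sketch with made-up lines of the kind the sed commands above produce:

```shell
# Two lines of the kind the sed commands above would emit (made-up data).
joined=$(printf '%s\n' '{"k1":["1-2"],' '"k2":["3-4"]}' | paste -sd '\0' -)
echo "$joined"    # {"k1":["1-2"],"k2":["3-4"]}
```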
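Or concatenate them first and perform the substitutions on the result, here using syntax compatible with ksh (at least those variants with a builtin printf), zsh, yash or bash, but assuming GNU xargs or compatible: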
printf '%s\0' "${files[@]}" |
xargs -r0 cat -- |
sed '1!s/^{//; $!s/}$/,/' > result.json
This assumes that the input JSON files have properly delimited lines, that is, each file ends with a newline character.
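The following sketch runs the same sed expression on three made-up one-line files, then repeats it after removing the trailing newline from one of them to show why that assumption matters. The file names and contents are placeholders.

```shell
dir=$(mktemp -d)
printf '%s\n' '{"k1":["1-2"]}' > "$dir/1.json"
printf '%s\n' '{"k2":["3-4"]}' > "$dir/2.json"
printf '%s\n' '{"k3":["5-6"]}' > "$dir/3.json"

# Every file ends in a newline, so each object sits on its own line.
good=$(cat "$dir/1.json" "$dir/2.json" "$dir/3.json" |
  sed '1!s/^{//; $!s/}$/,/')
printf '%s\n' "$good"
# {"k1":["1-2"],
# "k2":["3-4"],
# "k3":["5-6"]}

# Rewrite the second file without its trailing newline: the second and
# third objects now share a line, and the result is no longer valid JSON.
printf '%s' '{"k2":["3-4"]}' > "$dir/2.json"
bad=$(cat "$dir/1.json" "$dir/2.json" "$dir/3.json" |
  sed '1!s/^{//; $!s/}$/,/')
printf '%s\n' "$bad"
# {"k1":["1-2"],
# "k2":["3-4"]}{"k3":["5-6"]}

rm -rf "$dir"
```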