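I'm currently working through a course that has us submit code to an autograder, which then returns our results. The format it returns is hard to parse visually, so I'd like to write a script I can use in a pipeline to make it easier to read.
Here is the autograder's output: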
Problem,Correct?,Correct Answer,Agent's Answer
"Challenge Problem B-04",0,4,-1
"Basic Problem B-12",0,1,-1
"Challenge Problem B-05",0,6,-1
"Challenge Problem B-07",0,6,-1
"Challenge Problem B-06",0,3,-1
"Basic Problem B-11",0,1,-1
"Basic Problem B-10",0,3,-1
"Challenge Problem B-03",0,3,-1
"Challenge Problem B-02",0,1,-1
"Challenge Problem B-01",0,6,-1
"Challenge Problem B-09",0,4,-1
"Challenge Problem B-08",0,4,-1
"Basic Problem B-08",0,6,-1
"Basic Problem B-09",0,5,-1
"Basic Problem B-04",0,3,-1
"Basic Problem B-05",0,4,-1
"Basic Problem B-06",0,5,-1
"Basic Problem B-07",0,6,-1
"Basic Problem B-01",0,2,-1
"Basic Problem B-02",0,5,-1
"Basic Problem B-03",0,1,-1
"Challenge Problem B-10",0,4,-1
"Challenge Problem B-11",0,5,-1
"Challenge Problem B-12",0,1,-1
{
    "Basic Problems B": {
        "Incorrect": "0",
        "Skipped": "12",
        "Correct": "0",
        "Set": "Basic Problems B"
    },
    "Challenge Problems B": {
        "Incorrect": "0",
        "Skipped": "12",
        "Correct": "0",
        "Set": "Challenge Problems B"
    }
}
It's a mix of comma-separated values and JSON. It would be nice to get all of this into a neat table that I can read.
Right now I have something like this:
python submit.py --provider gt --assignment error-check | column -t -s, | less -S
which outputs:
{
"Basic Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Basic Problems B"
},
"Challenge Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Challenge Problems B"
}
}
Problem Correct? Correct Answer Agent's Answer
"Challenge Problem B-04" 0 4 -1
"Basic Problem B-12" 0 1 -1
"Challenge Problem B-05" 0 6 -1
"Challenge Problem B-07" 0 6 -1
"Challenge Problem B-06" 0 3 -1
"Basic Problem B-11" 0 1 -1
"Basic Problem B-10" 0 3 -1
"Challenge Problem B-03" 0 3 -1
"Challenge Problem B-02" 0 1 -1
"Challenge Problem B-01" 0 6 -1
"Challenge Problem B-09" 0 4 -1
"Challenge Problem B-08" 0 4 -1
"Basic Problem B-08" 0 6 -1
"Basic Problem B-09" 0 5 -1
"Basic Problem B-04" 0 3 -1
"Basic Problem B-05" 0 4 -1
"Basic Problem B-06" 0 5 -1
"Basic Problem B-07" 0 6 -1
"Basic Problem B-01" 0 2 -1
"Basic Problem B-02" 0 5 -1
"Basic Problem B-03" 0 1 -1
"Challenge Problem B-10" 0 4 -1
"Challenge Problem B-11" 0 5 -1
"Challenge Problem B-12" 0 1 -1
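That gets me most of the way there. Now I'm wondering whether there is a way to handle the JSON as well.
I can't rely on splitting the output at a fixed line number, but I think I can split at the first occurrence of {.
I'd like to do this with as little as possible so that I can share it with classmates, so the fewer dependencies the better.
I've seen other JSON-parsing posts that recommend external tools.
The ideal output would look like this: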
Problem                   Correct?  Correct Answer  Agent's Answer
"Challenge Problem B-04"  0         4               -1
"Basic Problem B-12"      0         1               -1
"Challenge Problem B-05"  0         6               -1
"Challenge Problem B-07"  0         6               -1
"Challenge Problem B-06"  0         3               -1
"Basic Problem B-11"      0         1               -1
"Basic Problem B-10"      0         3               -1
"Challenge Problem B-03"  0         3               -1
"Challenge Problem B-02"  0         1               -1
"Challenge Problem B-01"  0         6               -1
"Challenge Problem B-09"  0         4               -1
"Challenge Problem B-08"  0         4               -1
"Basic Problem B-08"      0         6               -1
"Basic Problem B-09"      0         5               -1
"Basic Problem B-04"      0         3               -1
"Basic Problem B-05"      0         4               -1
"Basic Problem B-06"      0         5               -1
"Basic Problem B-07"      0         6               -1
"Basic Problem B-01"      0         2               -1
"Basic Problem B-02"      0         5               -1
"Basic Problem B-03"      0         1               -1
"Challenge Problem B-10"  0         4               -1
"Challenge Problem B-11"  0         5               -1
"Challenge Problem B-12"  0         1               -1
Set                   Incorrect  Skipped  Correct
Basic Problems B      0          12       0
Challenge Problems B  0          12       0
Answer 1
Separating the JSON from the rest is easy. This gives you only the non-JSON part:
python submit.py --provider gt --assignment error-check | sed '/{/,$d'
And this gives you only the JSON:
python submit.py --provider gt --assignment error-check | sed -n '/{/,$p'
To illustrate, I saved your example input as file. Then:
$ sed '/{/,$d' file
Problem,Correct?,Correct Answer,Agent's Answer
"Challenge Problem B-04",0,4,-1
"Basic Problem B-12",0,1,-1
"Challenge Problem B-05",0,6,-1
"Challenge Problem B-07",0,6,-1
"Challenge Problem B-06",0,3,-1
"Basic Problem B-11",0,1,-1
"Basic Problem B-10",0,3,-1
"Challenge Problem B-03",0,3,-1
"Challenge Problem B-02",0,1,-1
"Challenge Problem B-01",0,6,-1
"Challenge Problem B-09",0,4,-1
"Challenge Problem B-08",0,4,-1
"Basic Problem B-08",0,6,-1
"Basic Problem B-09",0,5,-1
"Basic Problem B-04",0,3,-1
"Basic Problem B-05",0,4,-1
"Basic Problem B-06",0,5,-1
"Basic Problem B-07",0,6,-1
"Basic Problem B-01",0,2,-1
"Basic Problem B-02",0,5,-1
"Basic Problem B-03",0,1,-1
"Challenge Problem B-10",0,4,-1
"Challenge Problem B-11",0,5,-1
"Challenge Problem B-12",0,1,-1
and
$ sed -n '/{/,$p' file
{
    "Basic Problems B": {
        "Incorrect": "0",
        "Skipped": "12",
        "Correct": "0",
        "Set": "Basic Problems B"
    },
    "Challenge Problems B": {
        "Incorrect": "0",
        "Skipped": "12",
        "Correct": "0",
        "Set": "Challenge Problems B"
    }
}
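For anyone who would rather stay in Python (which submit.py already requires), here is a minimal sketch of the same split. It mirrors the two sed commands above: everything before the first line containing { is treated as CSV, and the rest as JSON. The name split_output and the shortened sample are my own, for illustration only. Like the sed version, it would split too early if a CSV field ever contained a {.

```python
def split_output(text):
    """Split grader output into (csv_part, json_part).

    The split happens at the first line that contains "{", which is the
    same rule the sed address /{/ uses.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if "{" in line:
            return "\n".join(lines[:i]), "\n".join(lines[i:])
    # No "{" anywhere: treat the whole input as CSV.
    return "\n".join(lines), ""


# Shortened, made-up sample in the same shape as the grader output.
sample = '''Problem,Correct?,Correct Answer,Agent's Answer
"Basic Problem B-12",0,1,-1
{
    "Basic Problems B": {
        "Skipped": "12"
    }
}'''

csv_part, json_part = split_output(sample)
print(csv_part)
print("-----")
print(json_part)
```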
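Now, you already handle the non-JSON part well, so I won't change that. Ideally, JSON data should be parsed with a JSON parser such as jq. Sadly, I don't know jq well enough to do this properly, so the best I could come up with is this rather inelegant solution. The sed call strips commas, quotes and leading spaces, and tac reverses the lines so that each set's Set key is read before its counts. The data[set][key] arrays of arrays require GNU awk. At least it does what you want (replace cat file with your python submit.py --provider gt --assignment error-check command):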
$ cat file | sed -n 's/[,"]//g; s/^ *//; /{/,$p' | tac | awk -F': ' 'BEGIN{printf "%-30s%-10s%-10s%-10s\n", "Set", "Incorrect", "Skipped", "Correct"} NF==2 && !/\{/{if($1=="Set"){set=$2;data[set]["Incorrect"] = 0;data[set]["Skipped"] = 0;data[set]["Correct"] = 0;} data[set][$1]=$2}END{for(set in data){printf "%-30s%-10s%-10s%-10s\n", set,data[set]["Incorrect"],data[set]["Skipped"],data[set]["Correct"]}}'
Set                           Incorrect Skipped   Correct
Challenge Problems B          0         12        0
Basic Problems B              0         12        0
Putting all of this into a shell script gives:
#!/bin/bash
# Save the grader output so that it can be read twice.
tmpFile=$(mktemp)
python submit.py --provider gt --assignment error-check > "$tmpFile"

# CSV part: every line before the first "{", aligned into columns.
sed '/{/,$d' "$tmpFile" | column -t -s,

# JSON part: clean up, reverse, and tabulate the summary.
sed -n 's/[,"]//g; s/^ *//; /{/,$p' "$tmpFile" |
    tac |
    awk -F': ' '
        BEGIN{
            printf "%-30s%-10s%-10s%-10s\n", "Set", "Incorrect", "Skipped", "Correct"
        }
        NF==2 && !/\{/{
            if($1=="Set"){
                set=$2;
                data[set]["Incorrect"] = 0;
                data[set]["Skipped"] = 0;
                data[set]["Correct"] = 0;
            }
            data[set][$1]=$2
        }
        END{
            for(set in data){
                printf "%-30s%-10s%-10s%-10s\n", set,
                    data[set]["Incorrect"],
                    data[set]["Skipped"],
                    data[set]["Correct"]
            }
        }'
rm "$tmpFile"
which produces the following output:
$ foo.sh
Problem                   Correct?  Correct Answer  Agent's Answer
"Challenge Problem B-04"  0         4               -1
"Basic Problem B-12"      0         1               -1
"Challenge Problem B-05"  0         6               -1
"Challenge Problem B-07"  0         6               -1
"Challenge Problem B-06"  0         3               -1
"Basic Problem B-11"      0         1               -1
"Basic Problem B-10"      0         3               -1
"Challenge Problem B-03"  0         3               -1
"Challenge Problem B-02"  0         1               -1
"Challenge Problem B-01"  0         6               -1
"Challenge Problem B-09"  0         4               -1
"Challenge Problem B-08"  0         4               -1
"Basic Problem B-08"      0         6               -1
"Basic Problem B-09"      0         5               -1
"Basic Problem B-04"      0         3               -1
"Basic Problem B-05"      0         4               -1
"Basic Problem B-06"      0         5               -1
"Basic Problem B-07"      0         6               -1
"Basic Problem B-01"      0         2               -1
"Basic Problem B-02"      0         5               -1
"Basic Problem B-03"      0         1               -1
"Challenge Problem B-10"  0         4               -1
"Challenge Problem B-11"  0         5               -1
"Challenge Problem B-12"  0         1               -1
Set                           Incorrect Skipped   Correct
Challenge Problems B          0         12        0
Basic Problems B              0         12        0
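This feels hacky, though, and I hope someone can come up with a cleaner solution that uses a dedicated JSON parser.
steeldriver kindly provided a proper jq solution in the comments, so incorporating it gives a simpler (and safer) solution: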
#!/bin/bash
tmpFile=$(mktemp)
python submit.py --provider gt --assignment error-check > "$tmpFile";
sed '/{/,$d' "$tmpFile" | column -t -s,
sed -n '/{/,$p' "$tmpFile" |
jq -r '["Set","Incorrect","Skipped","Correct"], (.[] | [.Set,.Incorrect,.Skipped,.Correct]) | @tsv'
rm "$tmpFile"
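If jq isn't installed, Python's standard json module can do the same job as the jq filter above. This is only a sketch, assuming the JSON part has already been separated from the CSV and that every set carries the four keys shown in the question. summary_rows is a name I made up for illustration.

```python
import json


def summary_rows(json_text):
    """Return a header row plus one row per problem set."""
    header = ["Set", "Incorrect", "Skipped", "Correct"]
    data = json.loads(json_text)
    rows = [header]
    # Each top-level value is one set; pick its fields in header order.
    for entry in data.values():
        rows.append([str(entry[key]) for key in header])
    return rows


# The JSON part of the grader output from the question.
sample_json = '''{
    "Basic Problems B": {
        "Incorrect": "0", "Skipped": "12", "Correct": "0",
        "Set": "Basic Problems B"
    },
    "Challenge Problems B": {
        "Incorrect": "0", "Skipped": "12", "Correct": "0",
        "Set": "Challenge Problems B"
    }
}'''

# Tab-separated, like jq's @tsv.
for row in summary_rows(sample_json):
    print("\t".join(row))
```

The output is tab-separated, so it can be piped through column -t -s$'\t' if aligned columns are wanted.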
Answer 2
Using Miller (https://github.com/johnkerl/miller), run:
# get the CSV and transform it into a pretty print table
<input grep -P '^("|\w)' | mlr --c2p cat >out
# add a blank line between the two tables
echo "" >> out
# convert the json into a pretty print table and add it to the output
<input grep -vP '^("|\w)' | mlr --j2p cat -n then reshape -r "(Basi|Chal)" -o i,v \
then nest --explode --values --across-fields --nested-fs ":" -f i \
then reshape -s i_2,v \
then cut -x -f i_1,n \
then reorder -f Set >>out
and you will get:
Problem Correct? Correct Answer Agent's Answer
Challenge Problem B-04 0 4 -1
Basic Problem B-12 0 1 -1
Challenge Problem B-05 0 6 -1
Challenge Problem B-07 0 6 -1
Challenge Problem B-06 0 3 -1
Basic Problem B-11 0 1 -1
Basic Problem B-10 0 3 -1
Challenge Problem B-03 0 3 -1
Challenge Problem B-02 0 1 -1
Challenge Problem B-01 0 6 -1
Challenge Problem B-09 0 4 -1
Challenge Problem B-08 0 4 -1
Basic Problem B-08 0 6 -1
Basic Problem B-09 0 5 -1
Basic Problem B-04 0 3 -1
Basic Problem B-05 0 4 -1
Basic Problem B-06 0 5 -1
Basic Problem B-07 0 6 -1
Basic Problem B-01 0 2 -1
Basic Problem B-02 0 5 -1
Basic Problem B-03 0 1 -1
Challenge Problem B-10 0 4 -1
Challenge Problem B-11 0 5 -1
Challenge Problem B-12 0 1 -1
Set Incorrect Skipped Correct
Basic Problems B 0 12 0
Challenge Problems B 0 12 0
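For comparison, roughly what the first mlr --c2p step does can be approximated with Python's standard library: parse the CSV with the csv module (which also drops the quotes around the problem names) and pad each column to its widest cell. This is a sketch only, not Miller's actual algorithm. format_table and the two-space gap between columns are my own choices.

```python
import csv
import io


def format_table(rows):
    """Left-align every column, padded to its widest cell."""
    rows = [[str(cell) for cell in row] for row in rows]
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    widths = [0] * ncols
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))
    lines = []
    for row in rows:
        padded = [cell.ljust(widths[i]) for i, cell in enumerate(row)]
        # rstrip() drops the padding after the last column.
        lines.append("  ".join(padded).rstrip())
    return "\n".join(lines)


# A few CSV lines taken from the question.
sample_csv = '''Problem,Correct?,Correct Answer,Agent's Answer
"Challenge Problem B-04",0,4,-1
"Basic Problem B-12",0,1,-1
'''

rows = list(csv.reader(io.StringIO(sample_csv)))
table = format_table(rows)
print(table)
```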
Answer 3
$ cat tst.awk
BEGIN { FS=","; OFS="\t" }
# From the first "{" onward, split lines on the quotes around keys and values.
/{/ { FS="(^|\":)[[:space:]]+\"|\",?" }
# CSV section: rebuild the record with tabs between the fields.
FS == "," { $1=$1; print; next }
# JSON section: remember each key/value pair.
{ f[$2] = $3 }
# End of a set: print its row, then clear f so that the outermost "}"
# does not print the last set a second time.
/}/ && ("Set" in f) {
    if ( !doneHdr++ ) {
        print "Set", "Incorrect", "Skipped", "Correct"
    }
    print f["Set"], f["Incorrect"], f["Skipped"], f["Correct"]
    delete f
}

$ awk -f tst.awk file | column -s$'\t' -t
Problem                   Correct?   Correct Answer  Agent's Answer
"Challenge Problem B-04"  0          4               -1
"Basic Problem B-12"      0          1               -1
"Challenge Problem B-05"  0          6               -1
"Challenge Problem B-07"  0          6               -1
"Challenge Problem B-06"  0          3               -1
"Basic Problem B-11"      0          1               -1
"Basic Problem B-10"      0          3               -1
"Challenge Problem B-03"  0          3               -1
"Challenge Problem B-02"  0          1               -1
"Challenge Problem B-01"  0          6               -1
"Challenge Problem B-09"  0          4               -1
"Challenge Problem B-08"  0          4               -1
"Basic Problem B-08"      0          6               -1
"Basic Problem B-09"      0          5               -1
"Basic Problem B-04"      0          3               -1
"Basic Problem B-05"      0          4               -1
"Basic Problem B-06"      0          5               -1
"Basic Problem B-07"      0          6               -1
"Basic Problem B-01"      0          2               -1
"Basic Problem B-02"      0          5               -1
"Basic Problem B-03"      0          1               -1
"Challenge Problem B-10"  0          4               -1
"Challenge Problem B-11"  0          5               -1
"Challenge Problem B-12"  0          1               -1
Set                       Incorrect  Skipped         Correct
Basic Problems B          0          12              0
Challenge Problems B      0          12              0
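The same single-pass idea can be sketched with Python's standard library, for classmates who have neither jq nor Miller installed: CSV lines are collected until the first line containing {, and everything from there on is handed to a real JSON parser. prettify and render are hypothetical names, and the sketch assumes that every CSV row has the same number of fields.

```python
import csv
import json


def render(rows):
    """Pad each column to its widest cell; return a list of text lines."""
    widths = [max(len(r[i]) for r in rows) for i in range(len(rows[0]))]
    return ["  ".join(c.ljust(w) for c, w in zip(r, widths)).rstrip()
            for r in rows]


def prettify(lines):
    """Turn mixed CSV + JSON grader output into two aligned tables."""
    csv_lines, json_lines = [], []
    in_json = False
    for line in lines:
        # Switch to JSON mode at the first line containing "{".
        if not in_json and "{" in line:
            in_json = True
        (json_lines if in_json else csv_lines).append(line)

    out = []
    results = [row for row in csv.reader(csv_lines) if row]
    if results:
        out += render(results)
    if json_lines:
        header = ["Set", "Incorrect", "Skipped", "Correct"]
        sets = json.loads("\n".join(json_lines)).values()
        summary = [header] + [[str(s.get(k, "")) for k in header] for s in sets]
        # Blank line between the two tables.
        out += [""] + render(summary)
    return "\n".join(out)


# Shortened sample in the same shape as the grader output.
sample = '''Problem,Correct?,Correct Answer,Agent's Answer
"Challenge Problem B-04",0,4,-1
"Basic Problem B-12",0,1,-1
{
    "Basic Problems B": {"Incorrect": "0", "Skipped": "12", "Correct": "0", "Set": "Basic Problems B"},
    "Challenge Problems B": {"Incorrect": "0", "Skipped": "12", "Correct": "0", "Set": "Challenge Problems B"}
}'''

pretty = prettify(sample.splitlines())
print(pretty)
```

To use it as a filter, pass sys.stdin to prettify instead of the sample and pipe the grader into it, for example python submit.py --provider gt --assignment error-check | python pretty.py (pretty.py being whatever name you give the script).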