I have a file containing data like this:
STUDENT DETAILS
NAME MARKS STD
XYZ 20 I
RANK SCHOOL TEACHER GRADE
5 TTT ANON B
POSITION
5
STUDENT DETAILS
NAME MARKS STD
ABC 40 I
RANK SCHOOL TEACHER GRADE
5 TTT ANON A
POSITION
5
I want my output to be:
NAME MARKS STD RANK SCHOOL TEACHER GRADE POSITION
XYZ 20 I 5 TTT ANON B 5
ABC 40 I 5 TTT ANON A 5
I tried using awk to find the pattern STUDENT DETAILS and print the second, fifth, and eighth lines. But the lines need to be joined together.
I ran:
awk '/STUDENT DETAILS/{nr[NR];nr[NR+2]; nr[NR+5]; nr[NR+8]}; END {for (i in nr) print nr[i]}' file.txt > filenew.txt
How can I achieve this?
Answer 1
You cannot look ahead in awk; you have to remember where the pattern matched.
awk file (u.awk below)
/STUDENT/ { li=NR;}
NR == li+2 { mark[li]=$0 }
NR == li+4 { pos[li]=$0 }
END { for (m in mark) printf "%s %s\n",mark[m],pos[m] ;}
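where
/STUDENT/ { li=NR;}
remembers the line on which the record starts, and
NR == li+2 { mark[li]=$0 }
remembers the marks line when the current line is at offset +2 (likewise for pos at +4).
When run on the sample data (I removed the blank lines; adjust the +2/+4 offsets if the real file contains them), this gives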
awk -f u.awk liste-1.txt
XYZ 20 I 5 TTT ANON B
ABC 40 I 5 TTT ANON A
Generation of the header is omitted.
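The same "remember where the record started" idea can be extended to also pick up the header lines and the POSITION value, which the script above leaves out. The following is only a sketch under the assumption that every record is exactly seven lines with no blank lines in between; the file name students.txt and the variable names (start, h1, h2, h3, name, rank, printed) are made up for illustration. Printing each record as soon as its last line is seen also avoids relying on the unspecified order of for (m in mark).

```shell
# Sample input recreated from the question (hypothetical file name).
cat > students.txt <<'EOF'
STUDENT DETAILS
NAME MARKS STD
XYZ 20 I
RANK SCHOOL TEACHER GRADE
5 TTT ANON B
POSITION
5
STUDENT DETAILS
NAME MARKS STD
ABC 40 I
RANK SCHOOL TEACHER GRADE
5 TTT ANON A
POSITION
5
EOF

result=$(awk '
/^STUDENT DETAILS/ { start = NR; next }   # remember where the record starts
start == 0         { next }               # skip anything before the first record
NR == start + 1    { h1 = $0 }            # header line "NAME MARKS STD"
NR == start + 2    { name = $0 }          # data line for name/marks/std
NR == start + 3    { h2 = $0 }            # header line "RANK SCHOOL TEACHER GRADE"
NR == start + 4    { rank = $0 }          # data line for rank/school/teacher/grade
NR == start + 5    { h3 = $0 }            # header line "POSITION"
NR == start + 6    {                      # position value: the record is complete
    if (!printed++) print h1, h2, h3      # emit the joined header only once
    print name, rank, $0
}
' students.txt)
printf '%s\n' "$result"
```

If the real file has blank lines between the lines of a record, the offsets would need adjusting in the same way as the +2/+4 above.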
Answer 2
If you pre-split the data into records, you only need to print the relevant fields:
# Pre-splitting
sed '/^STUDENT/ { 1!s/^/\n/; }' infile |
# Reorder the record:
awk -v RS= -v FS='\n' '
NR == 1 { print $2, $4, $6 }
{ print $3, $5, $7 }' |
# Pretty-print columns
column -t
Output:
NAME MARKS STD RANK SCHOOL TEACHER GRADE POSITION
XYZ 20 I 5 TTT ANON B 5
ABC 40 I 5 TTT ANON A 5
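To see why the field numbers work out, it may help to look at what awk receives in paragraph mode. With RS set to the empty string, records are separated by blank lines, and with FS set to a newline each line of a record becomes one field. The snippet below is a small illustration on a shortened, made-up input (three lines per record instead of seven), not part of the pipeline above:

```shell
# Two blank-line-separated records, as the sed step would produce them.
result=$(printf '%s\n' \
    'STUDENT DETAILS' 'NAME MARKS STD' 'XYZ 20 I' \
    '' \
    'STUDENT DETAILS' 'NAME MARKS STD' 'ABC 40 I' |
awk -v RS= -v FS='\n' '
    # Print every field with its record and field number.
    { for (i = 1; i <= NF; i++) printf "record %d, field %d: %s\n", NR, i, $i }
')
printf '%s\n' "$result"
```

In the full seven-line records, $1 is the STUDENT DETAILS line, the even-numbered fields are the headers, and the odd-numbered fields from $3 onward are the values, which is what the two print statements rely on.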
Answer 3
awk '
BEGIN { OFS="\t"; maxLines=7 }
{ lineNr=(NR-1) % maxLines + 1; $1=$1; lines[lineNr]=$0 }
NR == maxLines { print lines[2], lines[4], lines[6] }
lineNr == maxLines { print lines[3], lines[5], lines[7] }
' file
NAME MARKS STD RANK SCHOOL TEACHER GRADE POSITION
XYZ 20 I 5 TTT ANON B 5
ABC 40 I 5 TTT ANON A 5
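The expression (NR-1) % maxLines + 1 maps the running line number onto a position from 1 to 7 inside the current record, so line 8 of the file is treated as line 1 of the second record. A quick way to check the arithmetic, as a sketch that needs no input file (the loop variable nr is just for illustration):

```shell
result=$(awk 'BEGIN {
    maxLines = 7
    # Show the within-record position for the first two records (14 lines).
    for (nr = 1; nr <= 14; nr++)
        printf "NR=%d lineNr=%d\n", nr, (nr - 1) % maxLines + 1
}')
printf '%s\n' "$result"
```

This only works when every record has exactly maxLines lines; a stray blank line would shift all later records.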
Answer 4
Tested with the script below and it worked fine (it needs GNU sed and bash, and it modifies o.txt in place).
STEP1:
header=$(sed '/STUDENT/d' o.txt | sed -n '1~2p' | head -n 3 | sed "N;s/\n/ /g" | sed "N;s/\n/ /g")
count=`sed '/STUDENT/d' o.txt|wc -l`
sed -i '/STUDENT/d' o.txt
STEP2:
for ((i=1; i<=count; i+=6)); do
  # Take one 6-line record, keep its even (data) lines, squeeze whitespace, join into one line
  sed -n "${i},$((i+5))p" o.txt | sed -n '2~2p' | sed -r 's/\s+/ /g' | sed 'N;s/\n/ /g' | sed 'N;s/\n/ /g'
done | awk -v header="$header" 'BEGIN { print header } { print $0 }' | sed 's/ /\t/g'