Joining multiple lines after a pattern match

I have a file containing data like this:

STUDENT DETAILS
NAME MARKS STD
XYZ 20 I
RANK SCHOOL TEACHER GRADE
5 TTT ANON B
POSITION
5
STUDENT DETAILS
NAME MARKS STD
ABC              40                I
RANK SCHOOL TEACHER GRADE
5 TTT ANON A
POSITION
5

I would like my output to be:

NAME MARKS STD RANK SCHOOL TEACHER GRADE POSITION
XYZ  20     I   5    TTT   ANON    B     5
ABC  40     I   5    TTT   ANON    A     5

I tried using awk to find the pattern STUDENT DETAILS and print the second, fifth, and eighth lines. But the lines need to be joined.

I ran:

awk '/STUDENT DETAILS/{nr[NR];nr[NR+2]; nr[NR+5]; nr[NR+8]}; END {for (i in nr) print nr[i]}' file.txt > filenew.txt

How can I achieve this?

Answer 1

You can't look ahead in awk; you have to remember where the pattern matched.

The awk file (called u.awk below):

/STUDENT/ { li=NR;}
NR == li+2 { mark[li]=$0 }
NR == li+4 { pos[li]=$0 }
END { for (m in mark) printf "%s %s\n",mark[m],pos[m] ;}

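where

  • /STUDENT/ { li=NR;} remembers the line number on which the record starts
  • NR == li+2 { mark[li]=$0 } stores the marks line when the current line is two lines past the start (pos works the same way at +4)

When run on the sample data (I removed the blank lines; adjust +2/+4 if the actual file contains them), this gives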

awk -f u.awk liste-1.txt

XYZ 20 I 5 TTT ANON B
ABC              40                I 5 TTT ANON A

Generating the header is left out.
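
The same "remember where the record started" idea can be stretched to pick up the POSITION value and to print the header once. What follows is only a minimal sketch, not the answerer's script. It assumes every record is exactly seven lines laid out as in the sample, with no blank lines. The temporary input file and the variable names (start, off, head, data, printed) are made up for illustration. It also prints records in input order, which a for (m in mark) loop does not guarantee.

```shell
# Recreate the sample input in a temporary file (illustration only).
input=$(mktemp)
cat > "$input" <<'EOF'
STUDENT DETAILS
NAME MARKS STD
XYZ 20 I
RANK SCHOOL TEACHER GRADE
5 TTT ANON B
POSITION
5
STUDENT DETAILS
NAME MARKS STD
ABC              40                I
RANK SCHOOL TEACHER GRADE
5 TTT ANON A
POSITION
5
EOF

# Assumption: a STUDENT line is followed by label/value pairs at
# offsets +1/+2, +3/+4 and +5/+6.
result=$(awk '
    /^STUDENT/ { start = NR; next }      # remember where the record starts
    start == 0 { next }                  # skip anything before the first record
    { $1 = $1; off = NR - start }        # squeeze runs of blanks; offset within record
    off % 2 == 1 { head = head (off > 1 ? " " : "") $0 }   # label lines
    off % 2 == 0 { data = data (off > 2 ? " " : "") $0 }   # value lines
    off == 6 {
        if (!printed++) print head       # emit the header only once
        print data
        head = data = ""
    }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

On the sample data this prints the combined header followed by one joined line per student; piping the result through column -t would align the columns.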

Answer 2

If you pre-split the data into records, all that remains is to print the relevant fields:

# Pre-splitting
sed '/^STUDENT/ { 1!s/^/\n/; }' infile |

# Reorder the record:
awk -v RS= -v FS='\n' '
  NR == 1 { print $2, $4, $6 }
          { print $3, $5, $7 }'        |

# Pretty-print columns
column -t

Output:

NAME  MARKS  STD  RANK  SCHOOL  TEACHER  GRADE  POSITION
XYZ   20     I    5     TTT     ANON     B      5
ABC   40     I    5     TTT     ANON     A      5
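
The pre-splitting step depends on sed turning \n in the replacement into a newline, which GNU sed does but, as far as I know, not every sed. As a hedged sketch of the same two-step idea, the blank separator lines can be inserted with awk instead. The temporary input file is made up for illustration. With RS set to the empty string awk reads blank-line-separated paragraphs as records, and FS='\n' makes each line of a paragraph one field.

```shell
# Recreate the sample input in a temporary file (illustration only).
input=$(mktemp)
cat > "$input" <<'EOF'
STUDENT DETAILS
NAME MARKS STD
XYZ 20 I
RANK SCHOOL TEACHER GRADE
5 TTT ANON B
POSITION
5
STUDENT DETAILS
NAME MARKS STD
ABC              40                I
RANK SCHOOL TEACHER GRADE
5 TTT ANON A
POSITION
5
EOF

# Step 1: print an empty line before every STUDENT line except the first.
# Step 2: paragraph mode; $1 is the STUDENT line, $2..$7 are the six lines
#         after it, so even fields are labels and odd fields are values.
result=$(
    awk '/^STUDENT/ && NR > 1 { print "" } { print }' "$input" |
    awk -v RS= -v FS='\n' '
        NR == 1 { print $2, $4, $6 }
                { print $3, $5, $7 }'
)

printf '%s\n' "$result"
rm -f "$input"
```

The fields are printed unchanged, so the irregular spacing of the ABC line survives; appending | column -t, as in the answer, tidies that up.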

Answer 3

awk '
    BEGIN { OFS="\t"; maxLines=7 }
    { lineNr=(NR-1) % maxLines + 1; $1=$1; lines[lineNr]=$0 }
    NR == maxLines     { print lines[2], lines[4], lines[6] }
    lineNr == maxLines { print lines[3], lines[5], lines[7] }
' file

Output:

NAME    MARKS   STD     RANK    SCHOOL  TEACHER GRADE   POSITION
XYZ     20      I       5       TTT     ANON    B       5
ABC     40      I       5       TTT     ANON    A       5

Answer 4

I tested the script below (it needs bash and GNU sed) and it worked fine.

STEP1:

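# Build the header from the odd-numbered (label) lines, keeping first-seen order;
# sort | uniq would place POSITION before RANK.
header=$(sed '/STUDENT/d' o.txt | sed -n '1~2p' | awk '!seen[$0]++' | sed "N;s/\n/ /g" | sed "N;s/\n/ /g")

# Count the remaining lines, then drop the STUDENT lines from the file in place.
count=$(sed '/STUDENT/d' o.txt | wc -l)
sed -i '/STUDENT/d' o.txt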

STEP2:
for ((i=1; i<=count; i++)); do
    j=$((i+5))
    # Take one 6-line record, keep its even-numbered (value) lines, squeeze blanks, join them.
    sed -n "${i},${j}p" o.txt | sed -n '2~2p' | sed -r "s/\s+/ /g" | sed "N;s/\n/ /g" | sed "N;s/\n/ /g"
    i=$j
done | awk -v header="$header" 'BEGIN{print header}{print $0}' | sed "s/ /\t/g"
