How to join columns across multiple rows using awk

I have a CSV file like the following:

col1,col2,col3,col4,col5

1,val1,57,val1,TRUE
,val2,,val2,    
,val3,,val3,    
,val4,,val4,    
,val5,,val5,    
2,val1,878,val1,FALSE
,val2,,val2,    
,val3,,val3,    
,val4,,val4,    
,val5,,val5,

I need to use awk to produce the following output:

col1,col2,col3,col4,col5

1,val1#val2#val3#val4#val5,57,val1#val2#val3#val4#val5,TRUE
2,val1#val2#val3#val4#val5,878,val1#val2#val3#val4#val5,FALSE

Answer 1

Keeping it simple, readable, and portable (mostly because I don't have that much awk experience, heh):

BEGIN { FS=","; OFS="," }

NR < 3 { 
    print  # just echo header and separator lines
}

/^[0-9]/ { 
    if (NR > 3) {
        # concatenate all parts (note: csv because of OFS not the commas here)
        print part1,part2,part3
    }

    part1 = $1 OFS $2
    part2 = $3 OFS $4
    part3 = $5
}

/^,/ {
    part1 = part1 "#" $2
    part2 = part2 "#" $4
}

END { print part1,part2,part3 }

Result:

col1,col2,col3,col4,col5

1,val1#val2#val3#val4#val5,57,val1#val2#val3#val4#val5,TRUE
2,val1#val2#val3#val4#val5,878,val1#val2#val3#val4#val5,FALSE
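
The script above is shown without an invocation line. A minimal sketch of one way to run it, assuming it is saved under the hypothetical name join.awk and fed a shortened copy of the sample data (the temporary directory and file names are illustrative, not part of the original answer):

```shell
# Hypothetical file names in a scratch directory; adjust as needed.
tmp=$(mktemp -d)

# The awk program from the answer, saved to a file.
cat > "$tmp/join.awk" <<'EOF'
BEGIN { FS=","; OFS="," }
NR < 3 { print }                      # header and blank separator line
/^[0-9]/ {
    if (NR > 3) print part1,part2,part3
    part1 = $1 OFS $2
    part2 = $3 OFS $4
    part3 = $5
}
/^,/ {
    part1 = part1 "#" $2
    part2 = part2 "#" $4
}
END { print part1,part2,part3 }
EOF

# A shortened version of the sample input (header, blank line, two records).
cat > "$tmp/input.csv" <<'EOF'
col1,col2,col3,col4,col5

1,val1,57,val1,TRUE
,val2,,val2,
,val3,,val3,
2,val1,878,val1,FALSE
,val2,,val2,
EOF

# -f reads the program from a file instead of the command line.
result=$(awk -f "$tmp/join.awk" "$tmp/input.csv")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Note that the script relies on the header being followed by exactly one blank line, so that the first record is line 3 of the input.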

Answer 2

A more complex awk solution that handles any number of fields:

awk 'function pr(a, len){    # print an integral(joined) line
         for (i=1; i<=len; i++) printf "%s%s",a[i],(i==len? ORS : ",") 
     }
     NR<3;{ f=($0~/^[0-9]+,/? 1:0) }  # `f` flag, points to a `basic` line(starting with digit)
     NR>2{ 
         if (f && a[1]){ pr(a, len) } 
         len=(f? split($0,a,",") : split($0,b,","));  # split into different arrays
         if (!f) {                                    # encountering subsequent line
             for(i=2; i<=len; i++) 
                 if(b[i] !~ /^[ \t]*$/) a[i]=a[i]"#"b[i]  # append non-blank values to `basic` line (skips empty or whitespace-only fields)
         } 
     }END{ pr(a, len) }' file

Output:

col1,col2,col3,col4,col5

1,val1#val2#val3#val4#val5,57,val1#val2#val3#val4#val5,TRUE
2,val1#val2#val3#val4#val5,878,val1#val2#val3#val4#val5,FALSE
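
The core of the script above is that split() returns the number of pieces it produced, and that empty fields on a continuation line are skipped rather than appended. A small sketch isolating just that step, using a made-up two-line input instead of the sample file:

```shell
result=$(printf '%s\n' '1,a,57' ',b,' | awk '
    NR == 1 { n = split($0, a, ",") }            # base line: fills a[1..n], here n = 3
    NR == 2 {
        m = split($0, b, ",")                     # continuation line: fills b[1..m]
        for (i = 2; i <= m; i++)
            if (b[i] != "") a[i] = a[i] "#" b[i]  # append only non-empty fields
    }
    END { printf "%d fields: %s,%s,%s\n", n, a[1], a[2], a[3] }
')
printf '%s\n' "$result"   # prints: 3 fields: 1,a#b,57
```

The third field stays `57` because the corresponding field on the continuation line is empty.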
