How do I transpose the nth line into a column?

Answer 1

How static is your data? Would this count as cheating?

awk '/^[0-9]/ {print $10 "\nsubject code\t\t\t" $1 "\ndate of birth\t\t\t" $2"\nfavorite activities\t\t" $3 "\nheight (m)\t\t\t" $4 "\nweight (lbs)\t\t\t" $5 "\ntest score + standard deviation\t" $6 "\ncolor blind\t\t\t" $7 "\nnumber of siblings\t\t" $8 "\naverage score\t\t\t" $9}' data.txt

Answer 2

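You can automate and generalize the processing of the input file so that it accepts any attribute names, and any number of them, not just the nine shown in the OP: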

   "subject code"
   "date of birth"
   "favorite activities"
   "height (m)"
   "weight (lbs)"
   "test score + standard deviation"
   "color blind"
   "number of siblings"
   "average score"

One way to do this with awk:

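awk -v i=0 -v nAtt=9 '
    # New block: bump the block counter and point ii just before its first attribute slot.
    /table columns are:/ {i+=1; ii=nAtt*(i-1); next}
    # Neither empty nor a data row: store the line as an attribute name.
    !/(^$|^[0-9]+ +([0-9]+\.*[0-9]+ +)*.+)/ {ii+=1; a[ii]=$0}
    # Data row: the opening brace has to share the line with the pattern.
    /^[0-9]+ +([0-9]+\.*[0-9]+ +)*.+/ {
        printf "\n%s\n", $(nAtt+1)
        c=0
        # Walk the slice of a[] that belongs to the current block, leaving ii untouched.
        for (k=nAtt*(i-1)+1; k<=nAtt*i; k++) {c++; printf "%-34s%s\n", a[k], $c}
    }
' file.txt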

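Explanation:

This awk script consists of three parts, each starting with a pattern match such as /.../ or a negated match such as !/.../.
But first, when invoking awk, I pass two external variables with the -v option: i=0 and nAtt=9. Here i is the rank of the data block (that is, the ordinal position of the attribute block in the input file, file.txt), and nAtt is the number of attributes, which in this solution must be the same for every data block.
First pattern: each time awk sees the pattern table columns are:, it knows that a new data block starts at the next record; it sets the two counters i and ii and skips to the next record.
Second pattern: if the record being read is neither an empty line nor a line that looks like: 56 6.18 1307 5.73 167 0.564 2 3 1.7 subject_8293748/label/NMA.label then the counter ii for array a is incremented and the array starts being filled with the attribute-name strings, until reaching...
Third pattern: if the record matches something like 56 6.18 1307 5.73 167 0.564 2 3 1.7 followed by any number of characters, print the (nAtt+1)th field of that record, then print the nAtt attribute names and values, one pair per line, with the attribute names left-aligned and padded to a width of 34 characters. Note that %-34s only pads; a name longer than 34 characters is not truncated (that would need %-34.34s).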
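
The behaviour of the %-34s conversion can be checked in isolation. The following is a minimal sketch, using a narrower width of 10 and two of the attribute names so the difference is easy to see: a width alone only pads, while adding a precision also truncates.

```shell
# Width only: short names are padded to 10 columns, long names overflow.
awk 'BEGIN { printf "%-10s|\n", "height" }'              # prints "height    |"
awk 'BEGIN { printf "%-10s|\n", "number of siblings" }'  # prints "number of siblings|"

# Width plus precision: long names are cut to 10 characters.
awk 'BEGIN { printf "%-10.10s|\n", "number of siblings" }'  # prints "number of |"
```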

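As hinted earlier:

This works for any number of attributes per attribute block (nAtt) and for attribute names that differ (or stay the same) from block to block.
A number of attributes that varies from block to block is also possible (and easy), with only minor changes to the script.
External variables can be passed to awk, for example: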

awk -v i="$i" -v nAtt="$nAtt" '...'
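
The variable-length case mentioned above might look roughly like the sketch below. It is not the original script. It assumes an input layout that the question does not show in full: a header line containing table columns are:, then one attribute name per line, then data rows that start with a number and end with the label. The function name transpose_blocks and the sample data are made up for illustration. Instead of nAtt, the script counts the attribute names in each block and takes the label from the last field.

```shell
# Hypothetical helper: prints each data row as a label followed by name/value pairs.
transpose_blocks() {
    awk '
        # Header: a new block starts, so forget the previous attribute names.
        /table columns are:/ { n = 0; next }
        # Skip empty lines.
        /^$/ { next }
        # Data row (starts with a number): the last field is assumed to be the label.
        /^[0-9]+ +/ {
            printf "\n%s\n", $NF
            for (c = 1; c <= n; c++) printf "%-34s%s\n", a[c], $c
            next
        }
        # Anything else is taken to be an attribute name of the current block.
        { a[++n] = $0 }
    ' "$@"
}

# Made-up input: two blocks with different numbers of attributes.
transpose_blocks <<'EOF'
table columns are:
subject code
height (m)
56 1.73 subject_A/label/NMA.label
51 1.80 subject_B/label/NMA.label
table columns are:
subject code
height (m)
weight (lbs)
60 1.65 150 subject_C/label/NMA.label
EOF
```

One limitation of this sketch: an attribute name that itself begins with digits followed by a space would be mistaken for a data row.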

Answer 3

Here is one way:

awk 'BEGIN {
       a[1]="subject code"
       a[2]="date of birth"
       a[3]="favorite activities"
       a[4]="height (m)"
       a[5]="weight (lbs)"
       a[6]="test score + standard deviation"
       a[7]="color blind"
       a[8]="number of siblings"
       a[9]="average score"
     } {
       print $10
       for(c=0;c++<9;) {
         printf "%-34s%s\n",a[c],$c
       }
     }' file.txt

Example run

$ cat file.txt
56  6.18  1307  5.73  167  0.564  2  3  1.7  subject_8293748/label/NMA.label
51  3.18  1307  5.73  167  0.564  2  3  1.7  subject_8293755/label/NMA.label
$ awk 'BEGIN {
       a[1]="subject code"
       a[2]="date of birth"
       a[3]="favorite activities"
       a[4]="height (m)"
       a[5]="weight (lbs)"
       a[6]="test score + standard deviation"
       a[7]="color blind"
       a[8]="number of siblings"
       a[9]="average score"
     } {
       print $10
       for(c=0;c++<9;) {
         printf "%-34s%s\n",a[c],$c
       }
     }' file.txt
subject_8293748/label/NMA.label
subject code                      56
date of birth                     6.18
favorite activities               1307
height (m)                        5.73
weight (lbs)                      167
test score + standard deviation   0.564
color blind                       2
number of siblings                3
average score                     1.7
subject_8293755/label/NMA.label
subject code                      51
date of birth                     3.18
favorite activities               1307
height (m)                        5.73
weight (lbs)                      167
test score + standard deviation   0.564
color blind                       2
number of siblings                3
average score                     1.7
$
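
The nine assignments in BEGIN can also be written with split(), which keeps the labels in one string and lets the loop bound follow the label count. This is a sketch and not part of the original answer. The wrapper name transpose_rows and the ; separator are arbitrary choices, and the label is assumed to be the field right after the last attribute.

```shell
# Hypothetical wrapper: reads the named files, or standard input if none are given.
transpose_rows() {
    awk 'BEGIN {
           # Build a[1..n] from one delimited string; n is the number of labels.
           n = split("subject code;date of birth;favorite activities;height (m);weight (lbs);test score + standard deviation;color blind;number of siblings;average score", a, ";")
         } {
           print $(n+1)
           for (c = 1; c <= n; c++) printf "%-34s%s\n", a[c], $c
         }' "$@"
}

# First row of the file.txt shown above, fed on standard input.
transpose_rows <<'EOF'
56  6.18  1307  5.73  167  0.564  2  3  1.7  subject_8293748/label/NMA.label
EOF
```

Calling transpose_rows file.txt should give the same output as the run above, since only the way the array is filled has changed.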
