Input:
Note: the 2 columns are separated by a tab; regular spaces separate the words in column 2.
1 the mouse is dead
2 hit the wall
3 winter lasts forever
Desired output:
1 the
1 mouse
1 is
1 dead
2 hit
2 the
2 wall
3 winter
3 lasts
3 forever
Is awk the way to go?
Answer 1
Well, the first field is $1, NF holds the number of fields on the line, we can access a field with $i where i is a variable, and loops work almost the same as in C. So:
$ awk '{for (i = 2; i <= NF; i++) printf "%s\t%s\n", $1, $i} ' < blah
1 the
1 mouse
...
(This doesn't distinguish between spaces and tabs as field separators.)
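To make that caveat concrete, here is a small sketch. The input line is hypothetical (it is not from the question): its first column, "key one", contains a space. The function names tricky and split_default are made up for illustration.

```shell
# Hypothetical input: the first column ("key one") itself contains a space,
# and a real tab separates it from the second column.
tricky() {
  printf 'key one\tfoo bar\n'
}

# The same loop as above, reading standard input instead of a file.
split_default() {
  awk '{ for (i = 2; i <= NF; i++) printf "%s\t%s\n", $1, $i }'
}

# Default field splitting treats the space and the tab alike,
# so $1 is just "key" and "one" is handled as an ordinary word.
tricky | split_default
```

For the sample data in the question this makes no difference, because the first column there never contains a space.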
Answer 2
With GNU sed:
sed -E 's/^((\S+\s+)\S+)\s+/&\n\2/;P;D'
The POSIX sed syntax is uglier:
s='[[:space:]]\{1,\}' S='[^[:space:]]\{1,\}'
sed "s/^\(\($S$s\)$S\)$s/&\\
\2/;P;D"
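For anyone wondering how s, P and D work together here, the sketch below runs a variant of the POSIX form on the sample data (assuming real tabs between the columns) with the loop spelled out in comments. Note one deliberate change that is mine, not the answer's: it uses \1 instead of & in the replacement, because & stands for the whole match and would carry the whitespace after each word into the output line. The names sample and split_words_sed are made up for illustration.

```shell
# Sample data with a real tab between the two columns (assumed layout).
sample() {
  printf '1\tthe mouse is dead\n2\thit the wall\n3\twinter lasts forever\n'
}

split_words_sed() {
  s='[[:space:]]\{1,\}'    # one or more whitespace characters
  S='[^[:space:]]\{1,\}'   # one or more non-whitespace characters
  # s: turn "KEY<ws>WORD<ws>rest" into "KEY<ws>WORD" newline "KEY<ws>rest"
  # P: print the pattern space up to the first newline
  # D: delete up to that newline and rerun the script on what is left,
  #    without reading a new line; once no newline is left, the next
  #    input line is read as usual
  sed "s/^\(\($S$s\)$S\)$s/\1\\
\2/;P;D"
}

sample | split_words_sed
```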
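Answer 3
Another awk approach (a multi-character RS is treated as a regex by gawk and mawk; POSIX leaves that case unspecified):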
~$>echo '1 the mouse is dead
2 hit the wall
3 winter lasts forever
' | awk 'BEGIN { RS="[[:space:]]+"; } /^[[:digit:]]+$/ {line=$1; next}; { print line "\t" $1; }'
1 the
1 mouse
1 is
1 dead
2 hit
2 the
2 wall
3 winter
3 lasts
3 forever
And laid out a bit more readably:
# split all parts into single word records.
BEGIN { RS="[[:space:]]+"; }
# if the record is a number, then save it
/^[[:digit:]]+$/ { line=$1; next };
# else use last saved line number and this record to format output.
{ print line "\t" $1; }
Answer 4
You can also use awk's split function:
awk -F"\t" 'BEGIN { OFS="\t" } { cols=split($2,arr," "); for ( i=1; i<=cols; i++ ) { print $1,arr[i] }}'
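Because this version splits fields on tabs only and then splits column 2 on spaces, it keeps a first column that itself contains spaces intact. A quick sketch with a hypothetical input line (not from the question); the function names tricky and split_tabs are made up for illustration:

```shell
# Hypothetical input: "key one" is the first column, followed by a real tab.
tricky() {
  printf 'key one\tfoo bar\n'
}

# Same idea as the one-liner above: tab-separated fields,
# then split() breaks the second field into words.
split_tabs() {
  awk -F'\t' 'BEGIN { OFS = "\t" }
    {
      n = split($2, words, " ")   # n is the number of words in column 2
      for (i = 1; i <= n; i++)
        print $1, words[i]
    }'
}

# Each word is paired with the full first column, space included.
tricky | split_tabs
```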