How do I use datamash to operate on all columns?

Answer 1

I didn't see an option for specifying an open-ended range in the datamash manual.

Try this perl one-liner:

$ perl -lane '$s[$_]+=$F[$_] for 0..$#F; END{print join " ", @s}' ip.txt
1332 1665 1998

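The -a option automatically splits each input line on whitespace and stores the result in the @F array.
for 0..$#F loops over the array indices; $#F gives the index of the last element.
$s[$_]+=$F[$_] accumulates the sums in the @s array; each element starts out undefined, which counts as 0 in numeric context. $_ holds the index value on each iteration.
END{print join " ", @s} prints the contents of @s, separated by spaces, after all input lines have been processed.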

Answer 2

cols=$( awk '{print NF; exit}' foo); cat foo | datamash -t\  sum 1-$cols

or

cat foo | datamash -t\  sum 1-$( awk '{print NF; exit}' foo)

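datamash can take a column range, so count the columns and use that count as part of the range specification. In my example I used awk to inspect only the first line of the file and exit, but you can use whatever else you prefer. datamash itself has a check operation whose output includes the number of columns, but that output still has to be parsed to get the specific number you want.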
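
The two steps described above (count the fields on the first line, then pass 1-N to datamash) could be wrapped up as in the sketch below. The function names count_cols and sum_all_cols are hypothetical, and the sketch assumes a non-empty file in which every row has the same number of space-separated fields as the first.

```shell
# Hypothetical helper: print the field count of the first line of file $1.
count_cols() {
    awk '{ print NF; exit }' "$1"
}

# Hypothetical helper: sum every column of file $1 with datamash,
# using a single space as the field separator as in the answer above.
sum_all_cols() {
    datamash -t ' ' sum "1-$(count_cols "$1")" < "$1"
}

# Usage, guarded so the line is harmless when datamash is not installed:
# command -v datamash >/dev/null && sum_all_cols foo
```

Reading the file with a redirect rather than cat avoids one extra process, but the result is the same.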

Answer 3

I don't know datamash, but here is an awk solution:

$ awk '{ for( col=1; col<=NF; col++ ) { totals[col]+=$col } } END { for( col=1; col<=length(totals); col++ ) {printf "%s ", totals[col]}; printf "\n" } ' input
1332 1665 1998

To make the awk script more readable:

{      # execute on all records
  for( col=1; col<=NF; col++ ) {
    totals[col]+=$col
  }
}
END {  # execute after all records processed
  for( col=1; col<=length(totals); col++ ) {
    printf "%s ", totals[col]
  }
  printf "\n"
}
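
If rows may have different numbers of fields, one possible variation tracks the widest row seen instead of relying on length(totals). This is a sketch rather than part of the original answer: the function name sum_columns_awk is hypothetical, and a missing field simply contributes nothing to its column.

```shell
# Hypothetical wrapper; reads the named files, or standard input if none.
sum_columns_awk() {
    awk '
    {
        if (NF > max) max = NF            # remember the widest row so far
        for (col = 1; col <= NF; col++)
            totals[col] += $col           # unset entries start at 0
    }
    END {
        # Separate values with a space and end the line with a newline,
        # so there is no trailing space.
        for (col = 1; col <= max; col++)
            printf "%s%s", totals[col], (col < max ? " " : "\n")
    }' "$@"
}

printf '%s\n' '1 2 3' '4 5' | sum_columns_awk
# prints: 5 7 3
```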

Answer 4

Using datamash and bash:

n=($(datamash -W check < foo)); datamash -W sum 1-${n[2]} < foo

Output:

1332    1665    1998

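How it works:

datamash -W check < foo outputs the string "3 lines, 3 fields".
n=($(datamash -W check < foo)) loads that string, split on whitespace, into the array n. We want the number of fields, which is ${n[2]}.
datamash -W sum 1-${n[2]} < foo does the rest.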
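
To see the word splitting at work without datamash installed, the sketch below substitutes a literal string for the output of datamash -W check. The sample values (7 lines, 4 fields) are illustrative, picked so the two numbers can be told apart, and the array syntax requires bash rather than a plain POSIX shell.

```shell
# Stand-in for: datamash -W check < foo   (illustrative values)
check_output='7 lines, 4 fields'

# Unquoted expansion splits on whitespace into four words:
#   n[0]=7  n[1]=lines,  n[2]=4  n[3]=fields
n=($check_output)

echo "fields: ${n[2]}"
# prints: fields: 4
```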

This can also be done in a POSIX shell, using a contrived printf format string instead of an array, but it's cruder:

datamash -W sum 1-$(printf '%0.0s%0.0s%s%0.0s' $(datamash -W check < foo)) < foo
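
The format string works because each %0.0s consumes one argument and prints zero characters of it, so only the third word survives. A small sketch, with a literal string standing in for the datamash -W check output (the helper name third_word and the sample values are illustrative):

```shell
# Hypothetical helper: print the third word of an "N lines, M fields" string.
# $1 is deliberately left unquoted so the shell splits it into four words.
third_word() {
    printf '%0.0s%0.0s%s%0.0s' $1
}

third_word '7 lines, 4 fields'; echo
# prints: 4
```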

It can also be done with ordinary shell tools:

datamash -W sum 1-$(head -1 foo | wc -w) < foo
