Remove specific columns from CSV using awk

I am able to get the columns in CSV format, as shown below:

,col1,col2,col3,col4,col5,,

I use the following awk command to get output in that format:

awk -vORS=, '$0 && p {print $2}; $2 == "name" {p=1}'

Then I use the following two commands to remove the leading comma and the two trailing commas:

   cols=${cols:1}
   cols=${cols:0:${#cols}-2}

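Now I get output in this format:

col1,col2,col3,col4,col5

I want to remove specific columns from the right that match a list. For example, if I call the function with the parameter "col4,col5", awk should remove the last two columns and print the output as follows:

col1,col2,col3

How can I do this in a shell script (preferably using awk, grep, or some other command commonly available in a shell)?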

Update: the initial file content is output in table form, as follows:

+-----------------------------------------+--------+---------+
| name                                    | type   | comment |
+-----------------------------------------+--------+---------+
| col1                                    | int    |         |
| col2                                    | int    |         |
| col3                                    | string |         |
| col4                                    | string |         |
| col5                                    | string |         |
+-----------------------------------------+--------+---------+

Answer 1

You can use cut to extract certain columns from delimited data. For example, the following extracts the last two columns:

echo col1,col2,col3,col4,col5 | cut -d , -f 4,5

prints

col4,col5

The -d argument specifies the delimiter, and -f specifies the index, or range of indexes, of the fields you want in the result.
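
Applied to the question's goal of dropping the last two columns, the same two options can keep the leading fields instead. A minimal sketch, assuming the number of columns to keep (three here) is known in advance; the variable name kept is only for illustration:

```shell
# Keep fields 1 through 3, which drops col4 and col5 from the right.
kept=$(echo col1,col2,col3,col4,col5 | cut -d , -f 1-3)
echo "$kept"
```

This prints col1,col2,col3.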

Edit

To make this more dynamic, the following selects the last X columns of a file whose fields are separated by delimiter Y:

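function lastCols {
        # $1 = input file, $2 = number of trailing columns to keep, $3 = delimiter
        # Count the delimiters on the first line to find the index of the last column.
        endcol=$(($(head -n 1 "$1" | grep -oF -- "$3" | wc -l) + 1))
        startcol=$((endcol - $2 + 1))
        cut -d "$3" -f "$startcol-$endcol" < "$1"
}

lastCols "$1" "$2" "$3"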

I have not tested this much, so it may be a bit buggy. Use it like this:

[]$ cat temp.txt
col1,col2,col3,col4,col5
col1,col2,col3,col4,col5
col1,col2,col3,col4,col5
col1,col2,col3,col4,col5
col1,col2,col3,col4,col5
col1,col2,col3,col4,col5
col1,col2,col3,col4,col5
col1,col2,col3,col4,col5
col1,col2,col3,col4,col5

[]$ ./lastCols.sh temp.txt 2 ,
col4,col5
col4,col5
col4,col5
col4,col5
col4,col5
col4,col5
col4,col5
col4,col5
col4,col5
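
The question asks to drop columns by name rather than by position. One possible awk sketch for that, assuming the input is a single comma-separated line of column names as in the question; drop_cols is a hypothetical helper name and is not part of the answer above:

```shell
# drop_cols (hypothetical name): remove every field whose text matches
# one of the names in the comma-separated list given as $1.
drop_cols() {
    awk -v drop="$1" '
        BEGIN {
            FS = OFS = ","
            # Build a lookup table of the names to skip.
            n = split(drop, names, ",")
            for (i = 1; i <= n; i++) skip[names[i]] = 1
        }
        {
            out = ""
            sep = ""
            for (i = 1; i <= NF; i++) {
                if (!($i in skip)) {
                    out = out sep $i
                    sep = OFS
                }
            }
            print out
        }
    '
}

echo col1,col2,col3,col4,col5 | drop_cols col4,col5
```

This prints col1,col2,col3. Because it compares the text of each field on every line, it suits a list of column names; for a file with a header row followed by data rows, the column positions would have to be resolved from the header first.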

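Answer 2

Here is something I wrote a few years ago to solve this problem, when I was doing a lot of OpenStack work and was annoyed by the hard-to-parse output of the OpenStack tools: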

#! /usr/bin/perl

# untable.pl
#
# Copyright (C) 2012, 2013 Craig Sanders <[email protected]>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.

# script to strip mysql-style table formatting from nova, keystone,
# glance, etc commands
#
# also works for any tables output from mysql, and from tables produced
# by 'links -dump'
#
# makes the output easily parsable and usable in other scripts.
#
# TODO: command-line option to allow forcing of output style (2-column
# or multi-column) rather than detection.


use strict;

use Carp;
use Getopt::Long;

my $print_headers=0;
my $separator = '';
my $tab = '';
my $result = GetOptions("headers!"  => \$print_headers,
                        "separator=s" => \$separator,
                        "tab" => \$tab,
                       );
$separator = "\t" if ($tab);

my $propval = -1;
our @headers;

while(<>) {
  chomp;
  next if (m/^\+/);

  s/^\|\s*|\s*\|$//iog;  # this / is here to fix SE's broken perl syntax highlighting.

  my @columns = split '\|';
  # strip leading and trailing spaces
  for my $col (0..scalar @columns-1) {
    if ($columns[$col] eq '') {;
      delete $columns[$col];
    } else {
      $columns[$col] =~ s/^\s+|\s+$//iog;
    };
  }

  # find type of table - 2-column Property/Value, or multi-column
  if ($propval == -1) {
    if ($columns[0] eq 'Property') {
      $propval = 1 ;
      $separator = ": " if ($separator eq '');  # default to ': ' unless specified on cmd line
    } else {
      $propval = 0;
      $separator = "\t" if ($separator eq '');  # default to TAB unless specified on cmd line
      @headers = @columns;
      print (join($separator,@headers),"\n") if $print_headers ;
    };
    next;
  } else {
    print join($separator,@columns),"\n" if (defined $columns[1]);    # skip line unless we have more than one column to output
  }
}

Examples:

Two-column:

$ keystone tenant-get 93c14424ed06494c832457d974b9505e
+-------------+-----------------------------------------+
|   Property  |                  Value                  |
+-------------+-----------------------------------------+
| description | Anonymous Tenant Description            |
| enabled     | True                                    |
| id          | 93c14424ed06494c832457d974b9505e        |
| name        | ANON1                                   |
+-------------+-----------------------------------------+

$ keystone tenant-get 93c14424ed06494c832457d974b9505e | ./untable.pl
description: Anonymous Tenant Description
enabled: True
id: 93c14424ed06494c832457d974b9505e
name: ANON1

Multi-column:

$ keystone user-list 810
+-----+---------+-----------------------------+-----------------------------+
|  id | enabled |            email            |             name            |
+-----+---------+-----------------------------+-----------------------------+
| 414 | 1       | [email protected]    | [email protected]    |
| 500 | 1       | [email protected]    | [email protected]    |
| 610 | 1       | [email protected]    | [email protected]    |
| 729 | 1       | [email protected]    | [email protected]    |
+-----+---------+-----------------------------+-----------------------------+

$ keystone user-list 810 | ./untable.pl
414     1       [email protected]    [email protected]
500     1       [email protected]    [email protected]
610     1       [email protected]    [email protected]
729     1       [email protected]    [email protected]
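
Where Perl is not available, roughly the same table stripping can be approximated in awk. This is a minimal sketch and not a port of untable.pl: strip_table is a hypothetical name, the output is always tab-separated, the header row is kept, and Property/Value tables are not detected.

```shell
# strip_table (hypothetical name): drop the +----+ border lines and print
# the cells of each remaining row, trimmed and joined with tabs.
strip_table() {
    awk -F'|' '
        /^\+/ { next }                      # skip border lines
        {
            out = ""
            # Fields 1 and NF are the empty strings outside the outer pipes.
            for (i = 2; i < NF; i++) {
                cell = $i
                gsub(/^[ \t]+|[ \t]+$/, "", cell)   # trim surrounding blanks
                out = out (i > 2 ? "\t" : "") cell
            }
            print out
        }
    '
}

printf '%s\n' \
    '+------+------+' \
    '| name | type |' \
    '+------+------+' \
    '| col1 | int  |' \
    '+------+------+' | strip_table
```

Piping the result through cut -f 1 would then leave only the name column from the question's table.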
