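I am trying to extract a job number from the comment column of a CSV file and then append that number to the end of the row, preferably with sed, awk, grep, or perl (which I already have installed in Cygwin).
Here is a mock-up: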
"HD1231203", "1231232","fake.name","Lots of text before the job 150232 and then more"
"HD5164635", "8918123","more.fake","151243 and then some text"
"HD1541545", "8435413","last.fake","Oh look, we've got 150213 and 151487 this time!"
It should become:
"HD1231203", "1231232","fake.name","Lots of text before the job 150232 and then more", "150232"
"HD5164635", "8918123","more.fake","151243 and then some text","151243"
"HD1541545", "8435413","last.fake","Oh look, we've got 150213 and 151487 this time!","150213","151487"
I have tried the little I know about sed, but honestly I am out of my depth.
Answer 1
A simple Perl solution:
perl -F, -lape '$_ .= qq(,"$1") while $F[-1] =~ /([0-9]+)/g' FILE
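-F, splits each line on commas (this can break if a comma follows the number inside the double quotes, see below). Every number found in the last field is appended to the current line as a new quoted field.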
To solve this properly, you should use Perl's Text::CSV module.
#!/usr/bin/perl
use warnings;
use strict;

use Text::CSV;

# always_quote keeps every output field quoted; allow_whitespace
# tolerates the space after the first comma in the input.
my $csv = 'Text::CSV'->new({ always_quote     => 1,
                             allow_whitespace => 1,
                             eol              => "\n",
                           }) or die 'Text::CSV'->error_diag;

open my $IN, '<', shift or die $!;
while (my $row = $csv->getline($IN)) {
    my @new;
    # Collect every run of digits found in the last field.
    push @new, $1 while $row->[-1] =~ /([0-9]+)/g;
    $csv->print(*STDOUT, [@$row, @new]);
}
$csv->eof or $csv->error_diag;