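I have a series of 297 directories named "dir000", "dir001" and so on, each of which contains a text file called "config", which is a CSV file with 3 columns and 256 rows. I have generated 25 random numbers in the range 1 to 256, and I need to remove exactly those 25 rows from the file in each directory. For instance, if my generator gave me the series of random numbers a = [145,11,140,119,183,178,225,131,1,65,213,115,207,41,194,221,10,205,6,57,224,108,44,85,211], I want to delete exactly these rows from each of the ASCII files ("config") in each directory.
Can anyone let me know how this can be achieved using the command line? I am using Ubuntu 16.04.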
Answer 1
The following uses perl's -i option to edit the input files in place.
#!/usr/bin/perl -i
use strict;
# Parse array of random numbers from the first argument.
my $arg1 = shift;
# remove [, ], and any whitespace.
$arg1 =~ s/\[|\]|\s+//g;
# split $arg1 on commas, build an associative array
# (aka "hash") called %a to hold the numbers.
# The hash keys are the line numbers, and the value for
# each key is just "1" - it doesn't matter what the
# value is, the only thing that matters is whether the
# key exists in the hash.
my %a;
map $a{$_} = 1, split(/,/, $arg1);
# Loop over each input file.
while (<>) {
    # Print each line unless the current line number $. is in %a.
    print unless defined $a{$.};
    # reset $. at the end of each file.
    close(ARGV) if eof;
}
Save it as, e.g., delete-lines.pl, make it executable with chmod +x delete-lines.pl, and then run it like so:
$ a="[145,11,140,119,183,178,225,131,1,65,213,115,207,41,194,221,10,205,6,57,224,108,44,85,211]"
$ ./delete-lines.pl "$a" textfile*.txt
If textfile1.txt, textfile2.txt, and textfile3.txt all contain the following before the run:
I have a series of 297 directories named as "dir000', 'dir001' and so on, each
of which contains a text file called "config", which is a csv file with 3
columns and 256 rows.

I have generated 25 random numbers in the range 1 to 256, and from all these
files in each directory, I am required to remove those exact 25 rows.

For instance, if my generator gave me a series of random numbers a =
[145,11,140,119,183,178,225,131,1,65,213,115,207,41,194,221,10,205,6,57,224,10
8,44,85,211], I want to delete exactly these rows from each of the ASCII
files("config") in each directory.

Can anyone let me know how this can be achieved using command line? I am using
Ubuntu 16.04 distribution.
then they will all contain this after the run:
of which contains a text file called "config", which is a csv file with 3
columns and 256 rows.

I have generated 25 random numbers in the range 1 to 256, and from all these

For instance, if my generator gave me a series of random numbers a =
[145,11,140,119,183,178,225,131,1,65,213,115,207,41,194,221,10,205,6,57,224,10

Can anyone let me know how this can be achieved using command line? I am using
Ubuntu 16.04 distribution.
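That is, lines 1, 6, 10, and 11 have been deleted from each file, because those are the only line numbers in the random-number array that exist in the files.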
By the way, the %a hash contains the following:
{
1 => 1, 6 => 1, 10 => 1, 11 => 1, 41 => 1,
44 => 1, 57 => 1, 65 => 1, 85 => 1, 108 => 1,
115 => 1, 119 => 1, 131 => 1, 140 => 1, 145 => 1,
178 => 1, 183 => 1, 194 => 1, 205 => 1, 207 => 1,
211 => 1, 213 => 1, 221 => 1, 224 => 1, 225 => 1,
}
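The parsing step that fills %a can also be tried on its own from the shell. The following is only a rough sketch and not part of the Perl script: parse_line_numbers is a hypothetical helper name, and it assumes the array is passed as one bracketed, comma-separated string like the one in $a.

```shell
#!/bin/sh
# Hypothetical helper (not part of delete-lines.pl): turn a string such as
# "[145, 11, 140]" into one line number per line, sorted, without duplicates.
parse_line_numbers() {
    # Strip brackets and spaces, split on commas, then sort numerically.
    printf '%s\n' "$1" | tr -d '[] ' | tr ',' '\n' | sort -n -u
}

a="[145,11,140]"
parse_line_numbers "$a"   # prints 11, 140 and 145, one per line
```

The sorted, de-duplicated list corresponds to the keys of %a, so it is a quick way to see which line numbers the script would look for.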
The next step is to run it on the many files named "config" in the numbered directories:
find dir[0-9]*/ -type f -name config -exec ./delete-lines.pl "$a" {} +
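This assumes that the random-number array is still in the shell variable $a. You can use another variable name if you like, or just supply the array as a quoted string. As long as you supply the array as the first argument to the perl script (all subsequent arguments are file names), it will work.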
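Because -i overwrites the files, it may be worth checking the result afterwards. The sketch below is an assumption about how one might do that, not something from the original script: report_bad_configs is a hypothetical helper that lists every "config" file whose line count differs from the expected value (256 - 25 = 231 for the files described in the question).

```shell
#!/bin/sh
# Hypothetical helper: report config files with an unexpected number of lines.
#   $1 = expected line count (e.g. 231 = 256 - 25)
#   $2 = top-level directory to search (defaults to the current directory)
report_bad_configs() {
    expected=$1
    top=${2:-.}
    find "$top" -type f -name config | while IFS= read -r f; do
        n=$(wc -l < "$f")
        # Print only the files that do not have the expected line count.
        [ "$n" -eq "$expected" ] || printf '%s: %s lines\n' "$f" "$n"
    done
}
```

Running report_bad_configs 231 from the directory that holds dir000, dir001, and so on should print nothing if every file lost exactly 25 lines. Running the deletion on a copy of the directories first is a cheap extra precaution.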
If you don't want to save a standalone script, you could run it as a one-liner:
$ find dir[0-9]*/ -type f -name config -exec perl -i -e \
'map $a{$_} = 1, split(/,/, ($ARGV[0] =~ s/\[|\]| +//g, shift));
while (<>) {print unless defined $a{$.}; close(ARGV) if eof}' \
"$a" {} +
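But why would you want to? It just gets ugly and hard to read and edit. Writing even a throwaway one-off script in your favourite editor is easier and more convenient than trying to edit and debug a script on the shell command line.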
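For comparison only, the same deletion can be expressed with sed instead of perl. This is a different technique from the script above, shown as a rough sketch: make_sed_script is a hypothetical helper name, and the in-place run in the final comment assumes GNU sed (the version shipped with Ubuntu), where -i treats each file separately so line numbers restart for every file.

```shell
#!/bin/sh
# Hypothetical helper: convert "[1,6,10,11]" into the sed script "1d;6d;10d;11d".
make_sed_script() {
    printf '%s\n' "$1" |
        tr -d '[] ' |                  # strip brackets and spaces
        tr ',' '\n' |                  # one number per line
        sed -e '/^$/d' -e 's/$/d/' |   # drop empty lines, append sed's "d" command
        paste -sd ';' -                # join the commands with semicolons
}

a="[1,6,10,11]"
script=$(make_sed_script "$a")
printf '%s\n' "$script"   # prints 1d;6d;10d;11d

# With GNU sed, the in-place run over all config files would then be:
# find dir[0-9]*/ -type f -name config -exec sed -i "$script" {} +
```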