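I have three files:
~/naive-file.txt
~/old-text.txt
~/new-text.txt
I want to find every instance where the contents of ~/old-text.txt appear in ~/naive-file.txt and replace each occurrence with the contents of ~/new-text.txt. I'm sure this can be done with sed or awk, but I can't seem to work out the right command. Is this possible?
For example, suppose the contents of ~/naive-file.txt are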
$ cat ~/naive-file.txt
Sed id ligula quis est convallis tempor.
This is the old text.
It might have multiple lines and some special characters like these \ { & % #)!
etc...
Nunc aliquet, augue nec adipiscing interdum, lacus tellus malesuada massa, quis
varius mi purus non odio.
Suppose the contents of ~/old-text.txt are
$ cat ~/old-text.txt
This is the old text.
It might have multiple lines and some special characters like these \ { & % #)!
etc...
Suppose the contents of ~/new-text.txt are
$ cat ~/new-text.txt
This is the new text.
It could also have multiple lines and special characters like these \ { & %
etc...
Running the command I'm looking for would produce
Sed id ligula quis est convallis tempor.
This is the new text.
It could also have multiple lines and special characters like these \ { & %
etc...
Nunc aliquet, augue nec adipiscing interdum, lacus tellus malesuada massa, quis
varius mi purus non odio.
Answer 1
Perl to the rescue!
Read the replacement pairs into a hash, then read the input line by line and try to replace any matches.
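#!/usr/bin/perl
use warnings;
use strict;
# Read the lines to search for.
open my $ot, '<', 'old-text.txt' or die $!;
chomp( my @lines = <$ot> );
# Pair line N of old-text.txt with line N of new-text.txt.
open my $nt, '<', 'new-text.txt' or die $!;
my %replace;
@replace{@lines} = <$nt>;
chomp for values %replace;
# Build one alternation of the old lines, escaped so they match literally.
my $regex = join '|', map quotemeta, @lines;
open my $in, '<', 'naive-file.txt' or die $!;
while (<$in>) {
    s/($regex)/$replace{$1}/g;    # /g replaces every match on the line
    print;
}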
If some of the strings to be replaced are substrings of others, sort the strings in the regex by length in descending order, i.e.
my $regex = join '|', map quotemeta, sort { length $b <=> length $a } @lines;
Answer 2
Bash
Replace the first match:
target=$(cat naive-file.txt)
old=$(cat old-text.txt)
new=$(cat new-text.txt)
echo "${target/"$old"/"$new"}"
Replace all matches:
echo "${target//"$old"/"$new"}"
Replace a match at the start of the text:
echo "${target/#"$old"/"$new"}"
Replace a match at the end of the text:
echo "${target/%"$old"/"$new"}"
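The replace-all variant can be wrapped in a small function so it is reusable with any three files. This is only a sketch, assuming bash (not plain sh); the name `replace_block` is hypothetical. Two details matter: the quotes around `"$old"` make the search text a literal string instead of a glob pattern, and the quotes around `"$new"` keep a `&` in the replacement literal, which matters in bash 5.2 and later when the `patsub_replacement` option is enabled. Command substitution strips trailing newlines, so the output always ends with exactly one newline.

```shell
#!/usr/bin/env bash
# replace_block is a hypothetical helper name, not part of the answer above.
# Usage: replace_block TARGET_FILE OLD_FILE NEW_FILE
replace_block() {
    local target old new
    # $(<file) reads the whole file; trailing newlines are stripped.
    target=$(<"$1")
    old=$(<"$2")
    new=$(<"$3")
    # An empty search string has nothing to match; print the input unchanged.
    if [ -z "$old" ]; then
        printf '%s\n' "$target"
        return 0
    fi
    # "$old" is quoted, so glob characters such as * ? [ \ are taken literally.
    # "$new" is quoted, so & is not treated as "the matched text".
    printf '%s\n' "${target//"$old"/"$new"}"
}
```

It would be called as `replace_block naive-file.txt old-text.txt new-text.txt > result.txt`. Because the whole file is held in a shell variable, this is best kept to files of modest size.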
Answer 3
Here is a GNU awk one-liner:
awk 'NR==FNR{old[++k]=$0}FILENAME=="new-text.txt"{new[FNR]=$0}
FILENAME=="naive-file.txt"{for(i=1;i<=k;i++)if(old[i]==$0)$0=new[i];print}' \
old-text.txt new-text.txt naive-file.txt
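This may not be suitable for very large files, because all the patterns are stored in arrays first. It also compares whole lines one at a time, so line N of old-text.txt is replaced by line N of new-text.txt wherever that exact line appears.
Output: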
Sed id ligula quis est convallis tempor.
This is the new text.
It could also have multiple lines and special characters like these \ { & %
etc...
Nunc aliquet, augue nec adipiscing interdum, lacus tellus malesuada massa, quis
varius mi purus non odio.
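Since the one-liner above works line by line, it would also rewrite a single old line that appears on its own, outside the full block. A possible alternative, sketched here under a few assumptions, is to load each file into one string and search with `index()`, which does a plain string comparison and needs no regex escaping. The function name `replace_block_awk` is hypothetical, all three files are assumed to be non-empty, and every file is treated as ending with a newline.

```shell
#!/bin/sh
# replace_block_awk is a hypothetical helper name used only for this sketch.
# Usage: replace_block_awk OLD_FILE NEW_FILE TARGET_FILE
# Assumes none of the three files is empty.
replace_block_awk() {
    awk '
        # Count files as they start, and append each line plus a newline
        # to the buffer of the file it came from.
        FNR == 1 { fileno++ }
        { text[fileno] = text[fileno] $0 "\n" }
        END {
            old = text[1]; new = text[2]; rest = text[3]
            # index() is a literal substring search, so \ { & % need no escaping.
            while ((pos = index(rest, old)) > 0) {
                out = out substr(rest, 1, pos - 1) new
                rest = substr(rest, pos + length(old))
            }
            # Pass the text as an argument, not as the printf format string.
            printf "%s", out rest
        }
    ' "$1" "$2" "$3"
}
```

For the files in the question this would be called as `replace_block_awk old-text.txt new-text.txt naive-file.txt`. The search resumes after each inserted copy of the new text, so the loop still terminates when the new text happens to contain the old text.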
Answer 4
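$ perl -0777ne '
$A[@ARGV] = $_;    # slurp each file; the last file lands in $A[0]
@ARGV and next;    # wait until all three files have been read
my($naive, $new, $old) = @A;
my $p = 0;
while ( ( my $i = index($naive, $old, $p) ) > -1 ) {
    substr($naive, $i, length($old)) = $new;
    $p = $i + length($new);    # resume the search after the inserted text
}
print $naive;
' old-text.txt new-text.txt naive-file.txt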