I have a data.txt file with data in the following format:
a|日
a|曰
a|
a|
aa|昌
aa|昍
aaa|晶
aamh|暘
What I want to do is convert the text above into this result:
'a' => ['日','曰','',''],
'aa' => ['昌','昍'],
'aaa' => ['晶'],
'aamh' => ['暘']
Any ideas on how to do this? Thanks.
Answer 1
A Perl solution. It assumes the input is already sorted by key, as in the sample data:
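#!/usr/bin/perl
use strict;
use warnings;

# Groups consecutive lines that share a key, so the input must be sorted by key.
my $key;     # key of the group being collected (undef until the first line)
my @data;    # values collected so far for $key
my @out;     # finished "'key' => [...]" entries

while (my $line = <>) {
    chomp $line;
    next unless $line =~ /\|/;    # skip lines without a separator
    my ($k, $d) = split /\|/, $line, 2;
    if (defined $key && $k ne $key) {
        # the key changed: store the finished group and start a new one
        push @out, "'$key' => [" . join(',', map { qq/'$_'/ } @data) . "]";
        @data = ();
    }
    $key = $k;
    push @data, $d;
}
# store any data still left, then print without a trailing comma
push @out, "'$key' => [" . join(',', map { qq/'$_'/ } @data) . "]" if defined $key;
print join(",\n", @out), "\n" if @out;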
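If the file is not guaranteed to be sorted by key, a hash of arrays is one alternative to the consecutive-line approach above. The following is only a minimal sketch under that assumption: `group_lines` is a hypothetical helper name, and the sample values are made-up ASCII strings rather than the real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): groups "key|value"
# lines by key and returns the formatted text. Keys keep first-seen order.
sub group_lines {
    my @lines = @_;
    my (%values_for, @order);
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\|/;          # ignore lines without a separator
        my ($k, $v) = split /\|/, $line, 2;
        push @order, $k unless exists $values_for{$k};
        push @{ $values_for{$k} }, $v;
    }
    my @entries;
    for my $k (@order) {
        my $list = join ',', map { "'$_'" } @{ $values_for{$k} };
        push @entries, "'$k' => [$list]";
    }
    return join ",\n", @entries;
}

# Demo with made-up sample data; note that the keys are not sorted.
print group_lines("a|x1\n", "aa|y1\n", "a|x2\n", "aaa|z1\n"), "\n";
```

To process a real file, the demo line could be replaced with `print group_lines(<>), "\n";` and the script run as `perl script.pl data.txt`. The trade-off is that the whole file is held in memory, which should be fine for a table of this kind.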
Answer 2
VimL can be the right tool for this job: sometimes (e.g. on MS Windows) installing Vim is easier than installing a full Perl/*nix environment.
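" Parse the lines of the current buffer into a dictionary of lists
let dict = {}
let lines = getline(1, '$')
" keep only the lines that contain a separator
call filter(lines, 'v:val =~ "|"')
for l in lines
  let [k, v] = split(l, '|', 1)
  if has_key(dict, k)
    let dict[k] += [v]
  else
    let dict[k] = [v]
  endif
endfor
" produce the new lines, sorted by key (items() gives no guaranteed order) ...
let res = []
for k in sort(keys(dict))
  let res += [string(k) . ' => ' . string(dict[k])]
endfor
let sres = join(res, ",\n")
" ... in a new buffer
vnew
put =sres
" remove the empty first line that :put leaves in the new buffer
1delete _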