Concatenating multiple files with multiple headers

Answer 1

Assuming each section name is underlined with "====", the following Python solves the problem without having to spell out the section names:

import sys
from collections import OrderedDict

combined = OrderedDict()
separator = '===='

for file_name in sys.argv[1:]:
    with open(file_name) as fp:
        lines = fp.readlines()
    data = []
    while lines:
        # walk over the lines in reverse, from the end of the file
        line = lines.pop()
        if not line.strip():
            continue  # skip empty lines
        if line.startswith(separator):
            # the line just above a separator is the section name
            name = lines.pop()
            section = combined.setdefault(name, [])
            section.extend(reversed(data))
            data = []
        else:
            data.append(line)

# sections were collected last-to-first, so reverse them for output
for idx, k in enumerate(reversed(combined)):
    if idx != 0:
        print()  # insert empty line before all but first
    sys.stdout.write(k)
    # k still ends with its newline, so the underline is one longer than the name
    print('=' * len(k))
    for line in combined[k]:
        sys.stdout.write(line)

You have to supply the file names on the command line when invoking the script.

This produces the output:

Numbers
========
1
2
3
4
5
6

Letters
========
A
B
C
D
E
F

Answer 2

Here is one way to do it with a Perl script:

#!/usr/bin/perl
use strict;
use warnings;

# read every line of both input files into one list
my @fileInfo;
foreach my $file ("file1", "file2") {
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    push (@fileInfo,<$fh>);
    close($fh);
}

my @letLines;
my @numLines;
my $numMode  = 0;
my $letMode  = 0;

foreach my $line (@fileInfo) {

    # skip empty and '==' lines
    next if ($line =~ /^$/);
    next if ($line =~ /==/);

    # a heading line switches the current mode
    if ($line =~ /Letters/) {
        $letMode = 1;
        $numMode = 0;
        next;
    }

    if ($line =~ /Numbers/) {
        $numMode = 1;
        $letMode = 0;
        next;
    }

    # any other line is data for whichever section is active
    if ($letMode) {
        push (@letLines, $line);
        next;
    }

    if ($numMode) {
        push (@numLines, $line);
        next;
    }
}

print "Numbers\n";
print "=======\n";
print @numLines;

print "\n";

print "Letters\n";
print "=======\n";
print @letLines;

# vim: set ts=2 nolist :

Output:

Numbers
=======
1
2
3
4
5
6

Letters
=======
A
B
C
D
E
F

Answer 3

numbersletters.sh, for the input you provided:

#!/bin/bash

echo Numbers

for f in "$@"
do
    grep -E '^[0-9]+$' "$f"
done

echo
echo Letters

for f in "$@"
do
    grep -E '^[A-Z]+$' "$f"
done

echo

Running numbersletters.sh file1 file2 returns:

Numbers
1
2
3
4
5
6

Letters
A
B
C
D
E
F

You can append | sort -u to each for loop (right after done) to make its output sorted and unique.
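
For comparison, the same filtering idea can be sketched in Python. This is only an illustration, not part of the script above: the function name split_numbers_letters, the unique flag and the sample lines are hypothetical. The headings fall through both patterns because they contain lowercase letters, and the "====" lines match neither.

```python
import re

NUMBER = re.compile(r"[0-9]+")
LETTER = re.compile(r"[A-Z]+")

def split_numbers_letters(lines, unique=False):
    """Pick out the all-digit and all-uppercase lines, like the two grep calls."""
    numbers = [ln for ln in lines if NUMBER.fullmatch(ln)]
    letters = [ln for ln in lines if LETTER.fullmatch(ln)]
    if unique:
        # rough equivalent of piping each loop through sort -u
        numbers = sorted(set(numbers))
        letters = sorted(set(letters))
    return numbers, letters

# Hypothetical sample lines (already stripped of their newlines).
sample = ["Numbers", "=======", "2", "1", "2", "",
          "Letters", "=======", "B", "A", "B"]

numbers, letters = split_numbers_letters(sample, unique=True)
print("Numbers", *numbers, sep="\n")
print()
print("Letters", *letters, sep="\n")
```

As with sort -u, the ordering here is plain string ordering, so "10" would sort before "2"; a numeric sort would need sort -n in the shell or key=int in Python.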

Answer 4

Here is a simple Perl solution that works for arbitrary header names:

#!/usr/bin/env perl
use strict;
use warnings;

my $head;
my %k;      ## this will hold our data
my @order;  ## headings in order of first appearance (hash keys are unordered)
my $separator="=====";
my @lines=<>; ## Read all lines
for(my $i=0; $i<=$#lines; $i++){ ## for each line...
    next if $lines[$i]=~/$separator/; ## skip the separators
    next if $lines[$i]=~/^\s*$/;      ## skip empty lines
    if ($i<$#lines && $lines[$i+1]=~/$separator/) { ## if the NEXT line is a separator,
        $head=$lines[$i];                           ## then this line is a heading
        unless (exists $k{$head}) {
            push @order,$head;
            $k{$head}=[];
        }
        next;
    }
    ## save the contents of this group into the %k hash
    push @{$k{$head}},$lines[$i] if defined $head;
}

## Now print everything; each heading still ends with its own newline
foreach (@order) {
    print "$_========\n", @{$k{$_}};
}
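
The look-ahead idea (a line is a heading when the line after it is a separator) can also be sketched in Python. This is a minimal sketch under stated assumptions, not the author's code: merge_sections is a hypothetical helper, it takes file contents as strings instead of reading from the command line, and the two sample inputs are made up to resemble the files the answers assume.

```python
from collections import OrderedDict

def merge_sections(texts, separator="===="):
    """Merge the sections of several file contents, keyed by heading."""
    merged = OrderedDict()  # heading -> list of data lines, in first-seen order
    for text in texts:
        # drop blank lines first so the look-ahead sees only meaningful lines
        lines = [ln for ln in text.splitlines() if ln.strip()]
        head = None
        for i, line in enumerate(lines):
            if line.startswith(separator):
                continue  # separator lines carry no data
            nxt = lines[i + 1] if i + 1 < len(lines) else ""
            if nxt.startswith(separator):
                head = line  # a line directly above a separator is a heading
                merged.setdefault(head, [])
            elif head is not None:
                merged[head].append(line)
    return merged

# Hypothetical sample inputs.
file1 = "Numbers\n=======\n1\n2\n3\n\nLetters\n=======\nA\nB\nC\n"
file2 = "Numbers\n=======\n4\n5\n6\n\nLetters\n=======\nD\nE\nF\n"

result = merge_sections([file1, file2])
for head, rows in result.items():
    print(head)
    print("=" * len(head))
    print("\n".join(rows))
    print()
```

Keeping the headings in an ordered mapping means the sections come out in the order they were first seen, which a plain Perl hash does not guarantee.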
