Efficiently pairing each line with a line from another file

Question 1

I got this working with the following:

awk 'FNR==NR{a[i++]=$0; max=i; next} 
{if ((NR % max) == 0) {i=max-1} else {i=(NR%max) - 1}; 
printf "%s,%s\n",$0,a[i]}' smaller_file larger_file

But if anyone knows a faster way than this, please suggest it.
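For what it's worth, the if/else in the awk script above appears to reduce to a single modulo expression, (NR - 1) % max. A minimal Python sketch of that check follows; the function names `awk_index` and `cycle_index` are made up for illustration and are not part of the original script.

```python
def awk_index(nr, size):
    """Index the awk script picks for overall record number nr (1-based)."""
    if nr % size == 0:
        return size - 1
    return nr % size - 1


def cycle_index(nr, size):
    """Hypothetical single-expression equivalent of awk_index."""
    return (nr - 1) % size


# NR keeps counting across files, so with a 3-line smaller file the
# first line of the larger file has NR == 4.
size = 3
indices = [cycle_index(nr, size) for nr in range(size + 1, size + 8)]
print(indices)
```

Because NR is cumulative, the first line of the larger file has NR equal to max + 1, which maps to index 0, so the pairing starts at the top of the smaller file.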

Question 2

It looks like you want to cycle through the contents of the smaller file.

With awk:

awk 'NR == FNR{a[++i]=$0; next}; 
 {print $0, a[FNR % i? FNR % i: i]}' smaller_file larger_file

With Python:

from itertools import cycle
# Python 3: the built-in zip is lazy, so izip is no longer needed
with open('larger_file') as f1, open('smaller_file') as f2:
    for l, m in zip(f1, cycle(f2)):
        print(l.rstrip('\n'), m.rstrip('\n'))
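To see the cycling behaviour without touching any files, here is a small sketch that runs on in-memory lists. The helper name `pair_cycled` and the sample data are assumptions for illustration only.

```python
from itertools import cycle


def pair_cycled(larger, smaller):
    """Yield (line, partner) pairs, restarting `smaller` whenever it runs out."""
    # zip stops when `larger` is exhausted; cycle never ends on its own.
    for line, partner in zip(larger, cycle(smaller)):
        yield line.rstrip("\n"), partner.rstrip("\n")


larger = ["a\n", "b\n", "c\n", "d\n", "e\n"]
smaller = ["1\n", "2\n"]
pairs = list(pair_cycled(larger, smaller))
print(pairs)
```

Note that `cycle` keeps a copy of every item it has seen, so the smaller file ends up held in memory, much like the array in the awk version.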

Question 3

paste -d",''," ./file1 - ./file2 - - </dev/null >out

...which, given the example data, writes this to out:

'1','1','1,2',
'2','2','1,3',
'3','3','1,4',
'4','4','1,5',
'5','5','1,6',
'6','6','1,7',
'7','7','1,8',
'8','8','1,9',
'9','9','1,10',
'10','10','2,1',
,'2,3',
,'2,4',
,'2,5',
,'2,6',
,'2,7',
,'2,8',
,'2,9',
,'',

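It's a little hard for me to say exactly what the criterion for stopping the output is, but to write output identical to the example output: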

{   paste -d",''," ./file1 - ./file2 - - |
    sed -ne's/,/&/4p;t' -eq
}   </dev/null

'1','1','1,2',
'2','2','1,3',
'3','3','1,4',
'4','4','1,5',
'5','5','1,6',
'6','6','1,7',
'7','7','1,8',
'8','8','1,9',
'9','9','1,10',
'10','10','2,1',
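As a rough sketch of what the pipeline above seems to do: paste cycles through the delimiter list between its five columns (file1, empty stdin, file2, empty stdin, empty stdin), and sed prints lines only while they contain a fourth comma, quitting at the first line that does not. The Python below emulates that on in-memory lists. The helper names and the sample inputs are assumptions inferred from the output shown, not the real files.

```python
from itertools import cycle, zip_longest


def paste_line(fields, delimiters):
    """Join fields the way paste -d does, cycling through the delimiter list."""
    delims = cycle(delimiters)  # the list restarts on every output line
    out = fields[0]
    for field in fields[1:]:
        out += next(delims) + field
    return out


def quoted_pairs(file1_lines, file2_lines):
    """Emulate the paste | sed pipeline on lists of lines (no newlines)."""
    result = []
    for left, right in zip_longest(file1_lines, file2_lines, fillvalue=""):
        # Columns: file1, empty stdin, file2, empty stdin, empty stdin.
        line = paste_line([left, "", right, "", ""], ",'',")
        # sed prints a line only if it has a 4th comma, and quits otherwise.
        if line.count(",") < 4:
            break
        result.append(line)
    return result


# Assumed sample inputs, reconstructed from the output above.
file1 = ["'1','1'", "'2','2'"]
file2 = ["1,2", "1,3", "2,3"]
kept = quoted_pairs(file1, file2)
print("\n".join(kept))
```

The comma count is only a stand-in for "file1 still had a line", so this sketch, like the sed filter, depends on the shape of this particular data.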

Question 4

As many have already pointed out, paste is the right tool.

paste -d ,\'\' file1 /dev/null file2 /dev/null

If file2 is shorter than file1, paste behaves as if file2 had as many empty lines at the end as needed to match file1.
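The padding behaviour can be sketched in Python with `itertools.zip_longest`. This is only an illustration on in-memory lists; `paste_quoted` is a made-up name, and the output format mirrors the paste command above (file1 line, a comma, then the file2 line in single quotes).

```python
from itertools import zip_longest


def paste_quoted(file1_lines, file2_lines):
    """Pair lines as file1,'file2', padding the shorter input with empty lines."""
    return [
        f"{left},'{right}'"
        for left, right in zip_longest(file1_lines, file2_lines, fillvalue="")
    ]


rows = paste_quoted(["a", "b", "c"], ["x"])
print(rows)
```

Once file2 runs out, the remaining rows carry an empty quoted field rather than being dropped.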

If you want file2 to repeat over and over, repeat it until you reach the number of lines in file1.

while true; do cat file2; done | head -n "$(wc -l <file1)" |
paste -d ,\'\' file1 /dev/null - /dev/null

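This requires going through file1 twice. Depending on the relative speed of CPU and I/O, it may be faster to avoid paste and use a tool that can handle multiple files more flexibly, such as awk. Here's an awk solution that doesn't require loading either file fully into memory (if file2 is small, the disk cache will take care of that anyway).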

awk -v file2=file2 '
    !getline s <file2 {close(file2); getline s <file2}
    {print $0 ",\047" s "\047"}' file1

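Explanation: getline s <file2 reads the next line from file2, opening it if necessary. If that fails (because the end of the file has been reached), close the file and start over.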
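The same reopen-on-EOF idea can be sketched in Python. This is a rough equivalent under stated assumptions, not a translation of the awk internals: `pair_streaming` is a hypothetical name, and the temporary files exist only to make the example runnable.

```python
import os
import tempfile


def pair_streaming(path1, path2):
    """Yield "line1,'line2'" strings, reopening path2 whenever it hits EOF."""
    with open(path1) as f1:
        f2 = open(path2)
        try:
            for line in f1:
                partner = f2.readline()
                if not partner:
                    # End of file2: close it and start again from the top.
                    f2.close()
                    f2 = open(path2)
                    partner = f2.readline()
                yield line.rstrip("\n") + ",'" + partner.rstrip("\n") + "'"
        finally:
            f2.close()


# Throwaway sample files standing in for file1 and file2.
with tempfile.TemporaryDirectory() as tmp:
    p1 = os.path.join(tmp, "file1")
    p2 = os.path.join(tmp, "file2")
    with open(p1, "w") as f:
        f.write("a\nb\nc\n")
    with open(p2, "w") as f:
        f.write("x\ny\n")
    streamed = list(pair_streaming(p1, p2))
print(streamed)
```

Neither file is read into memory as a whole; only one line from each is held at a time, at the cost of reopening file2 once per pass.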
