My scenario is as follows. First, I generate a side-by-side diff of two files with this command:
diff -y --suppress-common-lines file1.txt file2.txt > DiffResult.txt
Contents of DiffResult.txt:
file1.txt file2.txt
This is line A | This is line B
This is line C | This is line D
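Now suppose the lines "This is line A" and "This is line B" are line 5 of file1.txt and file2.txt respectively. I would then like the appropriate line numbers to be attached to them, like this:
Desired contents of DiffResult.txt: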
file1.txt file2.txt
5 This is line A | 5 This is line B
7 This is line C | 7 This is line D
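The reason I am taking this approach is that if I add the line numbers before running diff, then even a small whitespace change makes diff report differences, because the line numbers have become part of each line.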
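The effect of numbering the files first can be seen in a small experiment. This is only a sketch: count_changed and count_changed_numbered are hypothetical helper names, and it assumes a diff that supports -y and --suppress-common-lines (GNU diff does).

```shell
#!/bin/sh
# count_changed FILE1 FILE2
# Print how many rows a side-by-side diff reports as different.
count_changed() {
    diff -y --suppress-common-lines "$1" "$2" | wc -l | tr -d ' '
}

# count_changed_numbered FILE1 FILE2
# Same, but each line is first prefixed with its own line number,
# so the number becomes part of the text that diff compares.
count_changed_numbered() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    awk '{ print NR ": " $0 }' "$1" > "$tmp1"
    awk '{ print NR ": " $0 }' "$2" > "$tmp2"
    diff -y --suppress-common-lines "$tmp1" "$tmp2" | wc -l | tr -d ' '
    rm -f "$tmp1" "$tmp2"
}
```

If file2 is file1 with one extra line inserted near the top, count_changed reports just that one row, while count_changed_numbered reports every row from the insertion onward, because all the later line numbers have shifted.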
Does anyone have a good idea? I believe this is the trickiest question ever asked on StackExchange :D
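Answer 1
The problem can be solved by filtering the output of diff. This example works for me (though the position and size of the gutter between the left and right sides of the diff output is probably a detail that differs between implementations):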
#!/bin/sh
# $Id: diff-two-column,v 1.2 2016/09/26 20:38:32 tom Exp $
# see http://unix.stackexchange.com/questions/312025/how-to-associate-line-number-from-a-file-to-the-side-by-side-diff-output-result

usage() {
    cat >&2 <<EOF
usage: $0 file1 file2
EOF
    exit 1
}

[ $# = 2 ] || usage
[ -f "$1" ] || usage
[ -f "$2" ] || usage

# Use the terminal width if stty can report it, else $COLUMNS, else 80.
width=${COLUMNS:-80}
check=$(stty size 2>/dev/null | cut -d' ' -f2)
[ -n "$check" ] && width=$check

diff -W "$width" -y "$1" "$2" | \
expand | \
awk -v width="$width" '
BEGIN {
    L=0;    # current line number in file1
    R=0;    # current line number in file2
    gutter = width / 2;
    half = gutter - 2;
}
{
    textL = substr($0, 1, half - 1);
    sub("[ ]+$", "", textL);  # trim trailing blanks

    # The script relies on correctly extracting textM, the gutter:
    # if lines differ,    textM is " | "
    # if line inserted,   textM is " > "
    # if line deleted,    textM is " < "
    # if lines unchanged, textM is "   "
    textM = substr($0, gutter - 2, 3);
    # Test only the middle character: a deleted row may end right
    # after "<", in which case textM is just " <".
    mark = substr(textM, 2, 1);

    textR = ( length($0) > gutter ) ? substr($0, gutter+1, half) : "";

    if ( mark != ">" ) {
        L++;
    }
    if ( mark != "<" ) {
        R++;
    }

    if ( textL != textR ) {
        # printf "SHOW %s\n", $0;
        # printf "gap \"%s\"\n", textM;
        # printf "<<< \"%s\"\n", textL;
        # printf ">>> \"%s\"\n", textR;
        if ( textL == "" ) {
            printf "%5s %-*s %-3s %5d %s\n",
                " ", half, textL,
                textM,
                R, textR;
        } else if ( textR == "" ) {
            printf "%5d %-*s %-3s %5s %s\n",
                L, half, textL,
                textM,
                " ", textR;
        } else {
            printf "%5d %-*s %-3s %5d %s\n",
                L, half, textL,
                textM,
                R, textR;
        }
    } else {
        # printf "SKIP %s\n", $0;
    }
}
'
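Since the gutter position may differ between implementations, it can help to probe where the local diff actually puts the mark before trusting the substr offsets in the script. A minimal sketch, assuming a diff that accepts -W and -y; mark_column is a hypothetical helper name, not part of the script:

```shell
#!/bin/sh
# mark_column WIDTH
# Print the 1-based column in which this diff places the "|" mark
# for a changed row when asked for a side-by-side diff WIDTH wide.
mark_column() {
    dir=$(mktemp -d) || return 1
    echo x > "$dir/a"
    echo y > "$dir/b"
    # The two files differ in their only line, so diff prints a single
    # "|" row; expand turns tabs into spaces so columns can be counted.
    diff -W "$1" -y "$dir/a" "$dir/b" | expand |
        awk '{ print index($0, "|") }'
    rm -rf "$dir"
}
```

The script expects the mark to be the middle character of textM, that is, column gutter - 1 (39 for a width of 80). If mark_column prints something else for the width in use, the offsets in the awk program need adjusting.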
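You cannot add the line numbers before running diff, because if there is an insertion or deletion, the line numbers from that point onward no longer match, which makes the differences useless. My script computes the line numbers for the left and right sides of the diff inside the awk script: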
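- It first decides how wide to make the diff, based on the width of the terminal.
- There is (in GNU diff 3.2, which I tested) a gutter (unused space) in the middle of the side-by-side diff. Starting with an 80-column terminal, I worked out a way to compute the position of the gutter.
- After initializing, the script extracts the left (textL) and right (textR) strings from each line (in awk, that is $0), and tests whether they are empty (which happens when there is an insertion or deletion).
- If the left and right lines differ, the script reconstructs the output of diff, but adds the line numbers.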
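The counting step in the list above can be isolated into a few lines. This is a reduced sketch rather than the script itself: number_sdiff is a hypothetical name, it reads diff -y output on stdin, and it assumes the mark sits in column WIDTH/2 - 1, which matches the script's textM for an 80-column diff but is not guaranteed for other widths or other diff implementations.

```shell
#!/bin/sh
# number_sdiff WIDTH
# Read "diff -W WIDTH -y" output on stdin and print, for every row that
# is not common to both files, "LEFTNO MARK RIGHTNO" ("-" = no line).
number_sdiff() {
    expand | awk -v width="$1" '
    BEGIN { L = 0; R = 0; markcol = int(width / 2) - 1 }
    {
        mark = substr($0, markcol, 1)   # "|", "<", ">" or blank
        if (mark != ">") L++            # row consumes a line of file1
        if (mark != "<") R++            # row consumes a line of file2
        if (mark == "|")      print L, mark, R
        else if (mark == "<") print L, mark, "-"
        else if (mark == ">") print "-", mark, R
    }'
}
```

For example, diff -W 80 -y file1.txt file2.txt | number_sdiff 80 prints only the line-number pairs. Because each counter advances on every row that consumes a line from its side, including the common rows that are not printed, the numbers stay correct after insertions and deletions.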
Given this on the left
1
2
3
4
This is line A
6
This is line C
123456789.123456789.123456789.123456789.123456789.
yyy
and this on the right
1
2
3
4
This is line B
6
This is line D
abcdefghi.abcdefghi.abcdefghi.abcdefghi.abcdefghi.
xxx
(10 lines on the left, where line 9 is an empty line before yyy, and 9 lines on the right), the script produces
5 This is line A | 5 This is line B
7 This is line C | 7 This is line D
8 123456789.123456789.123456789.1234567 | 8 abcdefghi.abcdefghi.abcdefghi.abcdefg
| 9 xxx
10 yyy <