The sort command: the -g and -n flags

Question 1

The main difference is in how numbers in scientific notation are handled. From info sort, when sorting with -n (numeric):

 Neither a leading `+' nor exponential notation is recognized.  To
 compare such strings numerically, use the `--general-numeric-sort'
 (`-g') option.

For example,

$ cat file
+1.23e-1
1.23e-2
1.23e-3
1.23e4
1.23e+5
-1.23e6

Then

$ sort -n file
-1.23e6
+1.23e-1
1.23e-2
1.23e-3
1.23e4
1.23e+5

whereas

$ sort -g file
-1.23e6
1.23e-3
1.23e-2
+1.23e-1
1.23e4
1.23e+5
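
A smaller variant of the same comparison, as a sketch assuming GNU sort. The helper name demo_sort and the three sample values are made up for illustration, and LC_ALL=C is set only to keep the result independent of the locale. With -n the leading + is not recognized, so the line counts as 0 and sorts first; with -g it is read as 5.

```shell
# demo_sort is a hypothetical helper: it sorts the same three lines
# with whichever numeric flag is passed as $1.
demo_sort() {
    printf '%s\n' +5 3 10 | LC_ALL=C sort "$1"
}

demo_sort -n   # "+5" has no recognized numeric prefix, so it is treated as 0
demo_sort -g   # "+5" is parsed as 5 and lands between 3 and 10
```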

Question 2

From the sort info page, sort -g is explained as follows:

‘-g’
‘--general-numeric-sort’
‘--sort=general-numeric’
     Sort numerically, converting a prefix of each line to a long
     double-precision floating point number.  *Note Floating point::.
     Do not report overflow, underflow, or conversion errors.  Use the
     following collating sequence:

        • Lines that do not start with numbers (all considered to be
          equal).
        • NaNs (“Not a Number” values, in IEEE floating point
          arithmetic) in a consistent but machine-dependent order.
        • Minus infinity.
        • Finite numbers in ascending numeric order (with -0 and +0
          equal).
        • Plus infinity.

     Use this option only if there is no alternative; it is much slower
     than ‘--numeric-sort’ (‘-n’) and it can lose information when
     converting to floating point.
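
The collating sequence above can be tried out with a short sketch, assuming GNU sort on a system whose strtold accepts the spellings inf and -inf. The helper name collate_demo and the sample lines are illustrative only. Going by the quoted list, the non-numeric line should come first, then minus infinity, the finite numbers in ascending order, and plus infinity last.

```shell
# collate_demo is a hypothetical helper: it feeds a non-numeric line,
# both infinities, and two finite numbers to sort -g.
collate_demo() {
    printf '%s\n' inf 1e3 -inf abc 2 | LC_ALL=C sort -g
}

collate_demo
```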

sort -n is the natural sort we usually expect:

‘-n’
‘--numeric-sort’
‘--sort=numeric’
     Sort numerically.  The number begins each line and consists of
     optional blanks, an optional ‘-’ sign, and zero or more digits
     possibly separated by thousands separators, optionally followed by
     a decimal-point character and zero or more digits.  An empty number
     is treated as ‘0’.  The ‘LC_NUMERIC’ locale specifies the
     decimal-point character and thousands separator.  By default a
     blank is a space or a tab, but the ‘LC_CTYPE’ locale can change
     this.

     Comparison is exact; there is no rounding error.

     Neither a leading ‘+’ nor exponential notation is recognized.  To
     compare such strings numerically, use the ‘--general-numeric-sort’
     (‘-g’) option.
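
As a sketch of the "empty number is treated as 0" rule, assuming GNU sort: the helper name numeric_demo and the sample lines are made up for illustration. The empty line and the line abc both have no numeric prefix, so both compare as 0 and fall between -3 and 5; the tie between them is then broken by an ordinary comparison of the whole line.

```shell
# numeric_demo is a hypothetical helper: it sorts two real numbers,
# an empty line, and a non-numeric line with sort -n.
numeric_demo() {
    printf '%s\n' 5 abc -3 '' | LC_ALL=C sort -n
}

numeric_demo
```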

See Steeldriver's answer for a better explanation.

Question 3

From the sort manual:

‘-n’
‘--numeric-sort’
‘--sort=numeric’

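Sort numerically. The number begins each line and consists of optional blanks, an optional ‘-’ sign, and zero or more digits possibly separated by thousands separators, optionally followed by a decimal-point character and zero or more digits. An empty number is treated as ‘0’. The ‘LC_NUMERIC’ locale specifies the decimal-point character and thousands separator. By default a blank is a space or a tab, but the ‘LC_CTYPE’ locale can change this.

Comparison is exact; there is no rounding error.

Neither a leading ‘+’ nor exponential notation is recognized. To compare such strings numerically, use the ‘--general-numeric-sort’ (‘-g’) option.

and: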

‘-g’
‘--general-numeric-sort’
‘--sort=general-numeric’

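Sort numerically, converting a prefix of each line to a long double-precision floating point number. See Floating point. Do not report overflow, underflow, or conversion errors. Use the following collating sequence:

   • Lines that do not start with numbers (all considered to be equal).
   • NaNs (“Not a Number” values, in IEEE floating point arithmetic) in a consistent but machine-dependent order.
   • Minus infinity.
   • Finite numbers in ascending numeric order (with -0 and +0 equal).
   • Plus infinity.

Use this option only if there is no alternative; it is much slower than ‘--numeric-sort’ (‘-n’) and it can lose information when converting to floating point.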

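So it seems that using -g could produce incorrect comparisons because of the loss of precision, but for whatever reason I have not been able to produce such a result: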

$ printf "%s\n" 1 1.23 1.234 1.2345 1.23456 1.234567 1.2345678 1.23456789 1.23456788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888 1.23456788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888878888888888 | sort -g
1
1.23
1.234
1.2345
1.23456
1.234567
1.2345678
1.23456788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888878888888888
1.23456788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888
1.23456789

sort -g correctly places the second long decimal before the first, even though the difference between the two is far beyond the precision of a double:

$ cat test.cpp  
#include<iostream>

using namespace std;

int main()
{
    cout << (1.23456788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888887888888888888888888888 < 1.23456788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888) << endl;
    cout << (1.23456788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888887888888888888888888888 > 1.23456788888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888) << endl;
}
$ make test     
g++     test.cpp   -o test
$ ./test        
0
0
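
One possible explanation, offered as a sketch rather than a confirmed answer: GNU sort documents a last-resort comparison, in which lines whose keys compare equal are ordered by comparing the entire lines, unless -s (--stable) is given. If the two long values convert to the same floating point number, that fallback could still put them in the right order. The variable names prefix, a, b and the helper stable_sort below are made up for illustration; the sketch assumes GNU sort and uses two integers long enough that they should not be distinguishable as long doubles.

```shell
# Two integers that differ only in the last of 50 digits.
prefix=1000000000000000000000000000000000000000000000000
a="${prefix}2"
b="${prefix}1"

# stable_sort is a hypothetical helper: -s disables the last-resort
# comparison, so lines with equal keys keep their input order.
stable_sort() {
    printf '%s\n' "$a" "$b" | LC_ALL=C sort -s "$1"
}

stable_sort -n   # exact comparison: the value ending in 1 comes first
stable_sort -g   # keys look equal after conversion: input order is kept
```

Without -s, sort -g would be expected to order the same two lines correctly again, which matches what the transcript above shows for the long decimals.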

Answer