More benchmarks

Suppose I have a file with the following contents:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

First, I tried:

time cat temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.001s
user    0m0.000s
sys     0m0.001s

Second, I tried:

time grep "$"  temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.002s
user    0m0.000s
sys     0m0.002s

Third, I tried:

time awk  "/$/"  temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.004s
user    0m0.001s
sys     0m0.004s

And:

time awk 1 temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.004s
user    0m0.000s
sys     0m0.003s

Using sed:

time sed "" temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.002s
user    0m0.000s
sys     0m0.002s

Does this mean cat is the better command for printing the entire contents of a file, since it takes less time to execute?

Answer 1

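The answer is "yes". At first this is more of an assertion than a measurement, since cat merely reads the file while the other two scan it for an expression. Timing the commands with time is the right idea, but at durations this short any tiny variation skews the result. It is better to use a larger file or to repeat the command many times.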

$ time for i in {1..1000}; do cat temp.txt; done
...
real    0m0.762s
user    0m0.060s
sys     0m0.147s

$ time for i in {1..1000}; do grep "$" temp.txt; done
...
real    0m3.128s
user    0m0.667s
sys     0m0.263s

$ time for i in {1..1000}; do awk "/$/" temp.txt; done
...
real    0m3.332s
user    0m0.720s
sys     0m0.337s

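In addition (not shown), I ran the commands above several times to confirm that each one takes roughly the same time on every run, so the results are reproducible.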
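The other option mentioned above, a larger file, could be sketched roughly as follows. This is only an illustration: make_big_file, the copy count, and the scratch files are made-up names and values, not part of the original post. Output is written to a scratch file rather than /dev/null, on the assumption that GNU grep is in use, since it is known to stop at the first match when it detects /dev/null as its output, which would distort the comparison.

```shell
#!/bin/bash
# Sketch: time each command once on a large file instead of a 3-line file.
# make_big_file, the copy count and the scratch files are illustrative only.

make_big_file() {
    # $1 = source file, $2 = number of copies, $3 = output file
    local src="$1" copies="$2" out="$3" content i=0
    content=$(cat "$src")            # command substitution strips the trailing newline
    while [ "$i" -lt "$copies" ]; do
        printf '%s\n' "$content"     # printf adds the newline back
        i=$((i + 1))
    done > "$out"
}

sample=$(mktemp)
big=$(mktemp)
sink=$(mktemp)
trap 'rm -f "$sample" "$big" "$sink"' EXIT

# Same three lines as the test file in the question.
printf '%s\n' \
    'this is a simple file for testing purpose' \
    'with few lines in it.' \
    'to check the cat and grep command to verfy which is best and less excution time consuming' \
    > "$sample"

make_big_file "$sample" 50000 "$big"   # 150000 lines in total

# Every command writes to the same regular file so the output cost is comparable.
echo "cat" >&2;      time cat "$big"      > "$sink"
echo 'grep "$"' >&2; time grep '$' "$big" > "$sink"
echo "awk 1" >&2;    time awk 1 "$big"    > "$sink"
echo 'sed ""' >&2;   time sed '' "$big"   > "$sink"
```

With a single large input, process start-up is paid only once per command, so the numbers reflect per-line processing cost more than the repeated-invocation loops do.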

More benchmarks

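Following the comments, here are a few more commands I tested. On my system, grep "^" and awk "1" were not noticeably more efficient, while sed "" came close to cat.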

$ time for i in {1..1000}; do grep "^" temp.txt; done
...
real    0m2.992s
user    0m0.527s
sys     0m0.303s

$ time for i in {1..1000}; do awk "1" temp.txt; done
...
real    0m3.185s
user    0m0.570s
sys     0m0.317s

$ time for i in {1..1000}; do sed "" temp.txt; done
...
real    0m0.983s
user    0m0.077s
sys     0m0.193s

Answer 2

I have two scripts that do the same thing. One uses cat, while the other uses AWK throughout.

Here is the first one:

#!/bin/bash


        lines=$(cat /etc/passwd | wc -l)

        for ((i=1 ; i <=$lines ; i++ ))
        do
        user=$(cat /etc/passwd | awk -F : -vi=$i 'NR==i {print $1}')
        uid=$(cat /etc/passwd | awk -F : -vi=$i 'NR==i {print $3}')
        gid=$(cat /etc/passwd | awk -F : -vi=$i 'NR==i {print $4}')
        shell=$(cat /etc/passwd | awk -F : -vi=$i 'NR==i {print $7}')
        echo -e "User is : $user \t Uid is : $uid \t Gid is : $gid \t Shell is : $shell"
        done

Here is the second one:

#!/bin/bash


        lines=$(awk  'END {print NR}' /etc/passwd)

        for ((i=1 ; i <=$lines ; i++ ))
        do
        user=$(awk  -F : -vi=$i 'NR==i {print $1}' /etc/passwd)
        uid=$(awk  -F : -vi=$i 'NR==i {print $3}'  /etc/passwd)
        gid=$(awk  -F : -vi=$i 'NR==i {print $4}'  /etc/passwd)
        shell=$(awk  -F : -vi=$i 'NR==i {print $7}' /etc/passwd)
        echo -e "User is : $user \t Uid is : $uid \t Gid is : $gid \t Shell is : $shell"
        done

The first script (the one with the cat statements) took the following time:

real    0m0.215s
user    0m0.023s
sys     0m0.238s

The second script, which contains only AWK statements, took the following time:

real    0m0.132s
user    0m0.013s
sys     0m0.123s

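I think letting awk read the file itself is much faster than calling another external command to read it. I'm happy to discuss the results.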

I think AWK performs better in some cases.
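Taking the same reasoning one step further, both scripts above start several processes for every line of /etc/passwd. A single awk invocation that reads the file once could produce the same output. The following is a minimal sketch, not something measured in this answer; print_users is a hypothetical helper name, and the optional file argument is added only so that it can be tried on other input.

```shell
#!/bin/bash
# Sketch: one awk process reads the file once and prints every record,
# instead of starting several processes per line.
# print_users is a hypothetical name; the file argument defaults to /etc/passwd.

print_users() {
    awk -F: '{
        # Fields 1, 3, 4 and 7 are the user name, UID, GID and login shell.
        printf "User is : %s \t Uid is : %s \t Gid is : %s \t Shell is : %s\n", $1, $3, $4, $7
    }' "${1:-/etc/passwd}"
}

time print_users /etc/passwd
```

Timing this against the two scripts on the same machine would show how much of their run time goes into starting processes rather than into reading the file.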
