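I want to extract all commands/programs from ~/.bash_history and count how many times each one was called. The arguments to the programs are irrelevant and should be ignored. I use this bash one-liner to do that: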
cut -f 1 -d ' ' ~/.bash_history | sort | uniq -c | sort -h
But this approach misses some commands. For example, the commands mount, tee, mktemp and cut in the following lines should be counted:
sudo mount /dev/sdb2 /foo/bar
candump can0 | tee canlog
T=$(mktemp -d)
diff <(cut -f 2 -d ' ' ./foo ) ./bar
But they are not. How can I count how many times each command was called? Is there a reasonable way to do this?
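To make the failure concrete, here is a small sketch that runs the same pipeline on the four sample lines above. The helper name count_first_field and the temporary file are only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: count only the first space-separated field of each
# line, exactly as the one-liner does.
count_first_field() {
    cut -f 1 -d ' ' "$1" | sort | uniq -c | sort -h
}

# The four sample history lines, written to a temporary file.
sample="$(mktemp)"
cat > "$sample" << 'EOF'
sudo mount /dev/sdb2 /foo/bar
candump can0 | tee canlog
T=$(mktemp -d)
diff <(cut -f 2 -d ' ' ./foo ) ./bar
EOF

count_first_field "$sample"
rm -f "$sample"
```

Only the first word of each line is reported (sudo, candump, T=$(mktemp and diff), so mount, tee, mktemp and cut never show up.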
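I don't care whether mandatory builtins (like exit) are counted, but other builtins (like echo) that also exist as ordinary programs should be counted, and aliases should be counted too. There are no functions and no if/else branches. Whether loop keywords and/or commands on the same line as a loop are counted doesn't matter, since there are very few of them and the numbers don't need to be exact.
I want to do this so that I know how often each command is called and can give the most frequently used commands the shortest aliases.
Edit
For this example ~/.bash_history file: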
git commit
sudo mount /dev/sdb2 /foo/bar
gcc -Wall -o foo ./bar.c
git add bar.c
T=$(mktemp -d)
diff <(cut -f 2 -d ' ' ./foo ) <(cut -f 2 -d ' ' ./bar )
diff <(cut -f 2 -d ' ' ./foo1 ) <(cut -f 2 -d ' ' ./bar )
diff <(cut -f 2 -d ' ' ./foo2 ) <(cut -f 2 -d ' ' ./bar )
Vd <(cut -f 2 -d ' ' ./foo2 ) <(cut -f 2 -d ' ' ./bar )
T2=$(mktemp -d)
I expect output like this:
1 sudo
1 gcc
1 Vd
1 mount
2 mktemp
2 git
3 diff
8 cut
But I get this:
1 sudo
1 gcc
1 Vd
1 T=$(mktemp
1 T2=$(mktemp
2 git
3 diff
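In this case Vd is an alias for vimdiff.
Edit 2
I came up with a script that is probably more accurate, but it still has a problem. I now check every possible program and alias and count how many times it occurs in ~/.bash_history. However, if an argument to any other program matches the name of a program or alias, it gets counted too, even though it shouldn't be.
For the example above I get this output: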
<many programs with count 0 that i removed, they don't matter>
1 gcc
1 mount
1 sudo
1 Vd
2 c
2 git
2 mktemp
3 diff
8 cut
The alias c is listed because it matches the c in bar.c. Apart from that, the output is as expected.
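The false positive can be reproduced in isolation. The sketch below uses the same grep options as the script on the two history lines that contain bar.c. The helper name count_word is made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper mirroring the grep call in the script: count whole-word,
# fixed-string occurrences of "$2" anywhere in the file "$1".
count_word() {
    grep -o -w -F -- "$2" "$1" | wc -l
}

sample="$(mktemp)"
cat > "$sample" << 'EOF'
gcc -Wall -o foo ./bar.c
git add bar.c
EOF

# "c" is never run as a command here, yet both "bar.c" arguments match,
# because "." is not a word character and so acts as a word boundary for -w.
count_word "$sample" c
count_word "$sample" gcc
rm -f "$sample"
```

grep -w only checks that the match is surrounded by non-word characters. It has no notion of whether the word is in command position or is part of an argument.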
This is the script I wrote:
#!/bin/bash
#Exit script when a program encounters an error or a variable is used which was not defined
set -eu
#Use ~/.bash_history if no argument is given, otherwise use $1
INPUT="~/.bash_history"
if [ $# -ge 1 ]
then
INPUT="$1"
fi
#Store the code to count the occurrences of a program in an input file in the variable FUNCTIONCODE
#Do this because we can not call a function from find, so we have to store the function code and
# pass it to find, to work around this limitation.
#The code must be passed as "$FUNCTIONCODE <input file> <programToCount>" to a shell interpreter
#
#The code counts how many times a program occurs in the given input file and prints the count + word.
#
#The -d '' argument means read until the end of the input. But then read returns a non-zero status,
# so the || true at the end makes bash ignore that status (otherwise set -e would abort the script)
read -r -d '' FUNCTIONCODE << '.EOT' || true
#Needs the input file as $1
#Needs the program to count as $2
count()
{
#Get the program name in case we got the complete path
program="$(basename "$2")"
#grep -w will only search for full words. -F will ignore any regex and -o will print only the
# matching word. We can't use -c because that will only count one per line, even when there are
# multiple matches on a single line.
printf "%5i %s\n" "$(grep -o -w -F "$program" "$1" | wc -l)" "$program"
}
#Call the function. Since this FUNCTIONCODE string will be combined with the program argument, the
# program is given to the function count as an argument
count
.EOT
#Print how many times each function is called
PrintAll()
{
#Check every program in every PATH location
for DIR in $(echo "$PATH" | tr ":" "\n")
do
find "$DIR/" -type f,l -executable -exec bash -c "$FUNCTIONCODE $INPUT {}" \;
done
#Check every alias in ~/.bashrc . We assume all aliases start at the beginning of a line.
for ALIAS in $(grep '^alias' ~/.bashrc | cut -d ' ' -f 2 | cut -d '=' -f 1)
do
bash -c "$FUNCTIONCODE $INPUT $ALIAS"
done
#Check every alias in ~/.alias
for ALIAS in $(grep '^alias' ~/.alias | cut -d ' ' -f 2 | cut -d '=' -f 1)
do
bash -c "$FUNCTIONCODE $INPUT $ALIAS"
done
}
#Use uniq because some programs are listed twice, this happens for example when an alias matches the
# name of a program.
PrintAll | sort -h | uniq
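One possible way around the argument false positives is to count only words that appear in command position instead of searching for every known name. The following is a rough sketch under several assumptions, not a complete shell parser. It assumes that a command can start at the beginning of a line or after $(, <(, >(, ||, &&, |, ; or a backtick, that leading VAR=value assignments can be dropped, and that the word following sudo is itself a command. Quoting, options to sudo, and multi-line commands are not handled. The function name count_commands is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print "count command" pairs for the history file "$1".
# Heuristic only: quoted strings and multi-line commands are not parsed.
count_commands() {
    awk '
    {
        # Start a new segment wherever a new command can begin:
        # $( ... ), <( ... ), >( ... ), ||, &&, |, ; and backticks.
        gsub(/\$\(|[<>]\(|\|\||&&|[|;`]/, "\n")
        n = split($0, seg, "\n")
        for (i = 1; i <= n; i++) {
            s = seg[i]
            sub(/^[ \t]+/, "", s)
            # Drop leading VAR=value assignments such as "T=".
            while (s ~ /^[A-Za-z_][A-Za-z0-9_]*=/)
                sub(/^[^ \t]+[ \t]*/, "", s)
            m = split(s, w, /[ \t]+/)
            if (m == 0 || w[1] == "") continue
            print w[1]
            # Also count the command that sudo runs.
            if (w[1] == "sudo" && m > 1 && w[2] != "") print w[2]
        }
    }' "$1" | sort | uniq -c | sort -n
}

# Demo on the example history file from the first edit.
sample="$(mktemp)"
cat > "$sample" << 'EOF'
git commit
sudo mount /dev/sdb2 /foo/bar
gcc -Wall -o foo ./bar.c
git add bar.c
T=$(mktemp -d)
diff <(cut -f 2 -d ' ' ./foo ) <(cut -f 2 -d ' ' ./bar )
diff <(cut -f 2 -d ' ' ./foo1 ) <(cut -f 2 -d ' ' ./bar )
diff <(cut -f 2 -d ' ' ./foo2 ) <(cut -f 2 -d ' ' ./bar )
Vd <(cut -f 2 -d ' ' ./foo2 ) <(cut -f 2 -d ' ' ./bar )
T2=$(mktemp -d)
EOF

count_commands "$sample"
rm -f "$sample"
```

On the example file this counts cut 8 times, diff 3 times, git and mktemp twice each, and does not report the alias c, since bar.c never appears in command position. Because the numbers do not need to be exact, the unhandled cases (for example a | inside a quoted string) may be acceptable.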