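#!/bin/awk -f
# Load every line of the file named by compareTo into orderIds.
BEGIN {
    while ((getline var < compareTo) > 0)
    {
        orderIds[var] = var;
    }
}
# Print each input line that was not found in compareTo.
{
    if (orderIds[$0] == "")
    {
        print $0;
    }
}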
Run as:
awk -v compareTo="ids.log.remote" -f sample.awk ids.log.local
This works, but instead of using an associative array (like a HashMap), is there something like a HashSet in awk?
Here are the timings I got:
bash-3.2$ time grep -xFvf ids.log.local ids.log.remote > /dev/null
real 0m0.130s
user 0m0.127s
sys 0m0.002s
bash-3.2$ time grep -xFvf ids.log.local ids.log.remote > /dev/null
real 0m0.126s
user 0m0.125s
sys 0m0.000s
bash-3.2$ time grep -xFvf ids.log.local ids.log.remote > /dev/null
real 0m0.131s
user 0m0.128s
sys 0m0.002s
bash-3.2$ time awk 'NR == FNR {
orderIds[$0]; next
}
!($0 in orderIds)
' ids.log.local ids.log.remote > /dev/null
real 0m0.053s
user 0m0.051s
sys 0m0.003s
bash-3.2$ time awk 'NR == FNR {
orderIds[$0]; next
}
!($0 in orderIds)
' ids.log.local ids.log.remote > /dev/null
real 0m0.052s
user 0m0.051s
sys 0m0.001s
bash-3.2$ time awk 'NR == FNR {
orderIds[$0]; next
}
!($0 in orderIds)
' ids.log.local ids.log.remote > /dev/null
real 0m0.053s
user 0m0.051s
sys 0m0.002s
bash-3.2$ time awk -v compareTo="ids.log.local" -f checkids.awk ids.log.remote > /dev/null
real 0m0.066s
user 0m0.060s
sys 0m0.006s
bash-3.2$ time awk -v compareTo="ids.log.local" -f checkids.awk ids.log.remote > /dev/null
real 0m0.065s
user 0m0.058s
sys 0m0.008s
bash-3.2$ time awk -v compareTo="ids.log.local" -f checkids.awk ids.log.remote > /dev/null
real 0m0.061s
user 0m0.053s
sys 0m0.007s
@Dimitre Radoulov It looks like your awk is faster. Thanks.
Answer 1
I believe this is the most efficient way to do this in awk:
awk 'NR == FNR {
orderIds[$0]; next
}
!($0 in orderIds)
' ids.log.remote ids.log.local
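To spell out why this behaves like a HashSet: awk has no separate set type, but an array whose keys are referenced without assigning a value works as one. While the first file is being read, NR == FNR holds and each line becomes a key. For the second file, `$0 in orderIds` tests membership without creating new elements, unlike `orderIds[$0] == ""`, which adds every line it checks to the array. Below is a minimal sketch on made-up data. The function name `not_in`, the array name `seen`, and the sample IDs are my own illustrative assumptions, not taken from the real log files.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file $2 that do not occur in file $1.
not_in() {
    awk '
        # First file: record each line as a key only (no value stored).
        NR == FNR { seen[$0]; next }
        # Second file: print lines whose key was never recorded.
        !($0 in seen)
    ' "$1" "$2"
}

# Made-up sample data standing in for the real ID files.
tmp=$(mktemp -d)
printf '%s\n' 101 102 103     > "$tmp/remote"
printf '%s\n' 102 104 101 105 > "$tmp/local"

not_in "$tmp/remote" "$tmp/local"   # prints 104 and 105, one per line

rm -r "$tmp"
```

One caveat with this idiom: if the first file is empty, NR == FNR is also true for the second file, so nothing is printed.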
You can also try grep:
grep -xFvf ids.log.remote ids.log.local
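For reference, the flags in that grep call are: -f FILE reads the patterns from FILE, -F treats them as fixed strings rather than regular expressions, -x requires a pattern to match the whole line, and -v inverts the match so only non-matching lines are printed. A minimal sketch on made-up data follows. The function name `not_in_grep` and the sample IDs are illustrative assumptions on my part.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file $2 that do not occur in file $1.
not_in_grep() {
    # -f "$1": patterns come from the first file
    # -F: fixed strings, -x: whole-line match, -v: keep non-matching lines
    grep -xFvf "$1" "$2"
}

# Made-up sample data standing in for the real ID files.
tmp=$(mktemp -d)
printf '%s\n' 101 102 103     > "$tmp/remote"
printf '%s\n' 102 104 101 105 > "$tmp/local"

not_in_grep "$tmp/remote" "$tmp/local"   # prints 104 and 105, one per line

rm -r "$tmp"
```

Note that grep exits with status 1 when it selects no lines, which matters if the call runs under `set -e`.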