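Hi all. A few weeks ago I wrote a C program that prompts the user for the name of a text file and then for a word. The program prints the text file with a line number to the left of each line, reports how many times the word appears in the file, and lists the line numbers on which it matched.
Here is an example of it in action: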
Enter the name of a text file: bond.txt
Enter the pattern to search for: Bond
File contents:
1) Secret agent Bond had been warned not to tangle with Goldfinger.
2) But the super-criminal's latest obsession was too strong, too dangerous.
3) He had to be stopped.
4) Goldfinger was determined to take possession of half the supply of
5) mined gold in the world--to rob Fort Knox!
6) For this incredible venture he had enlisted the aid of the top
7) criminals in the U.S.A, including a bevy of beautiful thieves from the
8) Bronx. And it would take all of Bond's unique talents to make it fail--
9) as fail it must.
There is a match on line 1
There is a match on line 8
'Bond' appeared 2 times in the file bond.txt.
I am now trying to practice awk programming by reproducing the program I wrote in C, but in awk.
This is what I have been able to put together so far:
BEGIN{
    printf("Enter filename : ")
    getline file < "-"
    while((getline < file)) {
        {print "File Contents:"}
        {printf("%5d) %s\n", NR,$0)}
    }
}
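What is the best and most efficient way to parse the text file line by line and search for the word the user entered? Any tips or tricks? Thanks.
Answer 1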
$ awk '/Bond/{c++; print "There is a match on line " NR} END{print "\"Bond\" appeared " c " times in the file " FILENAME}' bond.txt
There is a match on line 1
There is a match on line 8
"Bond" appeared 2 times in the file bond.txt
How it works
awk implicitly loops over all input lines.
/Bond/{c++; print "There is a match on line " NR}
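For every line that matches the regex Bond, the counter c is incremented and a message is printed giving the line number of the match. In awk, NR is the number of lines read so far.
END{print "\"Bond\" appeared " c " times in the file " FILENAME}
After the last line has been read, a message is printed showing the total number of matching lines.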
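Note that /Bond/ counts matching lines rather than individual occurrences, and that c prints as an empty string if nothing matched. The following is a minimal sketch of one way to count every occurrence, relying on the fact that gsub returns the number of substitutions it made. The file sample.txt and its contents are made up for this illustration.

```shell
# Hypothetical sample input, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'Bond, James Bond' 'no match here' 'Bond again' > "$tmp/sample.txt"

# gsub() returns how many replacements it made. Replacing each match
# with itself ("&") leaves the line unchanged but yields the count.
# (c+0) forces a numeric 0 when there were no matches at all.
out=$(cd "$tmp" && awk '
    {
        n = gsub(/Bond/, "&")
        if (n) { c += n; print "Line " NR ": " n " occurrence(s)" }
    }
    END { print "\"Bond\" appeared " (c+0) " times in the file " FILENAME }
' sample.txt)

printf '%s\n' "$out"
rm -rf "$tmp"
```

With this sample input, line 1 contributes two occurrences and line 3 contributes one, so the summary reports 3.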
Multi-line version
For those who prefer their code spread over multiple lines:
awk '
    /Bond/{
        c++
        print "There is a match on line " NR
    }
    END{
        print "\"Bond\" appeared " c " times in the file " FILENAME
    }
' bond.txt
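Displaying the file contents before the summary
This approach reads the file twice. On the first pass it prints the file with line numbers. On the second pass it prints the summary output: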
$ awk 'FNR==NR{printf("%5d) %s\n", NR,$0);next} /Bond/{c++; print "There is a match on line " FNR} END{print "\"Bond\" appeared " c " times in the file " FILENAME}' bond.txt{,}
    1) Secret agent Bond had been warned not to tangle with Goldfinger.
    2) But the super-criminal's latest obsession was too strong, too dangerous.
    3) He had to be stopped.
    4) Goldfinger was determined to take possession of half the supply of
    5) mined gold in the world--to rob Fort Knox!
    6) For this incredible venture he had enlisted the aid of the top
    7) criminals in the U.S.A, including a bevy of beautiful thieves from the
    8) Bronx. And it would take all of Bond's unique talents to make it fail--
    9) as fail it must.
There is a match on line 1
There is a match on line 8
"Bond" appeared 2 times in the file bond.txt
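The above differs from the first version in two ways. First, the file is given on the command line twice, either written out as bond.txt bond.txt or, in bash, with the brace-expansion trick bond.txt{,}.
Second, we added the command: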
FNR==NR{printf("%5d) %s\n", NR,$0);next}
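This command is executed only when FNR==NR, where NR is the total number of lines read so far and FNR is the number of lines read from the current file. So when FNR==NR, we are reading the file for the first time. In that case we print the formatted line with printf and then next moves on to the next input line, skipping the remaining commands in the script.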
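To make the NR versus FNR behavior concrete, here is a small sketch that passes the same file twice and labels each line with the pass it belongs to. The file demo.txt and the variable name pass are made up for this illustration.

```shell
# Hypothetical two-line input, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' > "$tmp/demo.txt"

# NR keeps counting across files; FNR restarts at 1 for each file,
# so FNR == NR holds only while the first file argument is being read.
out=$(cd "$tmp" && awk '
    {
        pass = (FNR == NR) ? "pass 1" : "pass 2"
        print pass, "NR=" NR, "FNR=" FNR, $0
    }
' demo.txt demo.txt)

printf '%s\n' "$out"
rm -rf "$tmp"
```

One caveat: if the first file is empty, FNR==NR is also true while the second file is being read, so the test no longer separates the two passes.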
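Alternative
In this version we read the file only once, printing the formatted version as we go while saving the summary information to print at the end: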
$ awk '{printf("%5d) %s\n", NR,$0)} /Bond/{c++; s=s "There is a match on line " FNR ORS} END{printf "%s", s; print "\"Bond\" appeared " c " times in the file " FILENAME}' bond.txt
    1) Secret agent Bond had been warned not to tangle with Goldfinger.
    2) But the super-criminal's latest obsession was too strong, too dangerous.
    3) He had to be stopped.
    4) Goldfinger was determined to take possession of half the supply of
    5) mined gold in the world--to rob Fort Knox!
    6) For this incredible venture he had enlisted the aid of the top
    7) criminals in the U.S.A, including a bevy of beautiful thieves from the
    8) Bronx. And it would take all of Bond's unique talents to make it fail--
    9) as fail it must.
There is a match on line 1
There is a match on line 8
"Bond" appeared 2 times in the file bond.txt
Answer 2
The following should replicate the functionality of the example C code:
#!/bin/awk -f
BEGIN{
    printf("Enter the name of a text file: ")
    getline file < "-"
    printf("Enter the pattern to search for: ")
    getline searchfor < "-"
    print "File contents:"
    # Compare with > 0: getline returns -1 on error, which would loop forever.
    while ((getline < file) > 0){
        # NR does not work for files read with getline in this way, so....
        linenum++
        printf("%5d) %s\n",linenum,$0)
        if ($0 ~ searchfor){
            matchcount++
            matches=matches sprintf("There is a match on line %d\n",linenum)
        }
    }
    close(file)
    # matches already ends with a newline, so avoid print adding another.
    printf("%s", matches)
    printf("'%s' appeared %d times in file %s.\n",searchfor,matchcount,file)
}
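If the interactive prompts are not required, a non-interactive variant is also possible. The sketch below is not part of the answer above: it assumes the pattern is passed with -v and the file name is given as an ordinary argument, which lets awk's implicit loop and NR do the work. The file sample.txt and its contents are made up for this illustration.

```shell
# Hypothetical sample input, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'Secret agent Bond' 'He had to be stopped.' "Bond's talents" > "$tmp/sample.txt"

# -v sets searchfor before any input is read; \047 is the octal escape
# for a single quote, which avoids fighting the shell's quoting.
out=$(cd "$tmp" && awk -v searchfor='Bond' '
    BEGIN { print "File contents:" }
    { printf("%5d) %s\n", NR, $0) }
    $0 ~ searchfor {
        matchcount++
        matches = matches sprintf("There is a match on line %d\n", NR)
    }
    END {
        printf("%s", matches)
        printf("\047%s\047 appeared %d times in file %s.\n", searchfor, matchcount, FILENAME)
    }
' sample.txt)

printf '%s\n' "$out"
rm -rf "$tmp"
```

As in the answers above, the count is the number of matching lines. Note also that awk processes backslash escape sequences in values assigned with -v, so a pattern containing backslashes would need extra care.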