cat > file
Amy looked at her watch. He was late. The sun was setting but Jake didn’t care.
wc file
1 16 82 file
Can someone explain why the wc command reports 3 extra characters in this case?
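Answer 1
wc shows a higher count because your sample file contains a fancy Unicode apostrophe ’, which takes 3 bytes in UTF-8 (most likely because you copied the text from a browser or a text editor):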
$ cat file
Amy looked at her watch. He was late. The sun was setting but Jake didn’t care.
$ wc file
1 16 82 file
With a plain ASCII apostrophe ':
$ cat file2
Amy looked at her watch. He was late. The sun was setting but Jake didn't care.
$ wc file2
1 16 80 file2
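By default, wc reports bytes, not characters. From the manual:
print newline, word, and byte counts for each file
To count characters, use the -m option: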
$ cat file
Amy looked at her watch. He was late. The sun was setting but Jake didn’t care.
$ wc -m file
80 file
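The byte difference between the two apostrophes can be checked directly. This is a minimal sketch, assuming a POSIX shell. The variable names are made up for illustration, and the typographic apostrophe (U+2019) is written as its UTF-8 bytes in octal so the snippet does not depend on how the text was copied:

```shell
# U+2019 (’) in UTF-8 is the byte sequence E2 80 99, written here in octal.
fancy_bytes=$(printf '\342\200\231' | wc -c | tr -d ' ')
# The ASCII apostrophe is a single byte (0x27).
ascii_bytes=$(printf "'" | wc -c | tr -d ' ')

echo "fancy: $fancy_bytes byte(s), ascii: $ascii_bytes byte(s)"
echo "difference: $((fancy_bytes - ascii_bytes))"
```

The difference of 2 matches the gap between 82 and 80 in the outputs above. Keep in mind that wc -m interprets characters according to the current locale, so in a non-UTF-8 locale such as C it may report the byte count instead.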
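Answer 2
Pipe the file through xxd to see a hex dump side by side with the ASCII rendering. This lets you check whether the file contains extra characters that are invisible or unprintable.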
$ cat file
one and two
$ cat file | wc
1 3 18
$ cat file | xxd
00000000: 6f6e 65e2 808f 2061 6e64 20e2 808f 7477 one... and ...tw
00000010: 6f0a o.
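If xxd is not installed, a rough alternative is to count the bytes that fall outside the ASCII range. This is a sketch assuming POSIX printf, tr and wc. The sample function is a hypothetical helper that recreates the text from the hex dump above, with the invisible character (bytes e2 80 8f, which is U+200F in UTF-8) written in octal:

```shell
# Hypothetical helper: emits "one", the invisible character, " and ",
# the invisible character again, "two", and a newline.
sample() { printf 'one\342\200\217 and \342\200\217two\n'; }

total_bytes=$(sample | wc -c | tr -d ' ')
# Delete every ASCII byte (octal 000-177) and count what is left.
non_ascii_bytes=$(sample | LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' ')

echo "total: $total_bytes, non-ASCII: $non_ascii_bytes"
```

A non-zero second number means the input contains bytes that a plain cat will not make obvious. To check a real file, replace the sample helper with input redirected from that file.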
Answer 3
wc counts bytes, not characters. If you want to count characters, use the -m option:
cat > file
Amy looked at her watch. He was late. The sun was setting but Jake didn’t care.
wc -l -w -m file
1 16 80 file
The remaining "extra character" is simply the newline at the end of the file.
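The trailing newline is easy to overlook; with cat > file it typically comes from pressing Enter before ending the input. A small sketch, assuming a POSIX shell, that counts the same text with and without a trailing newline (the variable names are illustrative):

```shell
# Same five visible characters, once without and once with a final newline.
without_newline=$(printf 'care.' | wc -c | tr -d ' ')
with_newline=$(printf 'care.\n' | wc -c | tr -d ' ')

echo "without: $without_newline, with: $with_newline"
```

The counts differ by exactly one, which accounts for the last unexplained character in the question.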