I'm learning how redirection works. A typical use looks like this:
command > file 2>&1
Based on APUE sections 3.10 and 3.12, I think the key sequence of system calls is:
open(file) == 3
dup2(3,1)
dup2(1,2)
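For illustration, here is a minimal Python sketch of that sequence (Python's os module wraps the same system calls). The function name run_with_redirect is hypothetical, and wrapping the three calls in a fork/exec is an assumption about where a shell would perform them, not something the trace above confirms.

```python
import os
import sys


def run_with_redirect(path, argv):
    """Run argv with stdout and stderr sent to path, like `argv > path 2>&1`."""
    sys.stdout.flush()
    sys.stderr.flush()
    pid = os.fork()
    if pid == 0:
        # Child: rearrange the descriptors, then replace the process image.
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
            os.dup2(fd, 1)  # "> file": fd 1 now refers to the file
            os.close(fd)    # the temporary descriptor is no longer needed
            os.dup2(1, 2)   # "2>&1": fd 2 becomes a copy of fd 1
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)   # only reached if one of the calls above failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Calling run_with_redirect("tmp.txt", ["./test.sh"]) should leave both the normal output and the error message in tmp.txt, because fd 1 and fd 2 share one open file description.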
To test my idea, I wrote a shell script and ran it under strace. My test.sh is:
#!/bin/sh
echo 'hello'
whatfuckis # For get a stderr
Then I used strace to trace the system calls and signals:
strace -y -o trace.log ./test.sh > tmp.txt 2>&1
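tmp.txt receives the error message as expected. In trace.log I can see many system calls (read, write, close, fcntl, etc.), and I can even see that file descriptor 2 (stderr) points to tmp.txt.
The problem is that I don't see any file descriptor duplication such as dup2, which is strange. How can I trace the file descriptor duplication? The relevant part of trace.log is: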
read(10</home/madhouse/Applications/Lab/test.sh>, "#!/bin/sh\n\necho 'hello'\nwhatfuck"..., 8192) = 54
write(1</home/madhouse/Applications/Lab/tmp.txt>, "hello\n", 6) = 6
stat("/home/MATLAB/R2016b/bin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/home/madhouse/Qt5.14.2/5.14.2/gcc_64//whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/home/madhouse/Qt5.14.2/Tools/QtCreator/bin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/usr/local/lib/nodejs/node-v12.22.5-linux-x64/bin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/home/madhouse/.local/bin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/usr/local/sbin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/usr/bin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/sbin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/bin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/usr/games/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/usr/local/games/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
stat("/snap/bin/whatfuckis", 0x7ffc61f2a380) = -1 ENOENT (No such file or directory)
write(2</home/madhouse/Applications/Lab/tmp.txt>, "./test.sh: 4: ", 14) = 14
write(2</home/madhouse/Applications/Lab/tmp.txt>, "whatfuckis: not found", 21) = 21
write(2</home/madhouse/Applications/Lab/tmp.txt>, "\n", 1) = 1
read(10</home/madhouse/Applications/Lab/test.sh>, "", 8192) = 0
exit_group(127) = ?
+++ exited with 127 +++
Log from aviro's command:
167936 execve("/usr/bin/sh", ["sh", "-c", "./test.sh > tmp.txt 2>&1"], 0x7ffc000ac898 /* 76 vars */) = 0
167936 dup2(3</home/madhouse/Applications/Lab/tmp.txt>, 1) = 1</home/madhouse/Applications/Lab/tmp.txt>
167936 close(3</home/madhouse/Applications/Lab/tmp.txt>) = 0
167936 dup2(1</home/madhouse/Applications/Lab/tmp.txt>, 2) = 2</home/madhouse/Applications/Lab/tmp.txt>
167936 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f67e121d850) = 167937
167937 execve("./test.sh", ["./test.sh"], 0x55855ac8c2d8 /* 76 vars */) = 0
167937 write(1</home/madhouse/Applications/Lab/tmp.txt>, "hello\n", 6) = 6
167937 write(2</home/madhouse/Applications/Lab/tmp.txt>, "./test.sh: 4: ", 14) = 14
167937 write(2</home/madhouse/Applications/Lab/tmp.txt>, "whatfuckis: not found", 21) = 21
167937 write(2</home/madhouse/Applications/Lab/tmp.txt>, "\n", 1) = 1
167936 dup2(10</dev/pts/2>, 1</home/madhouse/Applications/Lab/tmp.txt>) = 1</dev/pts/2>
167936 dup2(11</dev/pts/2>, 2</home/madhouse/Applications/Lab/tmp.txt>) = 2</dev/pts/2>
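Answer 1
You have two problems:
- You are applying the redirection to the strace command itself, so the redirection is performed by your interactive shell, and strace cannot see the relevant system calls.
- The dup2 system calls are executed inside a child process, before your command is executed. By default, strace does not follow the children of the traced command, so you won't see the child's trace. You need to add the -f flag to strace to make it follow children as well.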
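The first point can be checked with a small Python sketch: a program launched with its output already redirected makes no dup2 call of its own, yet its descriptors 1 and 2 already refer to the file. The names CHILD and inherited_inodes are hypothetical, and subprocess stands in for the interactive shell here.

```python
import os
import subprocess
import sys

# Child program: it performs no redirection itself; it only reports which
# file its inherited descriptors 1 and 2 refer to (by inode number).
CHILD = "import os; print(os.fstat(1).st_ino, os.fstat(2).st_ino)"


def inherited_inodes(path):
    """Launch the child with stdout/stderr pre-redirected to path."""
    with open(path, "w") as out:
        subprocess.run([sys.executable, "-c", CHILD],
                       stdout=out, stderr=subprocess.STDOUT, check=True)
    with open(path) as f:
        ino1, ino2 = map(int, f.read().split())
    # Return the child's view of fd 1 and fd 2, plus the file's real inode.
    return ino1, ino2, os.stat(path).st_ino
```

All three returned values should be equal, which is the same situation strace is in when the interactive shell has set up the redirection before strace even starts.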
The bottom line is that, to perform the check you intended, you need to run:
strace -yf -o trace.log sh -c './test.sh > tmp.txt 2>&1'
Then you'll see the sequence you expect:
$ grep -E 'dup2\(|clone\(|execve\(|(open|write|close)\(.*tmp.txt' trace.log
31769 execve("/usr/bin/sh", ["sh", "-c", "./test.sh > tmp.txt 2>&1"], [/* 124 vars */]) = 0
31769 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ffff7fd19d0) = 31770
31770 open("tmp.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3</tmp/tmp.txt>
31770 dup2(3</tmp/tmp.txt>, 1</dev/pts/445>) = 1</tmp/tmp.txt>
31770 close(3</tmp/tmp.txt>) = 0
31770 dup2(1</tmp/tmp.txt>, 2</dev/pts/445>) = 2</tmp/tmp.txt>
31770 execve("./test.sh", ["./test.sh"], [/* 123 vars */]) = 0
31770 dup2(3</tmp/test.sh>, 255) = 255</tmp/test.sh>
31770 write(1</tmp/tmp.txt>, "hello\n", 6) = 6
31770 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ffff7fd19d0) = 31771
31771 write(2</tmp/tmp.txt>, "./test.sh: line 3: whatfuckis: c"..., 49) = 49
Notice that the open and dup2 system calls are executed inside the child (pid 31770) of the sh -c... command, before test.sh is actually executed.
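If the log is long, the same check can be scripted. The following is a rough Python sketch that assumes the "PID syscall(...)" line format that strace -f -o produces in the logs shown here; the helper name dup2_before_exec is hypothetical.

```python
import re

# Matches the "PID syscall(" prefix found on each line of the logs above.
LINE = re.compile(r"^(\d+)\s+(\w+)\(")


def dup2_before_exec(lines, program):
    """Return the PIDs that called dup2 before their execve of `program`."""
    seen_dup2 = set()
    result = []
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # skip lines that are not plain syscall entries
        pid, syscall = m.group(1), m.group(2)
        if syscall == "dup2":
            seen_dup2.add(pid)
        elif syscall == "execve" and f'execve("{program}"' in line:
            if pid in seen_dup2:
                result.append(pid)
    return result
```

For example, dup2_before_exec(open("trace.log"), "./test.sh") would report pid 31770 for the log above. An empty result means the process that executed test.sh did not call dup2 itself.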
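Duplicating the file descriptors in the parent instead of the child
It's possible (depending on the shell's implementation) that the file descriptors are duplicated in the parent, before it executes clone. In that case, the parent needs to save the original stdout and stderr file descriptors and restore them after the child process terminates.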
$ grep -E 'dup2\(|clone\(|execve\(|(open|write|close|openat)\(.*tmp.txt|F_DUPFD|exit_group' trace.log
1094 execve("/usr/bin/sh", ["sh", "-c", "./test.sh > tmp.txt 2>&1"], 0x7fffd0324118 /* 20 vars */) = 0
1094 openat(AT_FDCWD, "tmp.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3</tmp/tmp.txt>
1094 fcntl(1</dev/pts/0>, F_DUPFD, 10) = 10</dev/pts/0>
1094 dup2(3</tmp/tmp.txt>, 1) = 1</tmp/tmp.txt>
1094 close(3</tmp/tmp.txt>) = 0
1094 fcntl(2</dev/pts/0>, F_DUPFD, 10) = 11</dev/pts/0>
1094 dup2(1</tmp/tmp.txt>, 2) = 2</tmp/tmp.txt>
1094 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f844aa71690) = 1095
1095 execve("./test.sh", ["./test.sh"], 0x7f844aa9dc08 /* 20 vars */) = 0
1095 fcntl(3</tmp/test.sh>, F_DUPFD, 10) = 10</tmp/test.sh>
1095 write(1</tmp/tmp.txt>, "hello\n", 6) = 6
1095 write(2</tmp/tmp.txt>, "./test.sh: 3: ", 14) = 14
1095 write(2</tmp/tmp.txt>, "whatfuckis: not found", 21) = 21
1095 write(2</tmp/tmp.txt>, "\n", 1) = 1
1095 exit_group(127) = ?
1094 dup2(10</dev/pts/0>, 1</tmp/tmp.txt>) = 1</dev/pts/0>
1094 dup2(11</dev/pts/0>, 2</tmp/tmp.txt>) = 2</dev/pts/0>
1094 exit_group(127)
See how, before each dup2, the parent process (pid 1094) duplicates file descriptors 1 and 2 to file descriptors 10 and 11 respectively (using the fcntl system call with the F_DUPFD command).
Then, after the child process (pid 1095) finishes, the parent restores file descriptors 10 and 11 (the original stdout and stderr) back to file descriptors 1 and 2.
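The save, redirect, and restore pattern in that log can be sketched in Python roughly as follows. This is only an approximation of what such a shell does: the function name run_redirected_in_parent is hypothetical, and closing the saved descriptors in the child is a choice made for this sketch, not something the log shows.

```python
import fcntl
import os
import sys


def run_redirected_in_parent(path, argv):
    """Redirect fds 1 and 2 in the parent, fork/exec argv, then restore them."""
    sys.stdout.flush()
    sys.stderr.flush()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    saved_out = fcntl.fcntl(1, fcntl.F_DUPFD, 10)  # save stdout at an fd >= 10
    os.dup2(fd, 1)                                 # "> file"
    os.close(fd)
    saved_err = fcntl.fcntl(2, fcntl.F_DUPFD, 10)  # save stderr at an fd >= 10
    os.dup2(1, 2)                                  # "2>&1"
    try:
        pid = os.fork()
        if pid == 0:
            try:
                # Sketch-only choice: do not leak the saved copies to argv.
                os.close(saved_out)
                os.close(saved_err)
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if exec failed
        _, status = os.waitpid(pid, 0)
    finally:
        # Put the original stdout and stderr back on fds 1 and 2.
        os.dup2(saved_out, 1)
        os.dup2(saved_err, 2)
        os.close(saved_out)
        os.close(saved_err)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Traced with strace -f, a program like this would show the dup2 calls under the parent's pid, and the child would only inherit the already redirected descriptors.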