`pidof` check on the host fails when a docker container runs the same application (e.g. apache2)

Question 1

Unfortunately, the init script only carries the remark `# can't use pidofproc from LSB here`, with no real explanation. I still think this apache2 script has a bug that is worth reporting.

TL;DR: the solution is to replace `pidof apache2` with `pgrep --ns 1 ^apache2$` (or, if that doesn't work, `pgrep --ns 1 --nslist uts ^apache2$`).

Here is the detailed explanation of namespaces, with the example I wrote before finding out that `pgrep` can do this:

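Once `pidof` has returned the "candidates", they can be told apart by checking their namespaces and comparing them with the namespaces of pid 1 (init/systemd). The example uses lxc and an inetd process, but the method is agnostic to the container technology and the process name: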

# lxc-start stretch-amd64
# pidof inetd
10285 3372
# ls -l /proc/1/ns/
total 0
lrwxrwxrwx. 1 root root 0 nov.   9 19:49 cgroup -> cgroup:[4026531835]
lrwxrwxrwx. 1 root root 0 nov.   9 19:49 ipc -> ipc:[4026531839]
lrwxrwxrwx. 1 root root 0 nov.   9 19:49 mnt -> mnt:[4026531840]
lrwxrwxrwx. 1 root root 0 nov.   9 19:49 net -> net:[4026531993]
lrwxrwxrwx. 1 root root 0 nov.   9 19:49 pid -> pid:[4026531836]
lrwxrwxrwx. 1 root root 0 nov.   9 19:49 pid_for_children -> pid:[4026531836]
lrwxrwxrwx. 1 root root 0 nov.   9 19:49 user -> user:[4026531837]
lrwxrwxrwx. 1 root root 0 nov.   9 19:49 uts -> uts:[4026531838]
# ls -l /proc/3372/ns/
total 0
lrwxrwxrwx. 1 root root 0 nov.   9 19:51 cgroup -> cgroup:[4026531835]
lrwxrwxrwx. 1 root root 0 nov.   9 19:51 ipc -> ipc:[4026531839]
lrwxrwxrwx. 1 root root 0 nov.   9 19:51 mnt -> mnt:[4026531840]
lrwxrwxrwx. 1 root root 0 nov.   9 19:51 net -> net:[4026531993]
lrwxrwxrwx. 1 root root 0 nov.   9 19:51 pid -> pid:[4026531836]
lrwxrwxrwx. 1 root root 0 nov.   9 19:51 pid_for_children -> pid:[4026531836]
lrwxrwxrwx. 1 root root 0 nov.   9 19:51 user -> user:[4026531837]
lrwxrwxrwx. 1 root root 0 nov.   9 19:51 uts -> uts:[4026531838]
# ls -l /proc/10285/ns/
total 0
lrwxrwxrwx. 1 root root 0 nov.   9 19:50 cgroup -> cgroup:[4026532516]
lrwxrwxrwx. 1 root root 0 nov.   9 19:50 ipc -> ipc:[4026532415]
lrwxrwxrwx. 1 root root 0 nov.   9 19:50 mnt -> mnt:[4026532410]
lrwxrwxrwx. 1 root root 0 nov.   9 19:50 net -> net:[4026532418]
lrwxrwxrwx. 1 root root 0 nov.   9 19:50 pid -> pid:[4026532416]
lrwxrwxrwx. 1 root root 0 nov.   9 19:50 pid_for_children -> pid:[4026532416]
lrwxrwxrwx. 1 root root 0 nov.   9 19:50 user -> user:[4026531837]
lrwxrwxrwx. 1 root root 0 nov.   9 19:50 uts -> uts:[4026532414]

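Here it is clearly visible that pid 3372 shares all of pid 1's namespaces, so 3372 is running on the host. 10285 shares none of them (except the user namespace, which is the same because the container runs as root), so it is in a container. Some programs running on the host may change some of their namespaces for various reasons (usually security related), but they should not change the uts (hostname) namespace. So here is a script using `pidof` and `stat`, with the process name given in the parameter "$1" (e.g. via `set -- inetd`, or as an argument to a script), which returns only the processes in the same uts namespace as pid 1, which usually means the (same) host.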

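# Reference value: the uts namespace link of PID 1
pid1uts="$(stat -c %N /proc/1/ns/uts | cut -d' ' -f3)"
for i in $(pidof "$1"); do
    # Keep only the candidates in the same uts namespace as PID 1
    if [ "$pid1uts" = "$(stat -c %N "/proc/$i/ns/uts" | cut -d' ' -f3)" ]; then
        echo "$i"
    fi
done | xargs -r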

In my example, this returns 3372.
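The same comparison can also be sketched as a pair of reusable functions, reading the namespace link with `readlink` instead of parsing `stat` output. This is only a minimal sketch: the names `same_ns` and `host_pids`, the optional namespace-type argument, and the optional reference PID are my own assumptions, not something taken from the apache2 init script. Reading another user's `/proc/PID/ns/` links normally requires root.

```shell
#!/bin/sh
# same_ns PID_A PID_B [NSTYPE]  (hypothetical helper)
# Succeeds if both processes are in the same namespace of type NSTYPE
# (uts by default). Returns 2 if either namespace link cannot be read.
same_ns() {
    a=$(readlink "/proc/$1/ns/${3:-uts}") || return 2
    b=$(readlink "/proc/$2/ns/${3:-uts}") || return 2
    [ "$a" = "$b" ]
}

# host_pids NAME [REF_PID]  (hypothetical helper)
# Prints, one per line, the PIDs returned by pidof NAME that share the
# uts namespace of REF_PID (PID 1 by default).
host_pids() {
    for pid in $(pidof "$1"); do
        if same_ns "${2:-1}" "$pid"; then
            echo "$pid"
        fi
    done
}
```

Run as root on the host, `host_pids inetd` should keep only the candidates that live in the same uts namespace as pid 1.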

I explained how to do it by hand, but why reinvent the wheel when `pgrep` already has options to handle this:

# pgrep ^inetd$
3372
10285
# pgrep --ns 1 --nslist uts ^inetd$
3372

Or, for most cases:

# pgrep --ns 1 ^inetd$
3372
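As a rough idea of how a status check could use this, here is a small sketch wrapping the `pgrep` call above. The function names `host_pids_of` and `is_running_on_host` and the optional reference-PID argument are hypothetical, and it assumes a `pgrep` that supports `--ns` and `--nslist` (procps-ng). The name is inserted into a regular expression as-is, so it is only suitable for plain names such as apache2 or inetd.

```shell
#!/bin/sh
# host_pids_of NAME [REF_PID]  (hypothetical helper)
# Lists the PIDs whose process name is exactly NAME and that share the
# uts namespace of REF_PID (PID 1 by default).
host_pids_of() {
    pgrep --ns "${2:-1}" --nslist uts "^$1\$"
}

# is_running_on_host NAME [REF_PID]  (hypothetical helper)
# Succeeds if at least one such process exists; prints nothing.
is_running_on_host() {
    host_pids_of "$1" "${2:-1}" >/dev/null
}
```

A script could then test `is_running_on_host apache2` where it previously tested the output of `pidof apache2`, so that processes inside containers are ignored.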

Answer