Hung process cannot be killed, even with kill -9 -1

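In our Solaris environment we have observed that processes we run sometimes hang indefinitely and cannot be terminated. The only thing that works is rebooting the server. Even kill -9 -1 did not succeed. Sending signal 15 (SIGTERM), 1 (SIGHUP), or 2 (SIGINT) has no effect either.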

The process does not consume any CPU, and it does not interfere with the execution of other processes.

How can such a process be killed?

Some logs that may be useful:

10:03 tsstool@selilsx592[119]/proj/2gsim/usr/qzbiwis/ATE/hanging_process/selilsx592> /usr/ucb/ps -alxwww | grep -i sea
0 50167  2673     1  0  59 201115264361720 6003fa6b7ea S ?         0:44 /proj/lmdoste/tools/steroot/apps/sea/sea_R38A/lib/sea/bin/64bit/SEA -l INIT.sco,AXEMANAGER.sco,SEAGUI.sco,RTS.sco,SCHR.sco -p 5001 -e node-name
0 50167 13107 13068  0  59 20 4288 3408 3000b6f7366 S ?         0:00 tcsh -c cd "/proj/2gsim/Tools/.jenkins/SEA_selilsx592" && /app/jdk/1.7.0_55/bin/java  -jar slave.jar
0 50167 11815 11810  0  59 20 3720 3072 3001021e55c S pts/1     0:00 bash /proj/lmdoste/tools/steroot/apps/toolbox/bscste_toolbox -n /proj/2gsim/usr/ate/hibiscus_tmp//sea_network/1504762574024/test.toolbox -r /proj/lmdoste/tools/steroot -s --tcctrl AUTO -d 2
0 50167 11830 11815  0  59 20 3720 2504 60031e9e138 S pts/1     0:00 bash /proj/lmdoste/tools/steroot/apps/toolbox/bscste_toolbox -n /proj/2gsim/usr/ate/hibiscus_tmp//sea_network/1504762574024/test.toolbox -r /proj/lmdoste/tools/steroot -s --tcctrl AUTO -d 2
0 50167 19256 13040  0  49 20 1768 1408 600278eebbc S pts/2     0:00 grep -i sea

10:08 tsstool@selilsx592[133]/proj/2gsim/usr/qzbiwis/ATE/hanging_process/selilsx592> pflags 2673 
2673:   /proj/lmdoste/tools/steroot/apps/sea/sea_R38A/lib/sea/bin/64bit/SEA -l
        data model = _LP64  flags = ORPHAN|RLC|MSACCT|MSFORK
        sigpend = 0x00004001,0x00000000
/1:    flags = DSTOP
        sigmask = 0x00000004,0x00000000

10:08 tsstool@selilsx592[131]/proj/2gsim/usr/qzbiwis/ATE/hanging_process/selilsx592> ps -l 2673 
usage: ps [ -aAdeflcjLPyZ ] [ -o format ] [ -t termlist ]
        [ -u userlist ] [ -U userlist ] [ -G grouplist ]
        [ -p proclist ] [ -g pgrplist ] [ -s sidlist ] [ -z zonelist ]
  'format' is one or more of:
        user ruser group rgroup uid ruid gid rgid pid ppid pgid sid taskid ctid
        pri opri pcpu pmem vsz rss osz nice class time etime stime zone zoneid
        f s c lwp nlwp psr tty addr wchan fname comm args projid project pset

10:04 tsstool@selilsx592[126]/proj/2gsim/usr/qzbiwis/ATE/hanging_process/selilsx592> pstack 2673
pstack: cannot examine 2673: process is traced

last pid: 19355;  load averages:  0.02,  0.02,  0.02                                                                                                                                    10:07:10
71 processes:  70 sleeping, 1 on cpu
CPU states:     % idle,     % user,     % kernel,     % iowait,     % swap
Memory: 16G real, 12G free, 1470M swap in use, 17G swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
19353 tsstool    1  38    0 3256K 2368K cpu/3    0:00  0.08% top
13120 tsstool   23  59    0  163M   88M sleep    6:12  0.04% java
   363 root       1  59    0   48M   46M sleep   29:22  0.01% .vasd
   362 root       1  59    0   32M   28M sleep   81:25  0.01% .vasd
   361 root       1  59    0   34M   29M sleep   27:56  0.01% .vasd
13068 tsstool    1  59    0   11M 5712K sleep    0:04  0.01% sshd
   365 root       1  59    0   14M   11M sleep    2:27  0.00% .vasd
   555 root       2  59    0 7600K 4616K sleep   49:50  0.00% automountd
   355 root       1  59    0   15M   12M sleep    5:19  0.00% .vasd
   181 root      31  59    0 9256K 7128K sleep    3:22  0.00% nscd
   452 daemon     1  59    0 8712K 4536K sleep    2:37  0.00% .vasypd
   874 op5nrpe    1  59    0 5528K 2000K sleep    1:16  0.00% nrpe
   515 root       1 100  -20 3232K 2096K sleep    0:52  0.00% xntpd
     1 root       1  59    0 3128K 2272K sleep    0:46  0.00% init
  2673 tsstool    1  59    0 1089M  353M sleep    0:44  0.00% SEA

10:04 tsstool@selilsx592[124]/proj/2gsim/usr/qzbiwis/ATE/hanging_process/selilsx592> psig -n 2673 
2673:   /proj/lmdoste/tools/steroot/apps/sea/sea_R38A/lib/sea/bin/64bit/SEA -l
HUP     caught  0xffffffff4004a500      RESTART
INT     caught  0xffffffff4004a500      RESTART
QUIT    blocked,caught  0xffffffff4004a500      RESTART
ILL     caught  0xffffffff4004a500      RESTART
TRAP    caught  0xffffffff4004a500      RESTART
ABRT    caught  0xffffffff4004a500      RESTART
EMT     default
FPE     caught  0xffffffff4004a500      RESTART
KILL    default
BUS     caught  0xffffffff4004a500      RESTART
SEGV    caught  0xffffffff4004a500      RESTART
SYS     default
PIPE    ignored
ALRM    default
TERM    caught  0xffffffff4004a500      RESTART
USR1    default
USR2    default
CLD     caught  0x100010ae0     RESTART,SIGINFO
PWR     default
WINCH   default
URG     default
POLL    default
STOP    default
TSTP    default
CONT    default
TTIN    default
TTOU    default
VTALRM  default
PROF    default
XCPU    default
XFSZ    default
WAITING default
LWP     default
FREEZE  default
THAW    default
CANCEL  default
LOST    default
XRES    default
JVM1    default
JVM2    default
RTMIN   default
RTMIN+1 default
RTMIN+2 default
RTMIN+3 default
RTMAX-3 default
RTMAX-2 default
RTMAX-1 default
RTMAX   default
10:04 tsstool@selilsx592[125]/proj/2gsim/usr/qzbiwis/ATE/hanging_process/selilsx592> pargs 2673 
2673:   /proj/lmdoste/tools/steroot/apps/sea/sea_R38A/lib/sea/bin/64bit/SEA -l INIT.sco
argv[0]: /proj/lmdoste/tools/steroot/apps/sea/sea_R38A/lib/sea/bin/64bit/SEA
argv[1]: -l
argv[2]: INIT.sco,AXEMANAGER.sco,SEAGUI.sco,RTS.sco,SCHR.sco
argv[3]: -p
argv[4]: 5001
argv[5]: -e
argv[6]: node-name

Answer 1

/1:    flags = DSTOP
...
pstack: cannot examine 2673: process is traced

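Stop whatever is debugging or tracing the process. Then make sure it is running again by using prun (for example, prun 2673). Then try to kill it again.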
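
The steps above could be scripted roughly as follows. This is only a sketch: resume_and_kill is a hypothetical helper name, and it assumes the debugger or tracer holding the process has already been stopped, since prun may otherwise fail or the process may simply be stopped again. pflags and prun are the Solaris proc tools already used in the question.

```shell
#!/bin/sh
# Sketch only: resume a process left stopped via /proc, then kill it.
# resume_and_kill is a hypothetical helper, not a standard tool.
resume_and_kill() {
    pid=$1

    # pflags prints DSTOP for an LWP when a stop directive is in effect.
    if pflags "$pid" | grep DSTOP >/dev/null; then
        # prun sets the stopped process running again (inverse of pstop).
        prun "$pid" || return 1
    fi

    # With the process running again, retry the kill as the answer suggests.
    kill -9 "$pid"
}
```

With the PID from the logs, the call would be resume_and_kill 2673. If pflags still reports DSTOP afterwards, something is most likely still tracing the process.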
