What is causing these imapd processes to crash?

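We have a Mac OS X 10.5 Leopard Server mail server that recently started having trouble with an IMAP mailbox, which began reporting an "invalid format" this past weekend. It turned out that the volume holding the IMAP data had some bad blocks; after repairing the volume and the affected mailbox, that problem has not recurred. However, a persistent new problem is that imaps processes are crashing frequently, and the number of "lockers" reported in db4 errors keeps climbing, as shown here: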

Apr 13 17:01:12 host lmtpunix[31509]: DBERROR db4: 1134 lockers

The entries in /var/log/system.log for a crashing imaps process look like this:

Apr 12 13:43:10 host imaps[11792]: starttls: TLSv1 with cipher AES128-SHA (128/128 bits new) no authentication
Apr 12 13:43:12 host imaps[11792]: starttls: TLSv1 with cipher AES128-SHA (128/128 bits new) no authentication
Apr 12 13:43:13 host imaps[11792]: login: pool-72-92-XXX-XXX.burl.east.myfairpoint.net [72.92.XXX.XXX] user3 CRAM-MD5+TLS User logged in
Apr 12 13:43:15 host ReportCrash[14362]: Formulating crash report for process imapd[11792]
Apr 12 13:43:15 host master[94896]: process 11792 exited, signaled to death by 11
Apr 12 13:43:15 host ReportCrash[14362]: Saved crashreport to /Library/Logs/CrashReporter/imapd_2011-04-12-134315_host.crash using uid: 0 gid: 0, euid: 0 egid: 0

The following is from /var/log/mailaccess.log:

Apr 12 13:43:10 host imaps[11792]: accepted connection
Apr 12 13:43:10 host imaps[11792]: mydelete: starting txn 2147495107
Apr 12 13:43:10 host imaps[11792]: mydelete: committing txn 2147495107
Apr 12 13:43:10 host imaps[11792]: mystore: starting txn 2147495108
Apr 12 13:43:10 host imaps[11792]: mystore: committing txn 2147495108
Apr 12 13:43:10 host imaps[11792]: starttls: TLSv1 with cipher AES128-SHA (128/128 bits new) no authentication
Apr 12 13:43:12 host imaps[11792]: accepted connection
Apr 12 13:43:12 host imaps[11792]: mydelete: starting txn 2147495112
Apr 12 13:43:12 host imaps[11792]: mydelete: committing txn 2147495112
Apr 12 13:43:12 host imaps[11792]: mystore: starting txn 2147495113
Apr 12 13:43:12 host imaps[11792]: mystore: committing txn 2147495113
Apr 12 13:43:12 host imaps[11792]: starttls: TLSv1 with cipher AES128-SHA (128/128 bits new) no authentication
Apr 12 13:43:12 host imaps[11792]: AOD: user options: no lookup required for: user3
Apr 12 13:43:13 host imaps[11792]: login: pool-72-92-XXX-XXX.burl.east.myfairpoint.net [72.92.149.161] user3 CRAM-MD5+TLS User logged in
Apr 12 13:43:13 host imaps[11792]: quota set to "unlimited" for mailbox user.user3
Apr 12 13:43:13 host imaps[11792]: open: user user3 opened Other Users/listmaster
Apr 12 13:43:15 host master[94896]: process 11792 exited, signaled to death by 11
Apr 12 13:43:15 host master[94896]: service imaps pid 11792 in BUSY state: terminated abnormally
Apr 12 13:43:15 host master[94896]: process 11792 exited, signaled to death by 11
Apr 12 13:43:15 host master[94896]: service imaps pid 11792 in BUSY state: terminated abnormally

The crash reports all look similar to this:

Process:         imapd [39069]
Path:            /usr/bin/cyrus/bin/imapd
Identifier:      imapd
Version:         ??? (???)
Code Type:       X86 (Native)
Parent Process:  master [38605]

Date/Time:       2011-04-13 18:25:24.068 -0400
OS Version:      Mac OS X Server 10.5.7 (9J61)
Report Version:  6
Anonymous UUID:  223C4DD1-2AE2-4381-8A28-DEB9082281A8

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000077a0ca64
Crashed Thread:  0

Thread 0 Crashed:
0   imapd                               0x0003090c process_records + 588
1   imapd                               0x00031362 mailbox_expunge + 2146
2   imapd                               0x00006fde cmd_close + 179
3   imapd                               0x00018cf8 cmdloop + 2940
4   imapd                               0x0001c1b7 service_main + 1498
5   imapd                               0x00002e73 main + 3502
6   imapd                               0x00002006 start + 54

Thread 0 crashed with X86 Thread State (32-bit):
  eax: 0x61766970  ebx: 0x000306cb  ecx: 0x00000008  edx: 0x77a0ca64
  edi: 0x00bfffa4  esi: 0x162a5fa4  ebp: 0xbfffad48  esp: 0xbfffac90
   ss: 0x0000001f  efl: 0x00010202  eip: 0x0003090c   cs: 0x00000017
   ds: 0x0000001f   es: 0x0000001f   fs: 0x00000000   gs: 0x00000037
  cr2: 0x77a0ca64

Yes, they all crash in process_records, called from mailbox_expunge.

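I don't really see any other errors in the logs, at least none that seem to have any correlation with the crashing processes, or that aren't as harmless as "SQUAT failed to open index file" and "IOERROR: fstating sieve script /usr/sieve/u/user/defaultbc: No such file or directory".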

I must admit that I have not yet reconstructed the Other Users/listmaster mailbox, nor user3's mailbox. It is not always the same user.

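We do have some users who have found that sent emails are not being saved to their Sent Messages mailbox, and have not been since the date of the original problem. Reconstructing their mailbox seems to allow newly sent emails to be saved there. For that I currently use sudo mailbfr -m username, since it fixes some extra permissions in addition to running sudo /usr/bin/cyrus/bin/reconstruct -r user/username, which is what I would normally run. However, I have not been able to find a correlation between that problem and this one (or any other errors in the logs).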

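Any suggestions would be greatly appreciated. Would attempting to delete messages really cause a crash? Should I reconstruct every user's mailbox individually? I really don't want to rebuild the entire Cyrus database and lose the flagged/read status of every message.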

Answer 1

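I think the bad blocks landed in the database indexes, which causes the crash when new data is stored. There is not much you can do other than rebuild the database. You can back up the users' .seen files and try to reuse them, but test this idea on a test user first. Honestly, I think the hard drive with the bad blocks should be removed from the server as soon as possible in any case.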
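A minimal sketch of the backup step follows. It assumes the per-user seen state lives as *.seen files under a user/ directory inside the Cyrus configuration directory (often /var/imap, but check configdirectory in your imapd.conf). The function name backup_seen_files is hypothetical, and the copy should ideally be made while the mail service is stopped so the files are not being written.

```python
import shutil
from pathlib import Path


def backup_seen_files(config_dir, backup_dir):
    """Copy every *.seen file under config_dir/user into backup_dir, keeping the layout."""
    src_root = Path(config_dir) / "user"  # assumed location of per-user state files
    dst_root = Path(backup_dir)
    copied = []
    for seen in sorted(src_root.rglob("*.seen")):
        target = dst_root / seen.relative_to(src_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(seen, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

The returned list shows which files were copied, which makes it easy to confirm that the test user's file is among them before trying a restore.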

Answer 2

I solved this a long time ago.

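I don't remember the specific commands, but I found a way to reasonably correlate a specific crash with a specific user, and then I could run mailbfr -m to reconstruct that user's mailbox. Eventually I was able to reconstruct all of the problem mailboxes, and the server problems went away.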
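Since the exact commands are lost, here is a sketch of one way such a correlation could be done; it is not the original method. It assumes the /var/log/mailaccess.log line formats quoted in the question: a "login:" line naming the user, an "open:" line naming the mailbox, and a "signaled to death by 11" line from master. The function name crashed_sessions is hypothetical.

```python
import re

# Patterns assume the log line formats quoted in the question.
LOGIN = re.compile(r"imaps\[(\d+)\]: login: .*\] (\S+) \S+ User logged in")
OPENED = re.compile(r"imaps\[(\d+)\]: open: user (\S+) opened (.+)$")
CRASH = re.compile(r"master\[\d+\]: process (\d+) exited, signaled to death by 11\b")


def crashed_sessions(lines):
    """Return (pid, user, last_mailbox) for each imaps process killed by signal 11."""
    user_by_pid = {}
    mailbox_by_pid = {}
    seen_crash_lines = set()
    crashes = []
    for line in lines:
        line = line.rstrip("\n")
        m = LOGIN.search(line)
        if m:
            pid = m.group(1)
            # One process serves several connections; a login starts a new session.
            user_by_pid[pid] = m.group(2)
            mailbox_by_pid.pop(pid, None)
            continue
        m = OPENED.search(line)
        if m:
            mailbox_by_pid[m.group(1)] = m.group(3)
            continue
        m = CRASH.search(line)
        if m and line not in seen_crash_lines:
            # master logs the same exit twice in the sample, so skip exact repeats.
            seen_crash_lines.add(line)
            pid = m.group(1)
            crashes.append((pid, user_by_pid.pop(pid, None), mailbox_by_pid.pop(pid, None)))
    return crashes
```

Feeding it the lines of the log file yields one tuple per crash. The users and mailboxes that show up are only candidates for mailbfr -m, since the last mailbox opened is not guaranteed to be the damaged one.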
