Version: 4.3.1 (using KDE 4.3.1)
OS: FreeBSD
Installed from: FreeBSD Ports

During regular operation, kmail produces a zombie process once in a while. I have had this issue on multiple FreeBSD setups since the beginning of KDE4, but only now found out that the PPID tells where a process comes from, so I could trace the zombies to kmail. It looks like this (second column is PID, third column is PPID):

hannes@fbsdmain /c/h/hannes> ps ajx | grep 1600
hannes  1600     1  1123  1123 0 I   ??  1:10,02 /usr/local/kde4/bin/kmail -caption Kmail
hannes  1606  1600  1123  1123 0 Z   ??  0:00,00 <defunct>
hannes  1608  1600  1123  1123 0 Z   ??  0:00,00 <defunct>
hannes  1610  1600  1123  1123 0 Z   ??  0:00,00 <defunct>
hannes  1612  1600  1123  1123 0 Z   ??  0:00,00 <defunct>
hannes  1696  1600  1123  1123 0 Z   ??  0:00,00 <defunct>
hannes  1702  1600  1123  1123 0 Z   ??  0:00,00 <defunct>
hannes 15496  1600  1123  1123 0 Z   ??  0:00,00 <defunct>
hannes 15917  1600  1123  1123 0 Z   ??  0:00,00 <defunct>
hannes 21038  1267 21037  1267 2 S+   3  0:00,00 grep 1600

Thanks for your help!
It would be important to find out which subprocess becomes a zombie. For example: Do you use an external editor? Does any filter call an external program? Do you have a mail-fetching precommand that uses an external program? Is there anything else that runs an external program from kmail?
We had a discussion on kde-freebsd about this. It was confirmed by many others, also with more details. HTH

On Thursday 12 November 2009 03:55:55 Mel Flynn wrote:
> On Wednesday 11 November 2009 18:38:04 Hannes wrote:
> > Hi everybody!
> >
> > Can anyone confirm that Kmail creates broken process every once in while
> > on FreeBSD? I have the issue on multiple machines and reported it now[1]
> >
> > Thanks for your help!
> >
> > Regards,
> > Hannes
> >
> > [1] http://bugs.kde.org/show_bug.cgi?id=214140
>
> Yes. I think I saw a bugreport about it, not specific to FreeBSD but to
> IMAP, yet I can't find it now.
>
> Anyway, pstree snippet:
> |-+- 22969 mel /usr/local/kde4/bin/kontact
> |
> | |--- 22987 mel <defunct>
> | |--- 22989 mel <defunct>
> | |--- 22991 mel <defunct>
> | |--- 22993 mel <defunct>
> | |--- 22995 mel <defunct>
> | |--- 22999 mel <defunct>
> | |--- 23015 mel <defunct>
> |
> | \--- 23035 mel <defunct>

Can confirm this bug. It is produced by the GPG plugin on every new signature (several letters from the same author do not each produce a new zombie, only one).
I've done some investigation using ktrace, switching between signed and unsigned mails, which causes new zombies to be created. It seems that KMail starts gpg2 in the background using a double-fork invocation, and the process forked first becomes a zombie. Here is the trace:

31180 kmail CALL fork
31180 kmail RET fork 31843/0x7c63    <-- 31843 will become zombie
31843 kmail RET fork 0
31843 kmail CALL thr_self(0x82150c400)
31843 kmail RET thr_self 0
31180 kmail CALL sigprocmask(SIG_SETMASK,0x82150c4e8,0)
31180 kmail RET sigprocmask 0
31180 kmail CALL wait4(0x7c63,0x7ffffedf6b6c,0<><invalid>0,0)
31843 kmail CALL getpid
31843 kmail RET getpid 31843/0x7c63    <-- 31843 verifies if it is forked process
.....
31843 kmail CALL fork    <-- 31843 forks again
....
31843 kmail RET fork 31844/0x7c64
31844 kmail RET fork 0    <-- 31844 is child of 31843
....
31844 kmail CALL getpid
31844 kmail RET getpid 31844/0x7c64
... then 31844 closes many file descriptors ...
31843 kmail CALL exit(0)    <-- 31843 calls exit and becomes zombie (determined by ps later)
... then 31844 closes more file descriptors ...
31844 kmail CALL open(0x80c701043,0x2<O_RDWR>,<unused>0)    <-- 31844 executes gpg2
31844 kmail NAMI "/dev/null"
31844 kmail RET open 0
31844 kmail CALL dup2(0,0x2)
31844 kmail RET dup2 2
31844 kmail CALL execve(0x86b49e3e0,0x86b4613c0,0x81b4e2600)
31844 kmail NAMI "/usr/local/bin/gpg2"
31844 kmail NAMI "/libexec/ld-elf.so.1"
31844 gpg2 RET execve 0
... then gpg2 does its job ...
31844 gpg2 CALL exit(0)    <-- gpg2 exits
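For readers unfamiliar with the pattern, here is a minimal sketch of a double fork as the trace suggests it. This is not the gpgme or KMail code; the helper name spawn_detached is hypothetical and the error handling is simplified. The point it illustrates is that the original parent (31180 in the trace) must successfully wait for the intermediate child (31843), otherwise that child stays a zombie.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not gpgme code: start `path` detached via a double
   fork. Returns 0 if the intermediate child was reaped cleanly, -1 otherwise. */
int spawn_detached(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        /* Intermediate child (31843 in the trace): fork again, then exit. */
        pid_t grandchild = fork();
        if (grandchild == 0) {
            /* Grandchild (31844 in the trace): becomes the real program. */
            execv(path, argv);
            _exit(127); /* only reached if exec failed */
        }
        _exit(grandchild == -1 ? 1 : 0);
    }

    /* Original parent (31180 in the trace): collect the intermediate child's
       exit status. If this wait is skipped or fails, the child stays a zombie. */
    int status;
    if (waitpid(child, &status, 0) != child)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The grandchild is reparented once the intermediate child exits, so the caller never has to wait for it; only the short-lived intermediate child needs reaping.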
It actually seems to be a gpgme bug, not a KDE one. With the following patch applied to gpgme, it stops producing zombies. I am not sure whether the patch is actually correct.

--- src/posix-io.c.orig 2012-09-25 17:46:40.000000000 +0400
+++ src/posix-io.c 2014-04-08 01:51:56.000000000 +0400
@@ -340,10 +340,15 @@
 _gpgme_io_waitpid (int pid, int hang, int *r_status, int *r_signal)
 {
   int status;
-
+  int ret;
   *r_status = 0;
   *r_signal = 0;
-  if (_gpgme_ath_waitpid (pid, &status, hang? 0 : WNOHANG) == pid)
+  do
+    {
+      ret = _gpgme_ath_waitpid (pid, &status, hang? 0 : WNOHANG);
+    }
+  while (ret == -1 && errno == EINTR);
+  if (ret == pid)
     {
       if (WIFSIGNALED (status))
         {
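The idea behind the patch can be shown in isolation: waitpid can fail with errno set to EINTR when a signal interrupts it, and a caller that treats that as final never reaps the child. Below is a minimal standalone sketch using plain POSIX waitpid instead of the internal _gpgme_ath_waitpid; the names waitpid_retry and run_child_and_reap are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper: call waitpid again whenever a signal interrupts it. */
pid_t waitpid_retry(pid_t pid, int *status, int options)
{
    pid_t ret;

    assert(status != NULL);
    do {
        ret = waitpid(pid, status, options);
    } while (ret == -1 && errno == EINTR);
    return ret;
}

/* Hypothetical demo: fork a child that exits with `code`, reap it with the
   retrying wrapper, and return the exit code seen by the parent (-1 on error). */
int run_child_and_reap(int code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code); /* child: exit immediately */

    int status;
    if (waitpid_retry(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_child_and_reap(7) is expected to return 7 and leave no zombie behind. Whether an interrupted wait is really what happens inside KMail is not shown by this sketch; it only demonstrates the retry loop the patch introduces.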
If unsure, you could discuss it on a gpgme-related mailing list or forum.
Already done: https://bugs.g10code.com/gnupg/issue1630
I just forgot to post the link here.
The bug was fixed in gpgme. The fix will be released in gpgme 1.5.0.
Committed diff: http://git.gnupg.org/cgi-bin/gitweb.cgi?p=gpgme.git;a=commitdiff;h=2bb26185e3b9a048033c559517d6ba7d2eb47066;hp=d3bd8fff863f62b6d0e228aea754efbbde861e9a
Thanks for the heads-up, marking as resolved. Please add a comment if you see this issue again with gpgme 1.5.0 or later.