Bug 282018 - Memcheck leak checking is not friendly to fork-exec idiom if exec fails in the presence of threads
Summary: Memcheck leak checking is not friendly to fork-exec idiom if exec fails in th...
Status: REPORTED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck (show other bugs)
Version: 3.7 SVN
Platform: Unlisted Binaries Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-14 14:31 UTC by Timur Iskhodzhanov
Modified: 2011-09-14 17:38 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Timur Iskhodzhanov 2011-09-14 14:31:39 UTC
===leak_fork.c===
#include <assert.h>
#include <malloc.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void* thread_func(void *p) {
  int *foo = (int*)malloc(sizeof(int));
  usleep(1000000);  /* Make sure the fork() happens while we're */
  free(foo);
  return NULL;
}

int main(void) {
  pthread_t thr;
  pid_t child_pid;
  int rv = pthread_create(&thr, NULL, thread_func, NULL);
  assert(rv == 0); 
  usleep(500000);  /* Make sure the thread has started */

  child_pid = fork();
  assert(child_pid >= 0); 

  if (child_pid == 0) {
    execlp("ehco", "hello", NULL);  
    exit(1);  /* exec failed - Never mind, I'll buy you a new one */
  }

  rv = pthread_join(thr, NULL);
  assert(rv == 0); 
}
=================
$ gcc -lpthread -g leak_fork.c && valgrind --leak-check=full ./a.out # valgrind r12034 built from sources
==20044== Memcheck, a memory error detector
==20044== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==20044== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==20044== Command: ./a.out
...
==20046== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==20046==    at 0x4C27B4A: malloc (vg_replace_malloc.c:263)
==20046==    by 0x400809: thread_func (leak_fork.c:9)
==20046==    by 0x4E359C9: start_thread (pthread_create.c:300)
==20046==    by 0x5BCF6FF: ???
...
==20046== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
...
==20044== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

Please note that fork-and-immediately-exec is a correct pattern even in threaded environment.
This has come up from the in-the-field report on Chromium browser test:
http://code.google.com/p/chromium/issues/detail?id=46163
Comment 1 Timur Iskhodzhanov 2011-09-14 14:35:46 UTC
Also, the same happens if I replace "exit(1);" with "abort();"

Possible workarounds:
a) at the fork time, mark all known allocations as "reachable" / "remembered"
b) at the fork time & if threads are present, set a flag to skip leak checking in the child process
c) add a memcheck client request to avoid leak checking and call it before exit(1)
d) ignore leak checking on "abort();"
Comment 2 Timur Iskhodzhanov 2011-09-14 14:46:02 UTC
Self-correction:
a) at the fork time, mark all known allocations as "reachable" / "remembered"
   in the child process.
  Side effect: may miss some child-only leaks
b) at the fork time & if threads are present, set a flag to skip leak checking
   in the child process.
  Maybe worth adding a "if exec is called and failed" condition
c) ... before exit(nonzero)
...
e) skip leak checking after exec failure
Comment 3 Derek Bruening 2011-09-14 17:38:10 UTC
I'm going to assume that in the fork child Memcheck goes and marks the stacks for now-dead threads as unaddressable and that's why its leak scan fails to find formerly-local vars in memory (which is the case for this test app compiled w/ default flags).  Why not mark the bogus stacks as unaddressable for most purposes, but treat as defined (up to former-top-of-stack) for the leak scan itself?  You'll still have a risk of reporting a leak that was anchored only in a register but that may be rare enough.  OTOH, if a multi-threaded app forks and never execs, reporting these leaks seems the right thing, so perhaps trigger this defined-for-scan on the presence of a failed exec.