-- begin file system.c --
#include <stdlib.h>
#include <pthread.h>

void *thread(void *args) {
  system("ls -l");
  return NULL;
}

int main(int argc, char *argv[]) {
#if NO_THREAD
  thread(NULL);
#else
  pthread_t th;
  pthread_create(&th, NULL, thread, NULL);
  pthread_join(th, NULL);
#endif
  return 0;
}
-- end file system.c --

Compile and valgrind output with one extra thread:

gcc -Wall -g -pthread system.c -o bug_one_thread

~/test/valgrind_bug1> valgrind --db-attach=yes --tool=memcheck ./bug_one_thread
==23413== Memcheck, a memory error detector for x86-linux.
==23413== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward et al.
==23413== Using valgrind-2.1.2, a program supervision framework for x86-linux.
==23413== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==23413== For more details, rerun with: -v
==23413==
==23416== Thread 2:
==23416== Syscall param execve(envp) contains uninitialised or unaddressable byte(s)
==23416==    at 0x1B9F82A7: execve (in /lib/libc.so.6)
==23416==    by 0x1B98A442: do_system (in /lib/libc.so.6)
==23416==    by 0x1B910C96: system (vg_libpthread.c:2526)
==23416==    by 0x8048446: thread (system.c:5)
==23416==  Address 0x52BFE41C is not stack'd, malloc'd or (recently) free'd
==23416==
==23416== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
==23416==
==23416== Thread 2:
==23416== Syscall param execve(envp[i]) contains uninitialised or unaddressable byte(s)
==23416==    at 0x1B9F82A7: execve (in /lib/libc.so.6)
==23416==    by 0x1B98A442: do_system (in /lib/libc.so.6)
==23416==    by 0x1B910C96: system (vg_libpthread.c:2526)
==23416==    by 0x8048446: thread (system.c:5)
==23416==  Address 0x52BFE5BD is not stack'd, malloc'd or (recently) free'd
==23416==
==23416== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
total 44
-rwxr-xr-x 1 seiderer users 16426 2004-07-21 14:54 bug_no_thread
-rwxr-xr-x 1 seiderer users 16720 2004-07-21 14:54 bug_one_thread
-rw-r--r-- 1 seiderer users 283 2004-07-21 14:51 system.c
==23413==
==23413== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 16 from 1)
==23413== malloc/free: in use at exit: 216 bytes in 2 blocks.
==23413== malloc/free: 5 allocs, 3 frees, 584 bytes allocated.
==23413== For a detailed leak analysis, rerun with: --leak-check=yes
==23413== For counts of detected errors, rerun with: -v

Compile and valgrind output without extra thread:

gcc -Wall -g -pthread -DNO_THREAD system.c -o bug_no_thread

~/test/valgrind_bug1> valgrind --db-attach=yes --tool=memcheck ./bug_no_thread
==23425== Memcheck, a memory error detector for x86-linux.
==23425== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward et al.
==23425== Using valgrind-2.1.2, a program supervision framework for x86-linux.
==23425== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==23425== For more details, rerun with: -v
==23425==
total 44
-rwxr-xr-x 1 seiderer users 16426 2004-07-21 14:54 bug_no_thread
-rwxr-xr-x 1 seiderer users 16720 2004-07-21 14:54 bug_one_thread
-rw-r--r-- 1 seiderer users 283 2004-07-21 14:51 system.c
==23425==
==23425== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 16 from 1)
==23425== malloc/free: in use at exit: 216 bytes in 2 blocks.
==23425== malloc/free: 2 allocs, 0 frees, 216 bytes allocated.
==23425== For a detailed leak analysis, rerun with: --leak-check=yes
==23425== For counts of detected errors, rerun with: -v

Is this a problem of valgrind or a problem of the pthread library?
Both versions of the test program produce no error report when valgrind-2.0.0 is used.
It may well be a Valgrind bug; I think Valgrind might crash when execve(foo, NULL, NULL) is executed; strictly speaking I think that's not allowed but the normal execution works ok.
In message <20040721141954.12511.qmail@ktown.kde.org>
        Nicholas Nethercote <njn25@cam.ac.uk> wrote:

> It may well be a Valgrind bug; I think Valgrind might crash when
> execve(foo, NULL, NULL) is executed; strictly speaking I think that's not
> allowed but the normal execution works ok.

I think I fixed that didn't I? I think this is something else.

Tom
Seemingly not, this program segfaults for me:

#include <stdlib.h>
#include <unistd.h>

int main(void)
{
  execve("/bin/true", NULL, NULL);
  return 0;
}
I was thinking of bug 83573 which apparently only dealt with envp being a NULL pointer. Is it even legal for argv to be NULL though? Don't you have to at least provide an argv[0] with the program name?
I've fixed that execve() problem now but the problem with system when called in a second thread is still occurring.
I think the program name goes in the first argument to execve(), and the argv[] array holds the rest.
You normally give the program name to execve() twice. The first argument is the name of the program that the kernel will start, but if you want argv[0] to be correct in the started program then you need to provide argv[0] in the call to execve.
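To make the two roles of the name concrete, here is a minimal sketch rather than anything taken from Valgrind or the test case above. The helper names spawn_and_wait and run_demo and the name "demo-name" are made up for illustration. The first argument to execve() selects the file the kernel loads (/bin/sh here), while argv[0] is just the name the started program sees for itself, so the two need not match.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run `path` with the given argv and the caller's
   environment. Returns the child's exit status, or -1 on failure. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* path picks the executable; argv[0] is only a label. */
        execve(path, argv, environ);
        _exit(127);             /* reached only if execve failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Starts /bin/sh under the made-up name "demo-name"; the shell then
   runs "exit 7", so 7 is the status reported back to the parent. */
int run_demo(void)
{
    char *const argv[] = { "demo-name", "-c", "exit 7", NULL };
    return spawn_and_wait("/bin/sh", argv);
}
```

Calling run_demo() from main() should be enough to try it. Note that argv is always terminated with a NULL entry and always carries at least argv[0], which sidesteps the NULL argv case discussed above.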
*** Bug 92763 has been marked as a duplicate of this bug. ***
Right. I think I understand what is happening here. The environment is held on the process's initial stack, which is the stack of the main thread. When a process forks, valgrind kills all the threads other than the one which does the fork, just as POSIX says it should. Unfortunately, when it kills those threads it also marks their stacks as inaccessible. So if you fork in a thread other than the main thread, then the main thread's stack (and hence the environment) becomes inaccessible in the child process. I suspect that the correct fix is not to mark the stacks as inaccessible, as they do still exist. That's actually an interesting feature of forking when there are multiple threads: you sort of leak memory, because all the other threads are killed but their stack space is still allocated and accessible.
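The situation described above can be sketched without system() at all. This is only an illustration under assumptions: the function names fork_and_walk_env and env_readable_after_thread_fork are hypothetical, and it assumes nothing has replaced the environment (for example via setenv) since startup. A second thread forks, and the child then reads every byte of the environment, which is the same memory execve() has to read when it is handed envp.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Thread body: fork, and in the child touch every environment byte. */
static void *fork_and_walk_env(void *result)
{
    int *status_out = result;
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: only the forking thread survives, but environ still
           refers to memory set up for the original main thread. */
        volatile size_t total = 0;
        for (char **e = environ; e != NULL && *e != NULL; e++)
            for (const char *c = *e; *c != '\0'; c++)
                total++;
        _exit(0);
    }
    if (pid < 0) {
        *status_out = -1;
        return NULL;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        *status_out = -1;
        return NULL;
    }
    *status_out = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return NULL;
}

/* Hypothetical entry point: returns 0 if the child could read the
   whole environment after a fork from a non-main thread, else -1. */
int env_readable_after_thread_fork(void)
{
    pthread_t th;
    int status = -1;
    if (pthread_create(&th, NULL, fork_and_walk_env, &status) != 0)
        return -1;
    pthread_join(th, NULL);
    return status;
}
```

Run natively, the child should exit cleanly because the main thread's stack is still mapped after the fork. Built with -pthread and run under an affected Valgrind, I would expect the reads in the child to be the ones flagged, though I have not checked that this exact program reproduces the report.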
Created attachment 8206 [details] Patch to avoid banning thread stacks on fork This patch changes VG_(nuke_all_threads) to disassociate the stacks of the threads being killed from the threads rather than marking them as inaccessible. This should fix the problem with the environment (and other data from the stacks of other threads) causing warnings after a fork. I believe that VG_(nuke_all_threads) is only called in places where this is the behaviour that we want or where it doesn't matter because we're about to exit anyway.
I've now committed my patch to the CVS head but I'd be grateful if you could confirm whether or not it seems to fix the problem for you. Thanks.
Thanks for the patch. It works for my little test program and for the real-world application where the error first occurred (checked with CVS head).
Works for me as well; thanks for the fix.
Reopening so I can close with the correct resolution.
*** Bug has been marked as fixed ***.