Bug 101291 - creating threads in a forked process fails
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 2.4 CVS
Platform: Compiled Sources Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Julian Seward
Reported: 2005-03-11 11:10 UTC by Patrick Ohly
Modified: 2005-03-12 00:05 UTC

Description Patrick Ohly 2005-03-11 11:10:17 UTC
When running the program below under valgrind 2.4.0 rc2,
valgrind aborts with an assert:

cc atfork.c -o atfork -lpthread
valgrind --tool=none ./atfork
[...]

valgrind: vg_scheduler.c:471 (run_thread_for_a_while): Assertion
`!_qq_tst->sched_jmpbuf_valid' failed.
==7903==    at 0xB0035AA7: vgPlain_skin_assert_fail (vg_mylibc.c:1170)
==7903==    by 0xB0035AA6: assert_fail (vg_mylibc.c:1166)
==7903==    by 0xB0035B01: vgPlain_core_assert_fail (vg_mylibc.c:1177)
==7903==    by 0xB0019B32: run_thread_for_a_while (vg_scheduler.c:483)
==7903==    by 0xB001A16A: vgPlain_scheduler (vg_scheduler.c:712)
==7903==    by 0xB007E3DD: vgArch_thread_wrapper (core_os.c:69)
==7903==    by 0xB007BFAB: start_thread (syscalls.c:240)
==7903==    by 0xB007BB2F: (within
/Projects/software/IA32-LIN/valgrind-2.4.0-rc2/libc6-2.3.2/lib/valgrind/stage2)

sched status:
  running_tid=2

Thread 1: status = VgTs_Yielding
==7903==    at 0x3AA7EE7C: clone (in /lib/tls/libc-2.3.2.so)
==7903==    by 0x3A997497: create_thread (in /lib/tls/libpthread-0.60.so)
==7903==    by 0x3A996F80: pthread_create@@GLIBC_2.1 (in
/lib/tls/libpthread-0.60.so)
==7903==    by 0x80486D2: main (in
/Projects/psp/pohly/src/ict/tracing/vampirtrace/test/atfork)

Removing the pthread_create/pthread_join calls from "case 0" lets the
program run normally under valgrind.

System: x86 + RH EL3.0 (glibc 2.3.2)
uname -a:
Linux knscsl004.ikn.intel.com 2.4.21-15.ELsmp #1 SMP Thu Apr 22 00:18:24 EDT
2004 i686 i686 i386 GNU/Linux
valgrind --version:
valgrind-2.4.0.rc2

Another remark: the same program also shows a problem with
the pthread library in valgrind 2.2.0. It calls afterfork()
only in the first child process, not in the second one.
I don't think this needs any further attention, with 2.4.0
just around the corner...

--------------- atfork.c ------------------------
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>   /* exit() */

static void afterfork( void )
{
    fprintf( stderr, "child with pid %d: after fork\n", getpid() );
}

static void *threadmain( void *dummy )
{
    fprintf( stderr, "thread alive: pid %d\n", getpid() );
    sleep( (int)dummy );
    return NULL;
}

int main( int argc, char **argv )
{
    pid_t childpid;
    pthread_t childthread;
    int i;
    void *res;

    fprintf( stderr, "master: pid %d\n", getpid() );
    pthread_create( &childthread, NULL, threadmain, (void *)60 );
    pthread_atfork( NULL, NULL, afterfork );

    for( i = 0; i < 2; i++ ) {
        childpid = fork();
        switch( childpid ) {
        case 0:
            fprintf( stderr, "child %d: I'm alive\n", i );
            pthread_create( &childthread, NULL, threadmain, 0 );
            pthread_join( childthread, &res );
            exit(0);
            break;
        case -1:
            fprintf( stderr, "fork %d failed\n", i );
            break;
        default:
            fprintf( stderr, "child %d: pid %d\n", i, childpid );
            break;
        }
    }

    pthread_kill( childthread, SIGHUP );
    pthread_join( childthread, &res );

    return 0;
}

--------------- atfork.c ------------------------
Comment 1 Jeremy Fitzhardinge 2005-03-11 11:29:28 UTC
Interesting.  Looking at it.
Comment 2 Jeremy Fitzhardinge 2005-03-12 00:05:42 UTC
OK, I think this checkin should fix it:

When a multi-threaded program calls fork(), only the thread that actually
called fork() exists in the child.  The child Valgrind, however, inherits a
VG_(threads) array which still describes the other threads.  The code in
vg_scheduler:sched_fork_cleanup is responsible for cleaning up those stale
entries, but it was only "killing" the other threads by setting their
statuses to VgTs_Empty.

This caused confusion if the child later created new threads and found
partially initialized thread structures.  This change makes
sched_fork_cleanup fully reinitialize the other thread slots
in VG_(threads).
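
The difference between the two cleanup strategies can be sketched roughly
as follows.  This is not the actual Valgrind code: ThreadSlot, N_THREADS,
os_tid and all function names are hypothetical stand-ins, and only the
sched_jmpbuf_valid flag is borrowed from the failed assertion above.  It
assumes that a stale flag in a reused slot is what tripped the scheduler.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for Valgrind's thread table. */
enum { N_THREADS = 8 };

typedef enum { TS_EMPTY = 0, TS_RUNNABLE, TS_YIELDING } ThreadStatus;

typedef struct {
    ThreadStatus status;
    int sched_jmpbuf_valid;  /* flag named in the failed assertion */
    int os_tid;              /* hypothetical extra per-thread state */
} ThreadSlot;

static ThreadSlot threads[N_THREADS];

/* Old behaviour as described: only the status changes, so every other
   field keeps whatever the dead thread left behind. */
static void fork_cleanup_status_only(int me)
{
    int tid;
    for (tid = 0; tid < N_THREADS; tid++)
        if (tid != me)
            threads[tid].status = TS_EMPTY;
}

/* Behaviour after the fix as described: every other slot is fully
   reinitialized, so no state from the parent's threads survives. */
static void fork_cleanup_full(int me)
{
    int tid;
    for (tid = 0; tid < N_THREADS; tid++)
        if (tid != me) {
            memset(&threads[tid], 0, sizeof threads[tid]);
            threads[tid].status = TS_EMPTY;
        }
}

/* A new thread in the child reuses the first empty slot. */
static int alloc_slot(void)
{
    int tid;
    for (tid = 0; tid < N_THREADS; tid++)
        if (threads[tid].status == TS_EMPTY) {
            threads[tid].status = TS_RUNNABLE;
            return tid;
        }
    return -1;
}

/* Mirrors the scheduler's precondition: a slot about to run must not
   already claim to have a valid jump buffer. */
static void run_slot(int tid)
{
    assert(!threads[tid].sched_jmpbuf_valid);
}
```

With the status-only cleanup, a slot handed out by alloc_slot() can still
carry sched_jmpbuf_valid from the parent's thread, and run_slot() would
abort.  With the full cleanup the reused slot starts from a clean state.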