Bug 356374 - Assertion 'DRD_(g_threadinfo)[tid].pt_threadid != INVALID_POSIX_THREADID' failed
Summary: Assertion 'DRD_(g_threadinfo)[tid].pt_threadid != INVALID_POSIX_THREADID' failed
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: drd
Version: unspecified
Platform: Ubuntu Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Bart Van Assche
Reported: 2015-12-07 20:36 UTC by Jonathan Rajotte Julien
Modified: 2018-05-16 19:56 UTC

Attachments
Test case (876 bytes, application/octet-stream)
2015-12-07 20:38 UTC, Jonathan Rajotte Julien
Another program crashing Valgrind (1.62 KB, text/x-csrc)
2017-01-30 09:49 UTC, David Monniaux

Description Jonathan Rajotte Julien 2015-12-07 20:36:32 UTC
Original message from the mailing list, followed by additions:

==============================================================================

Hi,

I was trying to use the DRD tool on one of our instrumented applications (tracing via lttng) and I got an assertion failure [4] similar to the one from this post [1].
Just like Joe VanAdel, I'm wondering whether the problem is on our side (lttng) or in DRD.

The lttng-ust lib essentially gets preloaded and creates a thread during its initialization.
You can find the relevant code for lttng-ust here [3].

A way to reproduce this is to install lttng-ust [2] and use drd on
the example under lttng-ust/doc/examples/easy-ust/.

Let me know if you need any additional information.

Thanks

[1] http://thread.gmane.org/gmane.comp.debugging.valgrind/13995/focus=13996

[2] https://github.com/lttng/lttng-ust

It requires liburcu: https://github.com/urcu/userspace-rcu

[3]
The init function: https://github.com/lttng/lttng-ust/blob/master/liblttng-ust/lttng-ust-comm.c#L1483
The actual call to pthread_create: https://github.com/lttng/lttng-ust/blob/master/liblttng-ust/lttng-ust-comm.c#L1560

[4]

==16669== drd, a thread error detector
==16669== Copyright (C) 2006-2015, and GNU GPL'd, by Bart Van Assche.
==16669== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==16669== Command: ./sample
==16669==
--16669-- WARNING: unhandled amd64-linux syscall: 324
--16669-- You may be able to write your own handler.
--16669-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--16669-- Nevertheless we consider this a bug.  Please report
--16669-- it at http://valgrind.org/support/bug_reports.html.

drd: drd_thread.c:657 (vgDrd_thread_entering_pthread_create): Assertion 'DRD_(g_threadinfo)[tid].pt_threadid != INVALID_POSIX_THREADID' failed.

host stacktrace:
==16669==    at 0x380256C8: show_sched_status_wrk (m_libcassert.c:343)
==16669==    by 0x380257D4: report_and_quit (m_libcassert.c:415)
==16669==    by 0x38025961: vgPlain_assert_fail (m_libcassert.c:481)
==16669==    by 0x38014E36: vgDrd_thread_entering_pthread_create (drd_thread.c:657)
==16669==    by 0x3800944B: handle_client_request (drd_clientreq.c:296)
==16669==    by 0x3803D040: wrap_tool_handle_client_request (m_tooliface.c:280)
==16669==    by 0x3807526F: do_client_request (scheduler.c:2101)
==16669==    by 0x3807526F: vgPlain_scheduler (scheduler.c:1425)
==16669==    by 0x380839EA: thread_wrapper (syswrap-linux.c:102)
==16669==    by 0x380839EA: run_a_thread_NORETURN (syswrap-linux.c:155)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 16669)
==16669==    at 0x4C3167B: vgDrd_entering_pthread_create (drd_pthread_intercepts.c:441)
==16669==    by 0x4C3167B: pthread_create_intercept (drd_pthread_intercepts.c:594)
==16669==    by 0x4C3167B: pthread_create@* (drd_pthread_intercepts.c:611)
==16669==    by 0x50696BD: lttng_ust_init (lttng-ust-comm.c:1560)
==16669==    by 0x4010139: call_init.part.0 (dl-init.c:78)
==16669==    by 0x4010222: call_init (dl-init.c:36)
==16669==    by 0x4010222: _dl_init (dl-init.c:126)
==16669==    by 0x4001309: ??? (in /lib/x86_64-linux-gnu/ld-2.19.so) 

==============================================================================

After some experimentation I was able to create a small and simple application that reproduces the bug.

The scenario is:
- Open a shared lib via dlopen.
- The shared lib spawns a pthread when one of its symbols is called.

When drd is run on the executable, it fails with the assertion shown above.

See attachment for the sample program.
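
The scenario described above can be sketched roughly as follows. This is not the attached test case, only an illustration of the pattern under stated assumptions: every name in it (worker_main, start_worker, join_worker, call_library_symbol) is hypothetical. Both halves are shown in one listing for brevity; in a real reproduction the thread-spawning functions would live in the shared library and the dlopen helper in the executable.

```c
/* Hypothetical sketch, not the attached test case.
 * Library side: a function that spawns a pthread when it is called.
 * Executable side: a helper that dlopen()s a library and calls a symbol. */
#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>

static pthread_t worker;   /* handle of the spawned thread */
static int worker_ran;     /* set to 1 by the thread body */

/* Thread body: only records that the thread actually ran. */
static void *worker_main(void *arg)
{
    (void)arg;
    worker_ran = 1;
    return NULL;
}

/* Library side: spawn the thread.
 * Returns 0 on success, or the error number from pthread_create(). */
int start_worker(void)
{
    return pthread_create(&worker, NULL, worker_main, NULL);
}

/* Library side: wait for the thread.
 * Returns 1 if the thread ran, 0 if it did not, -1 if the join failed. */
int join_worker(void)
{
    if (pthread_join(worker, NULL) != 0)
        return -1;
    return worker_ran;
}

/* Executable side: open the library at `path`, look up `symbol`
 * (assumed to have type int (*)(void)) and call it.
 * Returns -1 if dlopen() or dlsym() fails, else the symbol's return value. */
int call_library_symbol(const char *path, const char *symbol)
{
    assert(path != NULL && symbol != NULL);

    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL)
        return -1;

    int (*fn)(void);
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    /* The handle is deliberately left open: the thread's code lives in
     * the library, so it must stay mapped while the thread runs. */
    return fn();
}
```

A driver's main would then do something like call_library_symbol("./libworker.so", "start_worker"), where libworker.so is an assumed name for the shared object. On older glibc versions the build needs -pthread and -ldl.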

Reproducible: Always

Steps to Reproduce:
1. Untar the attachment.
2. Run make.
3. Run 'make valgrind' or run 'valgrind --tool=drd ./dlopen-test'.



uname -a for two machines with the same version of Valgrind (valgrind-3.11.0):

Linux psrcode-TP-X230 4.3.0-040300-generic #201511020949 SMP Mon Nov 2 14:50:44 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
3.13.0-71-generic #114-Ubuntu SMP Tue Dec 1 02:34:22 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Comment 1 Jonathan Rajotte Julien 2015-12-07 20:38:12 UTC
Created attachment 95929 [details]
Test case

Run 'make' or 'make valgrind'
Comment 2 David Monniaux 2017-01-30 09:49:23 UTC
Created attachment 103715 [details]
Another program crashing Valgrind

Same issue here with Valgrind 3.10, 3.11, 3.12 and this program using multithreaded OpenBLAS.

Compile with
gcc -std=c99 -Wall -O3 blas3.c -o blas3 -l openblas
Comment 3 Julian Seward 2017-05-08 14:28:12 UTC
Bart, ping?
Comment 4 Bart Van Assche 2017-05-09 04:47:54 UTC
Sorry for the delay. r16342 should fix this bug.