Bug 136154 - threads.c:273 (vgCallgrind_post_signal): Assertion '*(vgCallgrind_current_fn_stack.top) == 0' failed.
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: callgrind (other bugs)
Version First Reported In: 3.2.1
Platform: Compiled Sources Linux
Importance: NOR crash
Target Milestone: blocking3.5.0
Assignee: Josef Weidendorfer
Reported: 2006-10-23 00:05 UTC by Maxim Egorushkin
Modified: 2009-07-02 01:58 UTC

Description Maxim Egorushkin 2006-10-23 00:05:25 UTC
Callgrind seems to be choking on writing to a pipe in a signal handler under
some conditions. Here is how to reproduce the behavior (if N is set to, say 16,
another assertion shows up):

[max@k-pax test]$ cat test.cc
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <signal.h>

int const N = 8;

int sig_pipe[2];

void sig_hand(int) { write(sig_pipe[1], "", 1); }

int main()
{
    if(pipe(sig_pipe))
        abort();

    sigset_t mask;
    sigemptyset(&mask);

    for(int signo = SIGRTMIN; signo != SIGRTMIN + N; ++signo)
    {
        struct sigaction sa;
        sa.sa_flags = 0;
        sa.sa_handler = sig_hand;
        sigemptyset(&sa.sa_mask);
        if(sigaction(signo, &sa, 0))
            abort();
        
        sigset_t m;
        sigemptyset(&m);
        sigaddset(&m, signo);
        sigaddset(&mask, signo);
        if(pthread_sigmask(SIG_BLOCK, &m, 0))
            abort();
        raise(signo);
    }

    if(pthread_sigmask(SIG_UNBLOCK, &mask, 0))
        abort();

    char buf[N];
    for(ssize_t n = 0, m = 0; n != N; n += m)
        if(0 > (m = read(sig_pipe[0], buf + n, sizeof buf - n)))
            abort();
}
[max@k-pax test]$ g++ -Wall -Wextra -pthread -g -o test test.cc
[max@k-pax test]$ ./test && valgrind -v --tool=callgrind ./test
==5749== Callgrind, a call-graph generating cache profiler.
==5749== Copyright (C) 2002-2006, and GNU GPL'd, by Josef Weidendorfer et al.
==5749== Using LibVEX rev 1658, a library for dynamic binary translation.
==5749== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==5749== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==5749== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==5749== 
--5749-- Command line
--5749--    ./test
--5749-- Startup, with flags:
--5749--    -v
--5749--    --tool=callgrind
--5749-- Contents of /proc/version:
--5749--   Linux version 2.6.17-1.2187_FC5
(brewbuilder@hs20-bc2-2.build.redhat.com) (gcc version 4.1.1 20060525 (Red Hat
4.1.1-1)) #1 Mon Sep 11 01:17:06 EDT 2006
--5749-- Arch and hwcaps: X86, x86-sse1-sse2
--5749-- Valgrind library directory: /usr/local/lib/valgrind
==5749== For interactive control, run 'callgrind_control -h'.
--5749-- Reading syms from /lib/ld-2.4.so (0x526000)
--5749-- Reading syms from /home/max/src/test/test (0x8048000)
--5749-- Reading syms from /usr/local/lib/valgrind/x86-linux/callgrind (0x38000000)
--5749--    object doesn't have a dynamic symbol table
--5749-- Code check found runtime_resolve: ld-2.4.so +0x12B60=0x538B60, length 24
--5749-- Reading syms from /usr/local/lib/valgrind/x86-linux/vgpreload_core.so
(0x4001000)
--5749-- Reading syms from /usr/lib/libstdc++.so.6.0.8 (0x3E7000)
--5749--    object doesn't have a symbol table
--5749-- Reading syms from /lib/libm-2.4.so (0xDB3000)
--5749-- Reading syms from /lib/libgcc_s-4.1.1-20060525.so.1 (0x3D9000)
--5749--    object doesn't have a symbol table
--5749-- Reading syms from /lib/libpthread-2.4.so (0x3C3000)
--5749-- Reading syms from /lib/libc-2.4.so (0x101000)
--5749-- Symbol match: found runtime_resolve: ld-2.4.so +0x538B60=0x538B60
sig_hand(int)
BB# 303202

Callgrind: threads.c:273 (vgCallgrind_post_signal): Assertion
'*(vgCallgrind_current_fn_stack.top) == 0' failed.
==5749==    at 0x38019C11: report_and_quit (m_libcassert.c:136)
==5749==    by 0x38019F3B: vgPlain_assert_fail (m_libcassert.c:200)
==5749==    by 0x3801795D: vgCallgrind_post_signal (threads.c:273)
==5749==    by 0x3805ACC7: vgPlain_sigframe_destroy (sigframe-x86-linux.c:694)
==5749==    by 0x3805D89B: vgSysWrap_x86_linux_sys_sigreturn_before
(syswrap-x86-linux.c:954)
==5749==    by 0x3804D75F: vgPlain_client_syscall (syswrap-main.c:719)
==5749==    by 0x38039E20: vgPlain_scheduler (scheduler.c:721)
==5749==    by 0x38058D53: run_a_thread_NORETURN (syswrap-linux.c:87)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==5749==    at 0x3CDD93: __write_nocancel (in /lib/libpthread-2.4.so)
==5749==    by 0x3CF0C7: (within /lib/libpthread-2.4.so)
==5749==    by 0x3CF0C7: (within /lib/libpthread-2.4.so)
==5749==    by 0x3CF0C7: (within /lib/libpthread-2.4.so)
==5749==    by 0x80487C3: main (test.cc:38)


Note: see also the FAQ.txt in the source distribution.
It contains workarounds to several common problems.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what Linux distro you are using.  Thanks.
Comment 1 Nicholas Nethercote 2009-06-30 06:45:03 UTC
I'm closing crashing and similar bugs that are more than two years old.  If 
you still see this problem with Valgrind 3.4.1 please reopen the bug report.
Thanks.
Comment 2 Josef Weidendorfer 2009-06-30 20:18:11 UTC
Hi Nick,

I just had a look at this. I can reproduce it from the test case.
Comment 3 Nicholas Nethercote 2009-07-01 02:19:29 UTC
(In reply to comment #2)
> 
> I just had a look at this. I can reproduce it from the test case.

Good.  Should it be marked wanted3.5.0 or blocking3.5.0?
Comment 4 Josef Weidendorfer 2009-07-01 22:06:57 UTC
I am optimistic: blocking3.5.0
Comment 5 Josef Weidendorfer 2009-07-01 22:31:50 UTC
One step closer to the fix:

It looks like the problem is that Callgrind mishandles the case
where two or more signals are delivered to a process in a row,
with no guest instruction executed in between the signal
deliveries.

The provided test case produces this scenario 3 times (3x two signals
in a row) here, and the failed assertion happens after post_signal() of
the second signal of such a "tightly coupled signal tuple" ;-)
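
For illustration, the scenario above can be narrowed down to a single pair of signals. The following is only a sketch derived from the test case in the description, not code from the original report or from the fix: the function name deliver_two_pending_signals is made up, and whether this reduced version trips the assertion on an affected Valgrind has not been verified. It blocks two real-time signals, raises both so that they stay pending, and then unblocks them in one call, which on Linux normally results in both handlers running back to back.

```cpp
#include <cassert>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

static int sig_pipe[2];

// Same idea as the handler in the report: one byte per delivered signal.
static void sig_hand(int)
{
    char const byte = 0;
    if(write(sig_pipe[1], &byte, 1) != 1)
        abort();
}

// Hypothetical helper (name not from the report): blocks two real-time
// signals, raises both, unblocks both at once, and returns the number
// of handler invocations observed through the pipe.
int deliver_two_pending_signals()
{
    if(pipe(sig_pipe))
        abort();

    sigset_t mask;
    sigemptyset(&mask);
    for(int signo = SIGRTMIN; signo != SIGRTMIN + 2; ++signo)
    {
        struct sigaction sa = {};
        sa.sa_handler = sig_hand;
        sigemptyset(&sa.sa_mask);
        if(sigaction(signo, &sa, 0))
            abort();
        sigaddset(&mask, signo);
    }

    if(pthread_sigmask(SIG_BLOCK, &mask, 0))
        abort();
    raise(SIGRTMIN);     // blocked, so it stays pending
    raise(SIGRTMIN + 1); // blocked, so it stays pending

    // Both pending signals become deliverable at the same moment.
    if(pthread_sigmask(SIG_UNBLOCK, &mask, 0))
        abort();

    // Collect one byte per handler run; retry if a late signal interrupts read().
    char buf[2];
    ssize_t n = 0;
    while(n != 2)
    {
        ssize_t m = read(sig_pipe[0], buf + n, sizeof buf - n);
        if(m < 0 && errno == EINTR)
            continue;
        if(m <= 0)
            abort();
        n += m;
    }

    close(sig_pipe[0]);
    close(sig_pipe[1]);
    return static_cast<int>(n);
}
```

Calling this function from main and running the binary under valgrind --tool=callgrind, built with the same g++ flags as in the description, would be one way to check a given Valgrind version against the two-signal case.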
Comment 6 Josef Weidendorfer 2009-07-02 01:58:18 UTC
Fixed in r10399.