339160 – Running signal handler with alternate stack allocated on current stack crashes callgrind

Status:	CONFIRMED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	callgrind
Version:	unspecified
Platform:	Other Linux

Importance:	NOR crash
Target Milestone:	---
Assignee:	Josef Weidendorfer

URL:
Keywords:

Depends on:
Blocks:

Reported:	2014-09-17 23:05 UTC by Josef Weidendorfer
Modified:	2019-09-20 18:14 UTC
CC List:	4 users

See Also:
Latest Commit:
Version Fixed In:

Attachments
test case (708 bytes, text/x-csrc) 2014-09-17 23:05 UTC, Josef Weidendorfer

Description Josef Weidendorfer 2014-09-17 23:05:05 UTC

Created attachment 88733
test case

This may have the same cause as bug 249435.

Compile test case with "cc t.c -o t".
Running "valgrind --tool=callgrind ./t" results in

==29303== Callgrind, a call-graph generating cache profiler
==29303== Copyright (C) 2002-2013, and GNU GPL'd, by Josef Weidendorfer et al.
==29303== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==29303== Command: ./t
==29303== 
==29303== For interactive control, run 'callgrind_control -h'.
Got signal 10 with SP=0xffeffe46c
0x0000000000036c30
BB# 28778

Callgrind: threads.c:246 (vgCallgrind_post_signal): Assertion 'sigNum == vgCallgrind_current_state.sig' failed.

host stacktrace:
==29303==    at 0x3803C11E: show_sched_status_wrk (m_libcassert.c:317)
==29303==    by 0x3803C234: report_and_quit (m_libcassert.c:376)
==29303==    by 0x3803C3B6: vgPlain_assert_fail (m_libcassert.c:441)
==29303==    by 0x38039CF2: vgCallgrind_post_signal (threads.c:246)
==29303==    by 0x380BC105: vgSysWrap_amd64_linux_sys_rt_sigreturn_before (syswrap-amd64-linux.c:501)
==29303==    by 0x3808CC64: vgPlain_client_syscall (syswrap-main.c:1586)
==29303==    by 0x380896A2: handle_syscall (scheduler.c:1086)
==29303==    by 0x3808ABD6: vgPlain_scheduler (scheduler.c:1392)
==29303==    by 0x3809A11C: run_a_thread_NORETURN (syswrap-linux.c:103)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==29303==    at 0x4C5DBB9: raise (raise.c:56)
==29303==    by 0x40084B: main (in /home/weidendo/tmp/vgaltstack/t)
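
The attached test case is not reproduced on this page. As a rough sketch of the situation named in the summary, the following C code installs an alternate signal stack that is a local array of the calling function, then raises a signal whose handler runs on it. All names here are hypothetical and this is not the attached t.c; SIGUSR1 and raise() are only inferred from "Got signal 10" and the raise frame in the output above.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for sigaltstack(), stack_t, SA_ONSTACK */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is not the attached t.c. */
static volatile sig_atomic_t handler_ran;
static volatile uintptr_t handler_sp;

static void on_sigusr1(int sig)
{
    int marker;   /* a local lives on whatever stack the handler runs on */
    (void)sig;
    handler_sp = (uintptr_t)&marker;
    handler_ran = 1;
}

/* Returns 1 if the handler ran on the locally allocated alternate stack,
 * 0 if it ran elsewhere (or not at all), -1 if setup failed. */
int run_handler_on_local_altstack(void)
{
    char altstack[64 * 1024];   /* alternate stack inside this stack frame */
    stack_t ss, old_ss;
    struct sigaction sa, old_sa;
    uintptr_t lo = (uintptr_t)altstack;
    uintptr_t hi = lo + sizeof altstack;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = altstack;
    ss.ss_size = sizeof altstack;
    if (sigaltstack(&ss, &old_ss) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sa.sa_flags = SA_ONSTACK;   /* deliver the signal on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0) {
        sigaltstack(&old_ss, NULL);
        return -1;
    }

    handler_ran = 0;
    raise(SIGUSR1);             /* handler runs before raise() returns */

    /* Restore the previous state before the array goes out of scope. */
    sigaction(SIGUSR1, &old_sa, NULL);
    sigaltstack(&old_ss, NULL);

    return handler_ran && handler_sp >= lo && handler_sp < hi;
}
```

Calling this function from a main() and running the binary under "valgrind --tool=callgrind" would be one way to check whether the assertion above triggers; that has not been verified here.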

Comment 1 Josef Weidendorfer 2014-09-17 23:19:56 UTC

This seems to screw up callgrind's internal shadow stack synchronization mechanism. When returning from the signal handler, we are still on the alternate stack, which actually lies within an old stack frame. Callgrind seems to confuse this with a longjmp and automatically unwinds its shadow stack down to the frame from which the alternate stack was allocated. Afterwards, Valgrind notifies Callgrind about leaving the signal handler, which no longer matches callgrind's internal shadow stack (the signal frame has already been unwound), resulting in the failed assertion.

Hmm. It seems that there needs to be a special case for this in shadow stack maintenance...
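
To illustrate the hypothesis in this comment, here is a toy model in C of a shadow stack that resynchronizes purely by comparing stack pointers. It is not Callgrind's code; the types, function names and addresses are all made up, and it only assumes a downward-growing stack. The handler's entry SP lies on an alternate stack that sits inside the frame "frame_with_altstack", so it is higher than the SP recorded for raise().

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 8

/* Toy shadow stack entry: function name plus the SP seen on entry. */
typedef struct {
    const char *name;
    uintptr_t   sp;
} shadow_entry;

typedef struct {
    shadow_entry entries[MAX_DEPTH];
    size_t       depth;
} shadow_stack;

static void shadow_push(shadow_stack *s, const char *name, uintptr_t sp)
{
    if (s->depth < MAX_DEPTH) {
        s->entries[s->depth].name = name;
        s->entries[s->depth].sp = sp;
        s->depth++;
    }
}

/* longjmp-style heuristic: the stack grows down, so every entry whose
 * recorded SP is below the current SP is assumed to have been left. */
static void shadow_sync(shadow_stack *s, uintptr_t current_sp)
{
    while (s->depth > 0 && s->entries[s->depth - 1].sp < current_sp)
        s->depth--;
}

/* Returns the shadow stack depth after the handler's return is observed. */
size_t depth_after_handler_return(void)
{
    shadow_stack s;
    s.depth = 0;

    shadow_push(&s, "main", 0x7000);
    /* This frame holds a local array at 0x5000..0x5800 used as alt stack. */
    shadow_push(&s, "frame_with_altstack", 0x6000);
    shadow_push(&s, "raise", 0x4f00);
    /* The handler runs on the alt stack, above the SP of raise(). */
    shadow_push(&s, "signal_handler", 0x5700);

    /* SP right after the handler's return instruction, still on the alt stack. */
    shadow_sync(&s, 0x5708);
    return s.depth;
}
```

In this model only "signal_handler" should be popped, leaving three entries. The heuristic also pops "raise" and stops at "frame_with_altstack", leaving two, so a later "leaving the signal handler" notification no longer finds the state it expects.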

Comment 2 Julian Seward 2017-11-15 13:56:49 UTC

Josef, what are the chances of getting this fixed?  I have been running
firefox builds for 64-bit Windows, on Wine, on Callgrind, because I want
to profile them, and it fails for me in the same way, alas.

If it can't be easily fixed, is there any possible workaround?  I don't
care if I get slightly inaccurate profile results for a while, if that's
the short-term cost to keep Callgrind alive on such a workload.

Comment 3 Johannes Jordan 2019-09-20 18:14:47 UTC

It appears that because of this problem, we are still not able to run callgrind on mingw32-w64 builds with Wine.