Bug 395991

Summary: wine's unit tests enter a signal delivery loop under valgrind on armv7l when SIGSEGV is used
Product: [Developer tools] valgrind Reporter: Austin English <austinenglish>
Component: generalAssignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version First Reported In: 3.14 SVN   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: A possible fix
valgrind output

Description Austin English 2018-06-29 07:47:05 UTC
This affects several tests on armv7l (arm64 and amd64 are NOT affected). The list of affected tests so far:
dlls/comctl32/tests/imagelist.ok
dlls/comctl32/tests/treeview.ok
dlls/crypt32/tests/encode.ok
dlls/d3d8/tests/visual.ok
dlls/d3d9/tests/d3d9ex.ok
dlls/d3d10core/tests/device.ok
dlls/d3d11/tests/d3d11.ok
dlls/imm32/tests/imm32.ok
dlls/kernel32/tests/file.ok
dlls/kernel32/tests/heap.ok
dlls/kernel32/tests/thread.ok
dlls/kernel32/tests/virtual.ok
dlls/msacm32/tests/msacm.ok
dlls/msvcp90/tests/misc.ok
dlls/ntdll/tests/file.ok
dlls/ntdll/tests/time.ok
dlls/oleaut32/tests/safearray.ok
dlls/oleaut32/tests/tmarshal.ok
dlls/ws2_32/tests/sock.ok

I ran with --trace-signals=yes, which showed an infinite loop. E.g., for dlls/kernel32/tests/heap, I got:

--7171-- sync signal handler: signal=11, si_code=1, EIP=0x546a784, eip=0x687bbd4c, from kernel
--7171-- SIGSEGV: si_code=1 faultaddr=0xdeadbeed tid=1 ESP=0x5affb00 seg=0xbebf7000-0xfffeffff
--7171-- delivering signal 11 (SIGSEGV):1 to thread 1
--7171-- push_signal_frame (thread 1): signal 11
==7171==    at 0x546A784: GlobalFree (heap.c:762)
==7171==    by 0x5854AA3: test_heap (heap.c:220)
==7171==    by 0x58560F3: func_heap (heap.c:1233)
==7171==    by 0x57EE75F: main (test.h:615)
--7171-- VG_(sigframe_create): continuing in handler with PC=0x4fc627c
--7171-- vg_pop_signal_frame (thread 1): isRT=1 valid magic; PC=0x546a784

endlessly.

Speaking with Julian about it, armv7l may need some extra work that's already done for arm64. I need to track down some extra info for him, first:
======================
So the program takes a segfault at 0x546A784 which is GlobalFree (heap.c:762),
runs the handler, which returns (unusual for a segfault handler), and it then
restarts the faulting insn 0x546A784, so it continues indefinitely.

Can you dig into this a bit further, to verify what's going on?  I'd prefer
to be a bit more sure about it before hacking up a patch.  In particular:

(1) do you expect there to be a segfault raised at heap.c:762 ?

(2) can you identify the fault handler being run?  How is the program
    supposed to make forward progress after the handler returns?  Does
    it modify the PC in the mcontext, or is there something else going on?
    If you can find the handler, can you give me a pointer to it?

Number (2) is really the most important.
======================

I'm travelling, so that will likely be next week. I wanted to a have a bug filed that I could share with wine guys in the meantime, though ;)
Comment 1 Austin English 2018-06-29 07:48:04 UTC
Using valgrind-3.14.0.GIT-90daa486e8-20180620
Comment 2 Austin English 2018-08-05 21:25:35 UTC
4:49:22 AM	austin_laptop  if someone with decent understanding of kernel32/heap.c has a few minutes, I'd appreciate if someone could help me answer Julian's questions from https://bugs.kde.org/show_bug.cgi?id=395991 (I'm happy to forward your answers if you don't have a KDE bz account)
5:01:50 AM	_Marcus_  austin_laptop: so if we take a segfault, the handler would run into the expcetion chain
5:02:02 AM	_Marcus_  austin_laptop: but in general, globalalloc should not fault in my opinion
5:02:54 AM	_Marcus_      SetLastError(MAGIC_DEAD);
5:02:54 AM	_Marcus_      hsecond = GlobalFree(LongToHandle(0xdeadbeef)); /* bogus handle */
5:03:01 AM	_Marcus_  hmm, but this code does
5:03:07 AM	austin_laptop  thanks _Marcus_. Yeah, from a quick read of the code, I didn't see why it would segfault, but wasn't completely sure
5:03:29 AM	_Marcus_  so basically we are running into the TRY / __EXCEPT_PAGE_FAULT block in GlobalFree
5:03:37 AM	_Marcus_  the testcase causes it to fault
5:03:47 AM	_Marcus_  but this is our generic exceptipon handling
5:04:10 AM	austin_laptop  ok
5:06:40 AM	_Marcus_  something in the arm signal handling is not fully correct I would think :/
5:07:11 AM	austin_laptop  yeah, that's likely
5:07:36 AM	austin_laptop  this is armv7l rather than arm64 (and at least on VG side, is not as complete)
5:07:37 AM	_Marcus_  winedebug +heap should print the         ERR("invalid handle %p\n", hmem);
5:07:48 AM	_Marcus_  from GlobalFree, if it does not ... we are not getting there for some reason

Note: I ran with +heap, but do not get that message, so something may be broke in wine's armv7l exception handling as well.
Comment 3 Julian Seward 2018-09-17 13:07:05 UTC
Created attachment 115037 [details]
A possible fix

Austin, I think this might fix it.  But I can only test it on
my tiny test case.  Can you cause it to be tested on Wine?
Comment 4 Julian Seward 2018-09-18 07:55:58 UTC
Fixed (I think!), 43115c8058052fdb0e1c5dd76ec286519e2bfe78.
Comment 5 Austin English 2018-09-18 09:22:11 UTC
Created attachment 115058 [details]
valgrind output

While the situation has changed, it still differs from what I see on amd64. Log attached.
Comment 6 Julian Seward 2018-09-18 09:49:02 UTC
(In reply to Austin English from comment #5)
> While the situation has changed, it still differs from what I see on amd64.
> Log attached.

Hmm.  If I had to guess, I'd say that the signal frame that V creates
doesn't have enough details in it to keep the Wine signal handler happy.

Austin, can you show me the source of the signal handler involved?
Comment 7 Austin English 2018-09-19 05:38:35 UTC
(In reply to Julian Seward from comment #6)
> (In reply to Austin English from comment #5)
> > While the situation has changed, it still differs from what I see on amd64.
> > Log attached.
> 
> Hmm.  If I had to guess, I'd say that the signal frame that V creates
> doesn't have enough details in it to keep the Wine signal handler happy.
> 
> Austin, can you show me the source of the signal handler involved?

Yeah, it's here:
https://source.winehq.org/git/wine.git/blob/bfad5527acacbdbac623da57413a60c532218423:/dlls/kernel32/heap.c#l741

FYI wine's development IRC channel is #winehackers if you want to get more info (I think you already knew that, but just in case).
Comment 8 Julian Seward 2018-09-19 07:03:32 UTC
(In reply to Austin English from comment #7)
 > > Austin, can you show me the source of the signal handler involved?
> 
> Yeah, it's here:
> https://source.winehq.org/git/wine.git/blob/
> bfad5527acacbdbac623da57413a60c532218423:/dlls/kernel32/heap.c#l741

Hmm.  While that is probably related, it is not a signal handler,
and it also isn't the source of the messages you showed in comment 6.

The signal handling magic bits will somehow be hidden in the macros
__TRY (line 747), __EXCEPT_PAGE_FAULT (line 785) and __ENDTRY (line 791).
Can you show the definitions of those, so we can find the actual handler?

A signal handler will have signature "void handler(int signo)" or
(more likely) "void handler(int signo, siginfo_t* siginfo, void* context)".
Comment 9 Austin English 2018-09-19 07:23:19 UTC
(In reply to Julian Seward from comment #8)
> (In reply to Austin English from comment #7)
>  > > Austin, can you show me the source of the signal handler involved?
> > 
> > Yeah, it's here:
> > https://source.winehq.org/git/wine.git/blob/
> > bfad5527acacbdbac623da57413a60c532218423:/dlls/kernel32/heap.c#l741
> 
> Hmm.  While that is probably related, it is not a signal handler,
> and it also isn't the source of the messages you showed in comment 6.
> 
> The signal handling magic bits will somehow be hidden in the macros
> __TRY (line 747), __EXCEPT_PAGE_FAULT (line 785) and __ENDTRY (line 791).
> Can you show the definitions of those, so we can find the actual handler?
> 
> A signal handler will have signature "void handler(int signo)" or
> (more likely) "void handler(int signo, siginfo_t* siginfo, void* context)".

Thanks for the pointer / sorry for the mixup. I believe what you're looking for is in:
https://source.winehq.org/git/wine.git/blob/bfad5527acacbdbac623da57413a60c532218423:/include/wine/exception.h#l101