Bug 211529 - valgrind doesn't show proper call stacks for programs compiled by newer versions of visual c++
Summary: valgrind doesn't show proper call stacks for programs compiled by newer versi...
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: unspecified
Platform: Compiled Sources Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-10-23 12:43 UTC by Dan Kegel
Modified: 2015-02-23 22:05 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
valgrind log with msvc 2010 compiled dll (2.62 MB, text/plain)
2014-07-12 05:30 UTC, Austin English
Details
pdb filename patch (2.70 KB, patch)
2014-09-17 06:33 UTC, Mark Browning
Details
unified diff for readpdb.c (6.84 KB, patch)
2015-02-17 15:42 UTC, Mark Browning
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dan Kegel 2009-10-23 12:43:17 UTC
Version:            (using Devel)
OS:                Linux
Installed from:    Compiled sources

valgrind-3.5.0 borrowed some pdb parsing fu from Wine,
and its release notes say
"Improved Wine support, including ability to read Windows PDB debuginfo"
But there are no test cases to verify this, and it seems
there are some problems left.

I put together a set of scripts that makes it somewhat easy
to test basic stack traceback support in both valgrind+wine
and plain wine across several versions of visual C++.
It's at 
  http://kegel.com/wine/valgrind/pdbdemo.tgz
Details at
  http://kegel.com/wine/valgrind/pdbdemo/Readme.txt

It boils down to
          wine                 valgrind
vc2003    great!               no good stack?!
vc2005    no line numbers      no good stack?!
vc2008    no line numbers      no line numbers

Wine at least provides call stacks; valgrind fails to
for two of the compilers.
Wine even provides line numbers for vc2003, but 
valgrind doesn't do that for any tested release.
(Probably it does for vc 6, have to test that still.)

The test program could probably use some improvement, too.
Ideally we'd like memcheck/tests to all compile, build, and
(to the extent they're not linux-specific) pass, but
that's beyond the scope of fixing pdb support.
Comment 1 Dan Kegel 2009-11-01 15:13:05 UTC
http://bugs.winehq.org/show_bug.cgi?id=835#c13
has a fix for the wine side.  It would probably
be pretty easy for someone to do the same on the
valgrind side.
Comment 2 Tom Hughes 2009-11-01 16:49:56 UTC
It looks like that patch hasn't actually been applied to wine yet? It would probably be better until there is a final version that has been accepted and applied to wine before we think about trying to lift it.
Comment 3 Dan Kegel 2009-11-04 07:13:17 UTC
It's committed now, as of circa
http://www.winehq.org/pipermail/wine-cvs/2009-November/061171.html
Comment 4 Julian Seward 2010-01-25 12:34:35 UTC
A somewhat more general question, based on the premise that wine's
debuginfo reading code should serve as a guide: Do you have a feel
for how well wine supports vc2008 as opposed to vc2005 ?  I get the
impression vc2008 is essentially unsupported even by winedump.
I get winedump segfaulting reading vc2008-generated pdbs even for a
trivial program (memcheck/tests/errs1.c compiled with "/Zi").
Comment 5 Julian Seward 2010-01-26 03:12:03 UTC
(In reply to comment #1)
> http://bugs.winehq.org/show_bug.cgi?id=835#c13
> has a fix for the wine side.  It would probably
> be pretty easy for someone to do the same on the
> valgrind side.

Dan, I presume this hasn't happened yet, else you would
have posted the patch by now, yes?
Comment 6 Dan Kegel 2010-01-26 04:42:46 UTC
I don't think anyone has.

I'm just finishing up getting Chromium to build under Wine with msvc 2005; I'll try to get back to valgrinding wine now.  And I'll have a look at the msvc 2008 winedump problem you mentioned.
Comment 7 Julian Seward 2010-01-26 14:59:41 UTC
Might not be easy to do.  Looks like the winedump sources are
now very different to Valgrind's readpdb.c.
Comment 8 Julian Seward 2010-02-15 12:01:23 UTC
Accepting as a legit bug, but needs someone to do the heavy lifting.

FWIW, current trunk (r11046) handles line number and symbol info
fairly well for MSVC2008 compiled code, but only for non-optimised builds.
Comment 9 Austin English 2014-07-12 05:30:55 UTC
Created attachment 87700 [details]
valgrind log with msvc 2010 compiled dll

I recently compiled wine_gecko with msvc 2010. I've attached what I get with r14155
Comment 10 Mark Browning 2014-09-17 06:33:20 UTC
Created attachment 88717 [details]
pdb filename patch

I've significantly improved the parsing of pdb files generated with MSVC2010 (release/optimized build). Not significantly tested, but produces stack traces like:

==25920==    by 0x7BC7805A: server_select (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x7BC80797: NtWaitForMultipleObjects (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x7BC807F5: NtWaitForSingleObject (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x7BC4115E: server_ioctl_file (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x7BC427BB: NtDeviceIoControlFile (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x7B84192F: DeviceIoControl (in /usr/lib/i386-linux-gnu/wine/kernel32.dll.so)
==25920==    by 0x7B87E9E9: GetVolumeInformationW (in /usr/lib/i386-linux-gnu/wine/kernel32.dll.so)
==25920==    by 0x787EBEAF: ?_AfxFullPath2@@YGHPA_WPB_WPAVCFileException@@@ (in /media/psf/Home/ERA 3/mfc100u.dll)
==25920==    by 0x787EB4D6: ?Open@CFile@@UAEHPB_WIPAVCFileException@@@Z (in /media/psf/Home/ERA 3/mfc100u.dll)
==25920==    by 0x406697: CUtility::FileExists (utility.cpp:461)
==25920==    by 0x404051: CAppSettingsMgr::Initialize (appsettingsmgr.cpp:130)
==25920==    by 0x401211: CMainApp::InitInstance (tad10.cpp:119)
==25920==    by 0x788477C5: ?AfxWinMain@@YGHPAUHINSTANCE__@@0PA_WH@ (in /media/psf/Home/ERA 3/mfc100u.dll)
==25920==    by 0xBD7CC7: __tmainCRTStartup (crtexe.c:547)
==25920==    by 0x7B85E5CB: ??? (in /usr/lib/i386-linux-gnu/wine/kernel32.dll.so)
==25920==    by 0x7B85F652: start_process (in /usr/lib/i386-linux-gnu/wine/kernel32.dll.so)
==25920==    by 0x7BC799AF: ??? (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x7BC7C93C: call_thread_func (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x7BC7998D: ??? (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x7BC4E8FD: start_process (in /usr/lib/i386-linux-gnu/wine/ntdll.dll.so)
==25920==    by 0x403050C: ??? (in /usr/lib/i386-linux-gnu/libwine.so.1.0)
Comment 11 Julian Seward 2015-02-17 15:11:33 UTC
(In reply to Mark Browning from comment #10)
> Created attachment 88717 [details]
> pdb filename patch

Sounds interesting, but the patch isn't apply-able (no context lines)
and also seems to be missing some bits, in particular the addition of
struct PDB_{JG,DS}_ROOT* in struct pdb_reader.u.  Can you pls attach
a more complete patch?
Comment 12 Mark Browning 2015-02-17 15:42:30 UTC
Created attachment 91133 [details]
unified diff for readpdb.c

Sure thing. Its been a few months since I've touched this, but this is a more complete (and unified) patch. 

It touches 2 files: coregrind/m_debuginfo/readpdb.c and coregind/m_deduppoolalloc.c. The latter was necessary because all the file paths in the debuginfo were absolute, so started with the same long string and were considered duplicate by this allocator. I made the string length part of the "key" in an attempt to mitigate this, but there are probably more optimal solutions. I also stopped developing it once I got it working well enough for my purpose, so there are debug statements and things that should probably be removed.

This is actually adopted from a file in the wine project by Eric Pouech: https://github.com/wine-mirror/wine/blob/master/tools/winedump/pdb.c
Comment 13 Mark Browning 2015-02-17 15:45:32 UTC
This is the git commit I modeled the change to valgrind after: https://github.com/wine-mirror/wine/commit/2cba84027b4b4ec54141de9942e4bd5cc1a1829c
Comment 14 Julian Seward 2015-02-18 08:26:39 UTC
(In reply to Mark Browning from comment #12)
Thanks for the updated patch.

> It touches 2 files: coregrind/m_debuginfo/readpdb.c and
> coregind/m_deduppoolalloc.c. The latter was necessary because all the file
> paths in the debuginfo were absolute, so started with the same long string
> and were considered duplicate by this allocator. I made the string length
> part of the "key" in an attempt to mitigate this, but there are probably
> more optimal solutions.

I'm a bit confused by this.  As far as I can see, m_deduppoolalloc
should not consider the strings to be duplicates unless they really
are.  Are you saying that it commoned up some nearly but not actually
identical strings, when it should not have done so?  Or are you saying
that readpdb.c somehow requires identical copies of strings to remain
separate?  I'm reluctant to apply the m_deduppoolalloc change because
I think it changes the semantics of m_deduppoolalloc.  But only Philippe
can tell us that for sure.
Comment 15 Julian Seward 2015-02-18 12:57:52 UTC
Committed, with mods, r14942.  Thanks for the patch.  I removed the m_deduppoolalloc
change -- didn't seem to be necessary.
Comment 16 Mark Browning 2015-02-18 16:02:46 UTC
You make the decisions that are right for the project, I'm just a user with a patch ;)

I believe that the problem is the dedup isn't checked if the string matches exactly, it was sufficient for the adler32 hash/key of the string to match. However, I was getting hash collisions and not-duplicate strings were treated as duplicates. Adding the "length" of the block to be allocated to the key schedule improved the collisions for the use case I had; I don't think it changed the semantics, just the implementation details. However, it is a reasonably hasty change made 7 months prior, things might be working now. 

Unfortunately, its not obvious it is happening until you're actually trying to debug using a stack trace and things don't make sense at all given your knowledge of the as-written callchain.
Comment 17 Philippe Waroquiers 2015-02-23 22:05:14 UTC
(In reply to Mark Browning from comment #16)

> I believe that the problem is the dedup isn't checked if the string matches
> exactly, it was sufficient for the adler32 hash/key of the string to match.
If that is the case, then that would be a critical bug in the dedup pool:
the idea of the dedup pool is to store once identical elements.
If ever non identical elements would get 'merged', that would cause a disaster.
The adler32 hash/key is used as a fast way to search in hash table
and as a fast first criterion to compare elements.
However, when the hash key is identical, the full element is compared.
See m_deduppoolalloc.c, function cmp_pool_elt, and
m_hashtable.c VG_(HT_gen_lookup)

> However, I was getting hash collisions and not-duplicate strings were
> treated as duplicates. 
Are you sure ? I cannot see how that can happen, and m_deduppoolalloc.c
is used in several critical data structures (cfi unwind information,
file name, directory name, ...).
I am running now a slightly modified version, where the adler32 call is removed,
and the key is just assigned 1 for all elements.
That significantly slows down valgrind (as the dedup is now done via
a linear search) but no functional effect is being observed (on linux).

> Adding the "length" of the block to be allocated to
> the key schedule improved the collisions for the use case I had; I don't
> think it changed the semantics, just the implementation details. However, it
> is a reasonably hasty change made 7 months prior, things might be working
> now. 
> 
> Unfortunately, its not obvious it is happening until you're actually trying
> to debug using a stack trace and things don't make sense at all given your
> knowledge of the as-written callchain.
I suspect that what you observed was caused by something else, maybe
incorrect debug info, or another bug in the pdb debug info reader.

But in case you can reproduce such non identical elements being merged by the dedup
pool then please open another (critical) bug.

Thanks