Version: 3.1 SVN (using KDE KDE 3.3.2)
Installed from: Fedora RPMs
Compiler: gcc version 3.4.3 20041212 (Red Hat 3.4.3-9.EL4) App compiled with PathScale EKOPath 2.1 compiler
OS: Linux

I have an app compiled with the PathScale 2.2 compilers, which valgrind is unable to even load successfully. I am using the latest valgrind SVN trunk rev, x86_64, RHEL4. I get this crash:

valgrind: the 'impossible' happened:
   Killed by fatal signal
==12103==    at 0x7004BAC6: vgModuleLocal_read_debuginfo_dwarf2 (dwarf.c:924)
==12103==    by 0x7002EB65: read_lib_symbols (symtab.c:1749)
==12103==    by 0x7002ED51: vgPlain_read_seg_symbols (symtab.c:1803)
==12103==    by 0x7002B286: vgPlain_di_notify_mmap (symtab.c:197)
==12103==    by 0x70037954: vgModuleLocal_generic_PRE_sys_mmap (syswrap-generic.c:1807)
==12103==    by 0x700483E6: vgSysWrap_amd64_linux_sys_mmap_before (syswrap-amd64-linux.c:1151)
==12103==    by 0x70048A1E: vgPlain_client_syscall (syswrap-main.c:653)
==12103==    by 0x70032FBD: handle_syscall (scheduler.c:618)
==12103==    by 0x700332CF: vgPlain_scheduler (scheduler.c:720)
==12103==    by 0x70053082: vgModuleLocal_thread_wrapper (syswrap-linux.c:82)
==12103==    by 0x70044ADD: run_a_thread_NORETURN (syswrap-amd64-linux.c:117)
> I am using the latest valgrind SVN trunk rev, x86_64, RHEL4.

Does readelf think this exe/.so is OK? Does it work with 3.0.1, or is this a regression? Either way .. basically we'll need the object to chase this down. Possible?
Created attachment 12792: vg.out
Complete valgrind output for the failing run, as collected with -v.
Comment on attachment 12792 (vg.out): I have a newer output file.
Created attachment 12793: vg.out
Here's that newer output file.
I can't tell whether the app works with 3.0.1 any longer, as I don't have it any more. I doubt that it's a regression.
What does "readelf -S /usr/lib64/libmpichf90nc.so.2.0" say?

Judging by the fact that the fault is on address zero, I suspect we will find that it has a debug_line section but no debug_info section, which is a bit odd.

The symbol table reader in valgrind currently uses the presence of debug_line to indicate DWARF2 and then assumes that debug_info will be present. That, combined with the fact that the loop in ML_(read_debuginfo_dwarf2) will go mad if the size of the debug_info section is less than four bytes, would cause this sort of crash.
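To illustrate how those two conditions could combine, here is a minimal sketch. It assumes the loop is bounded by something like `block < end - 4`, which is only a guess at the shape of the code in dwarf.c, not a quote from it. `loop_would_run` is a hypothetical helper, and addresses are modelled as plain integers so that nothing is actually dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a loop guard of the form `block < end - 4`.
   Returns true if the loop body would be entered at least once. */
bool loop_would_run(uintptr_t section_start, size_t section_size)
{
    uintptr_t end = section_start + section_size;

    /* Unsigned subtraction wraps around when end < 4, so a missing
       section (start 0, size 0) yields a limit near the top of the
       address space instead of a negative value. */
    uintptr_t limit = end - 4;

    return section_start < limit;
}
```

Under that assumption, a section that was never found (start 0, size 0) passes the guard, and the first read inside the loop would touch address zero, which matches the fault described above.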
On Tue, 4 Oct 2005, Tom Hughes wrote:

> The symbol table reader in valgrind is currently using the presence of
> debug_line to indicate DWARF2 and then assuming that debug_info will be
> present.

So

   if (debug_line) {

should become

   if (debug_line && debug_info) {

? And maybe add in (debug_info_sz > 4), and possibly check that debug_str and debug_abbv are non-NULL.
Something like that, yes.
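For reference, the check being discussed could be sketched roughly as follows. This is not the committed fix: `DwarfSections` and `dwarf2_sections_usable` are hypothetical names made up for this example, and whether debug_str and the abbreviation section really need to be mandatory is left open in the thread ("possibly").

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the section pointers found while scanning
   an object's section headers. A section that is absent from the file
   keeps its default of NULL / 0. */
typedef struct {
    const unsigned char *debug_line;
    const unsigned char *debug_info;
    size_t               debug_info_sz;
    const unsigned char *debug_abbrev;
    const unsigned char *debug_str;
} DwarfSections;

/* Returns true only if every section the DWARF2 reader would touch is
   present and debug_info is larger than four bytes. */
bool dwarf2_sections_usable(const DwarfSections *s)
{
    /* debug_line alone is not enough to assume DWARF2 is readable. */
    if (s->debug_line == NULL || s->debug_info == NULL)
        return false;

    /* The (debug_info_sz > 4) condition suggested above. */
    if (s->debug_info_sz <= 4)
        return false;

    /* The optional extra checks mentioned above. */
    if (s->debug_abbrev == NULL || s->debug_str == NULL)
        return false;

    return true;
}
```

A caller would then skip the DWARF2 reader entirely when this returns false, rather than passing it a zero pointer.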
> ? And maybe add in (debug_info_sz > 4), and possibly check that debug_str
> and debug_abbv are non-NULL.

Or perhaps .. for dealing with potentially explosive pointers like this, we could inquire with aspacem whether it's safe to dereference (perhaps by checking that the pointer points into the same segment that the .so has been transiently mmaped into). It's two calls to VG_(am_find_nsegment), but that's not catastrophically expensive since it's a binary search now.
In message <20051004153039.9255.qmail@ktown.kde.org> Julian Seward <jseward@acm.org> wrote:

> Or perhaps .. for dealing with potentially explosive pointers like
> this, we could inquire with aspacem whether it's safe to dereference
> (perhaps by checking that the pointer points into the same
> segment that the .so has been transiently mmaped into). It's
> two calls to VG_(am_find_nsegment), but that's not catastrophically
> expensive since it's a binary search now.

Well the address should be part of memory that we have just mmaped so it shouldn't really be bogus... The problem here is that there was no such section in the file so the default value of zero is still in the variable.

Tom
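The "does this pointer lie inside the mapping" idea can be sketched without aspacem at all, as a plain range check against the region the object was mapped into. `range_inside_mapping` is a hypothetical helper written for this example; it does not use VG_(am_find_nsegment) or any other valgrind interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the len bytes starting at p lie entirely inside the
   mapping [map_base, map_base + map_len). A NULL p, i.e. a section
   pointer still at its default of zero, is always rejected. */
bool range_inside_mapping(const void *p, size_t len,
                          const void *map_base, size_t map_len)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)map_base;

    if (p == NULL || len > map_len)
        return false;
    if (addr < base)
        return false;

    /* Compare offsets rather than end addresses to avoid overflow. */
    return (addr - base) <= (map_len - len);
}
```

As noted above, in this particular crash the pointer is simply zero because the section does not exist, so a plain non-NULL test would catch it as well; the range check only adds protection against pointers that are non-zero but still bogus.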
By the way, the compiler in question has a history of generating somewhat questionable debug info. If you think it's a compiler bug, valgrind should probably still handle it (since the compiler is in the wild), but please let me know, and I'll make sure any bug gets fixed. Thanks.
I have committed a fix for this as revision 4856.