for compressed debuginfo valgrind doesn't display source/line info. Reproducible: Always Steps to Reproduce: $ cat t.cpp int* p = 0; int main() { *p = 0; } $ gcc t.cpp -gdwarf-4 -g2 -o t $ valgrind ./t ==10551== Invalid write of size 4 ==10551== at 0x4004F7: main (t.cpp:4) $ objcopy --compress-debug-sections ./t $ valgrind ./t ==10568== Invalid write of size 4 ==10568== at 0x4004F7: main (in /home/users/pawels/bugs/t)
Neither it seems do lots of other things - is this something the DWARF standard actually allows? or just some wierd GNU extension? In any case it's probably unlikely that valgrind would support it as it would require linking in a copy of zlib...
Technology note for anybody that looks at this in the future - the implementation of this feature appears to be that each .debug_xxx section is replaced by a .zdebug_xxx section that has been zlib compressed. Note that some binutils tools like objdump will attempt to hide this fact and show the original section, decompressing on the fly if you try and read it.
(In reply to comment #2) > Technology note for anybody that looks at this in the future - the > implementation of this feature appears to be that each .debug_xxx section is > replaced by a .zdebug_xxx section that has been zlib compressed. > > Note that some binutils tools like objdump will attempt to hide this fact > and show the original section, decompressing on the fly if you try and read > it. binutils/gdb contain small bfd/compress.c utility for handling .zdebug_* sections.
(In reply to comment #1) > In any case it's probably unlikely that valgrind would support it as it > would require linking in a copy of zlib... I don't see that as a big deal technically .. so that would just leave the question of whether it is OK from a license-compatbility point of view. Assuming, of course, that we decide to support this. How does this relate to the DWZ initiative?
It's completely unrelated to DWZ I think.
Do we need to move this forward? Is .zdebug gaining mainstream acceptance?
I think this bug is the only reference I've come across to it. As far as I know DWZ is the more mainstream approach at the moment.
dwz is completely different thing. dwz optimizes dwarf data while objdump just compress (zlib) whole sections. the difference is visible on non-trivial shared objects compiled with -g2 in few ways: 1). when you have a big application with 200+ big shared objects in use then the gdb startup time is much shorter with compressed debug info. this is a major benefit for debug/build/debug cycles. 2). relinking libraries is faster (linker reads/writes less data from deps with compressed sections). one more benefit for debug/build cycles. 3). objdump compression is more effective than dwz. 170M libgenChip.so 65M libgenChip.so.zdebug 148M libgenChip.so.dwz 60M libgenChip.so.dwz.zdebug
> Do we need to move this forward? Is .zdebug gaining mainstream acceptance? Yes and yes. Debian now defaults to compressing their debugging symbols (this change was made in debhelper v9, if anyone's curious). On x86_64 you can at least get call stacks, since .eh_frame is mandatory, but on, say, ARM or ARM64 it's not. So the call stacks stop at the first C-only library.
(In reply to Роман Донченко from comment #9) > > Do we need to move this forward? Is .zdebug gaining mainstream acceptance? > > Yes and yes. Debian now defaults to compressing their debugging symbols > (this change was made in debhelper v9, if anyone's curious). Debian does not seems to use GNU .zdebug compression, but a different kind of ELF section compression. They seem to have adopted the new ELF standardized SHF_COMPRESSED variant. Which is somewhat strange since I don't believe any release of binutils, nor elfutils, supports that yet (it will be in binutils 2.26 and elfutils 0.165, but those have not yet been released). So effectively nothing supports that kind of ELF section compression yet. It is also a bit questionable whether this kind of compression of debug sections is really beneficial. It prevents easy indexing and lazy loading of data, causing huge startup delays whenever any debuginfo is needed. It would be better to adopt dwz style compression. > On x86_64 you can at least get call stacks, since .eh_frame is mandatory, > but on, say, ARM or ARM64 it's not. So the call stacks stop at the first > C-only library. That is a different Debian bug though, gcc: Enable -fasynchronous-unwind-tables on more arches: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=746426
(In reply to Mark Wielaard from comment #10) > Debian does not seems to use GNU .zdebug compression, It does, though not for all packages yet. Here's an example: <https://packages.debian.org/jessie/debug/libopenjpeg5-dbg>. > On x86_64 you can at least get call stacks, since .eh_frame is mandatory, > > but on, say, ARM or ARM64 it's not. So the call stacks stop at the first > > C-only library. > > That is a different Debian bug though, gcc: Enable > -fasynchronous-unwind-tables on more arches: > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=746426 I wouldn't really call that a bug. Yeah, they don't build with that flag, but they don't have to. Debugging symbols provide the same information.
(In reply to Роман Донченко from comment #11) > (In reply to Mark Wielaard from comment #10) > > Debian does not seems to use GNU .zdebug compression, > > It does, though not for all packages yet. Here's an example: > <https://packages.debian.org/jessie/debug/libopenjpeg5-dbg>. O, indeed that one does use GNU .zdebug compression. But others like https://packages.debian.org/stretch/libc6-dbg seem to use a different ELF section compression method (use normal .debug names and set the SHF_COMPRESSED flag to indicate that the section data contains a ELF_Chdr plus compressed data). Now I am slightly confused, what is Debian really using and why? > > On x86_64 you can at least get call stacks, since .eh_frame is mandatory, > > > but on, say, ARM or ARM64 it's not. So the call stacks stop at the first > > > C-only library. > > > > That is a different Debian bug though, gcc: Enable > > -fasynchronous-unwind-tables on more arches: > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=746426 > > I wouldn't really call that a bug. Yeah, they don't build with that flag, > but they don't have to. Debugging symbols provide the same information. Except that most people won't have the debug files installed, not all packages have them and if they use GNU .zdebug or ELF section compression then, as this bug points out, valgrind won't be able to use them :) Other distros do use -fasynchronous-unwind-tables on all arches to make sure the unwind tables are always there, which does make lots of tools (not just valgrind) happy.
(In reply to Mark Wielaard from comment #12) > (In reply to Роман Донченко from comment #11) > > It does, though not for all packages yet. Here's an example: > > <https://packages.debian.org/jessie/debug/libopenjpeg5-dbg>. > > O, indeed that one does use GNU .zdebug compression. But others like > https://packages.debian.org/stretch/libc6-dbg seem to use a different ELF > section compression method (use normal .debug names and set the > SHF_COMPRESSED flag to indicate that the section data contains a ELF_Chdr > plus compressed data). > > Now I am slightly confused, what is Debian really using and why? I see what's going on now. stretch (being in development) uses a bleeding-edge version of binutils, which appears to have changed the semantics of --compress-debug-sections. Now it creates SHF_COMPRESSED .debug sections rather than .zdebug sections. jessie uses binutils 2.25, so it has the .zdebug sections. So it does appear that SHF_COMPRESSED is the future. .zdebug, however, is the present, so ideally Valgrind should support both.
Since libc and a number of binaries have now been built with this compressed debug info on Debian (and quite possibly on its derivatives, too), valgrind is sadly becoming less and less useful. Any progress on this request would be much appreciated! Thanks, Julian
It's highly unlikely that there will be any quick progress, because any fix would be require valgrind to be able to decompress zlib compressed data but valgrind is not able to link to libraries so we can't just use zlib like most programs would.
FWIW, it's not necessary to use zlib itself. For example, miniz (https://github.com/richgel999/miniz) is a single-file library that implements zlib decoding.
That's certainly useful to know about - it even has an "inflate only" version. It will still need some patching to stop it use memcpy/memset etc.
An alternative is also the simple/super small 'inflate' implementation in zlib code zlib-1.2.8/contrib/puff.h and puff.c This is a fully independent inflate implementation (no #include). There are some drawbacks (2 times slower than the real zlib inflate, and as it does not do memory allocation, inflate fails if the target buffer is too small (and so, you must redo the inflate with a bigger buffer). If the debug info stores the uncompressed size, then this is not a problem
Created attachment 97607 [details] Compressed debug sections support for Valgrind - rev 1 I suggest the following solution. The patch has been tested on MIPS32/64. Applying the patch: patch -p1 < compressed-dwarf-support.diff Note: tinfl.c from miniz project (https://github.com/richgel999/miniz) is used without modifications.
Aleksandar, great, that works for postgres on debian unstable! I do however get a number of mostly harmless warnings: In file included from m_debuginfo/image.c:52:0: ./tinfl.c: In function ‘tinfl_decompress_mem_to_heap’: ./tinfl.c:514:11: warning: left-hand operand of comma expression has no effect [-Wunused-value] MZ_FREE(pBuf); *pOut_len = 0; return NULL; ^ In file included from m_debuginfo/image.c:52:0: ./tinfl.c:523:11: warning: left-hand operand of comma expression has no effect [-Wunused-value] MZ_FREE(pBuf); *pOut_len = 0; return NULL; ^ In file included from m_debuginfo/image.c:52:0: ./tinfl.c: In function ‘tinfl_decompress_mem_to_callback’: ./tinfl.c:560:8: warning: left-hand operand of comma expression has no effect [-Wunused-value] MZ_FREE(pDict); ^ In file included from m_debuginfo/image.c:52:0: ./tinfl.c:29:0: warning: "MINIZ_HAS_64BIT_REGISTERS" redefined #define MINIZ_HAS_64BIT_REGISTERS 1 ^ m_debuginfo/image.c:51:0: note: this is the location of the previous definition #define MINIZ_HAS_64BIT_REGISTERS ( VG_WORDSIZE == 8 ) ^ In file included from m_debuginfo/image.c:52:0: ./tinfl.c: In function ‘tinfl_decompress_mem_to_heap’: ./tinfl.c:514:11: warning: left-hand operand of comma expression has no effect [-Wunused-value] MZ_FREE(pBuf); *pOut_len = 0; return NULL; ^ In file included from m_debuginfo/image.c:52:0: ./tinfl.c:523:11: warning: left-hand operand of comma expression has no effect [-Wunused-value] MZ_FREE(pBuf); *pOut_len = 0; return NULL; ^ In file included from m_debuginfo/image.c:52:0: ./tinfl.c: In function ‘tinfl_decompress_mem_to_callback’: ./tinfl.c:560:8: warning: left-hand operand of comma expression has no effect [-Wunused-value] MZ_FREE(pDict); ^
Created attachment 97742 [details] Avoid warnings from tinfl.c In order to avoid these warnings we need to modify tinfl.c in some way. I suggest the attached solution. MZ_MALLOC/REALLOC/FREE, memcpy, memset, etc. are replaced with appropriate Valgrind functions. Tinfl is also added to Makefile, so it can be used from other parts of project. The patch can be applied after applying compressed-dwarf-support.diff. To apply the patch use command: patch -p1 < tinfl-modifications.diff The alternative solution is to delete "High level decompression functions" from tinfl.c.
These patches don't work with the current release: ../coregrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-image.o): In function `get_slowcase': /tmp/valgrind/valgrind-3.11.0/coregrind/m_debuginfo/image.c:636: undefined reference to `tinfl_decompress_mem_to_mem' collect2: error: ld returned 1 exit status make[3]: *** [memcheck-amd64-linux] Error 1
It looks strange... Have you run ./autogen.sh before compiling ? I've tested the same configuration (amd64, linux) with trunk and VALGRIND_3_11_0 revisions. However, the patch is not tested on arm and ppc. (In reply to Quanah Gibson-Mount from comment #22) > These patches don't work with the current release: > > ../coregrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-image.o): > In function `get_slowcase': > /tmp/valgrind/valgrind-3.11.0/coregrind/m_debuginfo/image.c:636: undefined > reference to `tinfl_decompress_mem_to_mem' > collect2: error: ld returned 1 exit status > make[3]: *** [memcheck-amd64-linux] Error 1
(In reply to Aleksandar Rikalo from comment #23) > It looks strange... > Have you run ./autogen.sh before compiling ? No, I didn't realize that was required. I'll give it another shot, as I also need the patch to correctly handle software that dl_close's objects as well (https://bugs.kde.org/show_bug.cgi?id=79362)
That patch doesn't build on Debian amd64, at least on top of Debian's version of valgrind: gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../include -I../VEX/pub -I../VEX/pub -DVGA_amd64=1 -DVGO_linux=1 -DVGP_amd64_linux=1 -DVGPV_amd64_linux_vanilla=1 -I../coregrind -DVG_LIBDIR="\"/usr/lib/valgrind"\" -DVG_PLATFORM="\"amd64-linux\"" -m64 -O2 -g -std=gnu99 -Wall -Wmissing-prototypes -Wshadow -Wpointer-arith -Wstrict-prototypes -Wmissing-declarations -Wcast-align -Wcast-qual -Wwrite-strings -Wempty-body -Wformat -Wformat-security -Wignored-qualifiers -Wmissing-parameter-type -Wold-style-declaration -fno-stack-protector -fno-strict-aliasing -fno-builtin -fomit-frame-pointer -DENABLE_LINUX_TICKET_LOCK -g -O2 -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -I/usr/include/x86_64-linux-gnu -c -o m_debuginfo/libcoregrind_amd64_linux_a-readelf.o `test -f 'm_debuginfo/readelf.c' || echo './'`m_debuginfo/readelf.c m_debuginfo/readelf.c:67:6: error: conflicting types for ‘Elf32_Chdr’ } Elf32_Chdr; ^ In file included from m_debuginfo/readelf.c:56:0: /usr/include/elf.h:385:3: note: previous declaration of ‘Elf32_Chdr’ was here } Elf32_Chdr; ^ m_debuginfo/readelf.c:76:6: error: conflicting types for ‘Elf64_Chdr’ } Elf64_Chdr; ^ In file included from m_debuginfo/readelf.c:56:0: /usr/include/elf.h:393:3: note: previous declaration of ‘Elf64_Chdr’ was here } Elf64_Chdr; ^ Makefile:4878: recipe for target 'm_debuginfo/libcoregrind_amd64_linux_a-readelf.o' failed
Thank you for providing the patches. I get the same compilation problem on Solaris because Elf32/64_Chdr come from the VKI interface. I think the problem lies here in coregrind/m_debuginfo/readelf.c: ============================ +#if !defined(Elf32_Chdr) + typedef struct { + Elf32_Word ch_type; + Elf32_Word ch_size; + Elf32_Word ch_addralign; + } Elf32_Chdr; +#endif ... ============================ However it is incorrect to assume that a typedef can be checked via #if defined(). Morever, such definitions should be placed in the corresponding vki-* header files (if really necessary). After removing Elf32_Chdr, Elf64_Chdr, SHF_COMPRESSED and ELFCOMPRESS_ZLIB from readelf.c, the patches compile fine and regression testing went ok on Solaris. It would be also nice if a simple test is provided. Ideally the configure would check if the corresponding command line option is supported and has an effect. Then it would enable/disable the test based on its availability. Definitions of MINIZ_LITTLE_ENDIAN and MINIZ_HAS_64BIT_REGISTERS (possibly others) need to be consistent across the board. Currently these are set in image.c and tinfl.c itself, based on different criteria. Stick to those in image.c. Also placing tinfl.c directly into coregrind is not appropriate. If m_debuginfo is the only consumer than tinfl.c needs to be moved there. If not, then pub_core_tinfl.h needs to present a proper Valgrind interface usable and understandable by other parts of the core, not the tinfl.c mess.
Created attachment 98286 [details] Corrections according to comment #26
Created attachment 98287 [details] .zdebug sections support
Created attachment 98288 [details] Test cases
Thank you for suggestions. I've prepared few patches: ElfXX_Chdr-build-fixup.diff - tinfl is moved to m_debuginfo, checking for Elf32/64_Chdr structs is placed into "configure" and line endings in tinfl.c are converted to Unix style. Elf32/64_Chdr, SHF_COMPRESSED and ELFCOMPRESS_ZLIB come from libc (<elf.h>) and exist only in newer versions of libc. These structures/constants are not present in Kernel, so I didn't move them to vki. We can place them to separate header or pub_tool_libcbase.h, if you think it's more appropriate. zdebug-support.diff - support for .zdebug sections is added to readelf.c. Three kinds of compressed debug sections are supported by Binutils 2.26 (zlib, zlib-gnu and zlib-gabi) and I have tested Valgrind with all of them on MIPS/X86 Linux. According to the manual (https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html), GCC should support -gz=zlib and -gz=zlib-gnu options for generating compressed debug sections, but I haven't succeeded to set up the environment in which it works. I've written two simple Memchek tests (compressed_sections_test.diff) which use these options, but they are not tested properly. The order of applying patches (after applying compressed-dwarf-support.diff and tinfl-modifications.diff) : patch -p1 < ElfXX_Chdr-build-fixup.diff patch -p1 < zdebug-support.diff patch -p1 < compressed_sections_test.diff
Thank you for the patches and for addressing my comments. Your work is really appreciated. Overall it looks in a good shape now. Please fold all the patches into one, so it's better maintainable. I have only the following remaining comments: +++ configure.ac 1) Please use autoconf macros AC_CHECK_TYPE/AC_CHECK_TYPES for checking Elf{32,64}_Chdr typedefs. See for example: https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Generic-Types.html#Generic-Types +++ coregrind/m_debuginfo/image.c 2) When including "tinfl.c", do we want to define "TINFL_HEADER_FILE_ONLY"? 3) typo: + // Virtual size of the image = real size + size of uncopressed data uncopressed => uncompressed 4) typo: + // Number of compressed slices are used delete 'are' 5) CEnt.data has now non-fixed size. Why CACHE_ENTRY_SIZE is still used in various places around image.c; for example in alloc_CEnt() and realloc_CEnt()? 6) In function find_cslc(), please use {} braces for better readability and put 'return' statement to its own line. You can declare 'i' inside the for loop as we use C99. 7) In function find_cslc(), use proper type 'UInt' instead of 'int', to match that of 'cslc_used'. 8) Width of all lines is 80 characters max - occurrences in alloc_CEnt(), realloc_CEnt(), get_slowcase(), ML_(img_mark_compressed_part)(). 9) Should be two lines, really: + if (len > ce->size) len = ce->size; 10) typo: + // get copressed data 11) should be on separate lines: + if ((cslc != NULL) && (cslc->szD > CACHE_ENTRY_SIZE)) size = cslc->szD; 12) Please be explicit: + vg_assert(img); => vg_assert(img != NULL); +++ coregrind/m_debuginfo/priv_image.h 13) Make it a proper sentence: +/* Real size of the image */ => +/* Real size of the image. */ 14) Make it a proper sentence: + Returns (virtual) position in image from which decompressed data can be read */ => Returns (virtual) position in image from which decompressed data can be read. */ 15) Lines too long (exceed 80 chars): Returns (virtual) position in image from which decompressed data can be read */ DiOffT ML_(img_mark_compressed_part)(DiImage* img, DiOffT offset, SizeT szC, SizeT szD); +++ coregrind/m_debuginfo/readelf.c 16) Magic '12' is used multiple times, please make it a #define or a constant. + } else if (h->sh_size > 12) { 17) [multiple times] The exclamation mark is really unnecessary here, the whole message you get is scary per se (check other occurrences of ML_(symerr)() in this module). + ML_(symerr)(di, True, " Unknown compression type!"); \ +++ memcheck/tests/Makefile.am 18) Use @FLAG_W_NO_UNINITIALIZED@ (see configure.ac) instead of plain -Wno-uninitialized. +++ memcheck/tests/cdebug.c 19) A compiler could see that the program always returns 0 and might optimize 'if (x)' out. Perhaps you can return different values? 20) I don't see any coregrind/Makefile changes to build m_debuginfo/tinfl.c? +++ coregrind/m_debuginfo/tinfl.c 21) What kind of license your modifications fall under? tinfl.c seems to be under "UNLICENSE" whereas the rest of Valgrind is under GPLv2+. 22) We should use proper Valgrind types, not standard C ones here: typedef unsigned char mz_uint8; typedef signed short mz_int16; typedef unsigned short mz_uint16; typedef unsigned int mz_uint32; typedef unsigned int mz_uint; typedef unsigned long long mz_uint64; I was able to get it built on amd64/Solaris and regression testing went fine. However I don't have any system with toolchain supporting '-gz' at hand. I assume you tested on MIPS. Anyone can test on a different architecture or distribution?
Created attachment 98374 [details] Compressed debug sections support for Valgrind - rev 2
Created attachment 98375 [details] Test cases - rev2
(In reply to Ivo Raisr from comment #31) Thank You for reviewing. I have fixed all things that you mention. The full patch is attached, and the new version of tests is also attached. > 2) When including "tinfl.c", do we want to define "TINFL_HEADER_FILE_ONLY"? > 20) I don't see any coregrind/Makefile changes to build m_debuginfo/tinfl.c? In case we use #include "tinfl.c" without defining TINFL_HEADER_FILE_ONLY, we don't need tinfl.c in Makefile. If we include header only (by defining TINFL_HEADER_FILE_ONLY before #include) then tinfl.c needs to be compiled separately (this solution is applied in rev2 patch). > 5) CEnt.data has now non-fixed size. Why CACHE_ENTRY_SIZE is still used in > various places > around image.c; for example in alloc_CEnt() and realloc_CEnt()? CACHE_ENTRY_SIZE is still in use as default (and minimal) size of cache entry. Larger entries will be allocated in case size of the uncompressed data is grater than CACHE_ENTRY_SIZE. > However I don't have any system with toolchain supporting '-gz' at hand. > I assume you tested on MIPS. Anyone can test on a different architecture or > distribution? It seems that nobody has GCC which supports -gz, so the test is useless for now.
Created attachment 98420 [details] slightly modified patch rev3 Patch rev2 by Aleksandar slightly modified to include bug description in NEWS, fix some long lines in readelf.c and image.c and ignore built test executables for SVN.
Thank you, Aleksandar, for patches rev2. I have attached rev3 and run it on amd64/Solaris 12 and amd64/Ubuntu. Regression testing went ok. Actually Solaris 12 has working combination of gcc 5.3 and ld so that -gz=zlib-gnu is supported there and the executable has all sorts of .zdebug sections. So that test cases passes. For -gz=zlib I suppose we should wait a bit until it gets into distributions.
No regressions on AMD64/Darwin, for what that's worth, with compressed-003.patch
Dilyan Palauzov <dilyan.palauzov@aegee.org> reports duplicate line in configure.ac: AC_CHECK_TYPES([Elf32_Chdr, Elf64_Chdr], [], [], [[#include <elf.h>]])
Fixed in SVN r15868.
Follow up fix in SVN r15869.