"valgrind --tool=addrcheck XYZ" tells me "Sorry. You could try using a tool that uses less memory; eg. addrcheck instead of memcheck." where XYZ uses 75MB right after startup. Valgrind 2.0.0 works, 2.2.0 doesn't. Any ideas? Should i stick to valgrind 2.0.0? Thanks Lenny
Can you please attach the full output when you give Valgrind the -v option? Thanks.
==12444== Addrcheck, a fine-grained address checker for x86-linux.
==12444== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward et al.
==12444== Using valgrind-2.2.0, a program supervision framework for x86-linux.
==12444== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==12444== Valgrind library directory: /usr/lib/valgrind
==12444== Command line
==12444== ./XYZ
==12444== Startup, with flags:
==12444== --tool=memcheck
==12444== -v
==12444== --tool=addrcheck
==12444== --trace-children=yes
==12444== --
==12444== Contents of /proc/version:
==12444== Linux version 2.4.18 (root@placebo) (gcc version 2.95.4 20011002 (Debian prerelease)) #1 SMP Thu Jul 24 09:26:10 CEST 2003
==12444== Reading syms from XYZ (0x8048000)
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
==12444== Reading syms from /lib/ld-2.3.2.so (0x3412A000)
==12444== object doesn't have a symbol table
==12444== object doesn't have any debug info
==12444== Reading syms from /usr/lib/valgrind/stage2 (0xB0000000)
==12444== Reading syms from /lib/ld-2.3.2.so (0xB1000000)
==12444== object doesn't have a symbol table
==12444== object doesn't have any debug info
==12444== Reading syms from /lib/libdl-2.3.2.so (0xB102E000)
==12444== object doesn't have a symbol table
==12444== object doesn't have any debug info
==12444== Reading syms from /lib/libc-2.3.2.so (0xB1031000)
==12444== object doesn't have a symbol table
==12444== object doesn't have any debug info
==12444== Reading syms from /usr/lib/valgrind/vgskin_addrcheck.so (0xB1365000)
==12444== Reading suppressions file: /usr/lib/valgrind/default.supp
==12444== REDIRECT soname:libc.so.6(__GI___errno_location) to soname:libpthread.so.0(__errno_location)
==12444== REDIRECT soname:libc.so.6(__errno_location) to soname:libpthread.so.0(__errno_location)
==12444== REDIRECT soname:libc.so.6(__GI___h_errno_location) to soname:libpthread.so.0(__h_errno_location)
==12444== REDIRECT soname:libc.so.6(__h_errno_location) to soname:libpthread.so.0(__h_errno_location)
==12444== REDIRECT soname:libc.so.6(__GI___res_state) to soname:libpthread.so.0(__res_state)
==12444== REDIRECT soname:libc.so.6(__res_state) to soname:libpthread.so.0(__res_state)
==12444== REDIRECT soname:libc.so.6(stpcpy) to *vgpreload_memcheck.so*(stpcpy)
==12444== REDIRECT soname:libc.so.6(strnlen) to *vgpreload_memcheck.so*(strnlen)
==12444== REDIRECT soname:ld-linux.so.2(stpcpy) to *vgpreload_memcheck.so*(stpcpy)
==12444== REDIRECT soname:ld-linux.so.2(strchr) to *vgpreload_memcheck.so*(strchr)
==12444==
==12444== Reading syms from /usr/lib/valgrind/vg_inject.so (0x34144000)
==12444== Reading syms from /usr/lib/valgrind/vgpreload_addrcheck.so (0x34149000)
==12444== Reading syms from UVW.so (0x34151000)
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
==12444== Reading syms from STU.so (0x341AC000)
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
==12444== Reading syms from ABC.so (0x342A0000)
@@ unlikely looking definition in unparsed remains ";"
@@ unlikely looking definition in unparsed remains ";"
VG_(get_memory_from_mmap): newSuperblock's request for 1048576 bytes failed.
VG_(get_memory_from_mmap): 31024261 bytes already allocated.
Sorry. You could try using a tool that uses less memory; eg. addrcheck instead of memcheck.
It's strange that Addrcheck is running out of memory after having allocated only 31 MB. Can you run the following program under Addrcheck:

    #include <unistd.h>
    int main(void) { read(0, 0, 1); return 0; }

and when it pauses, look at /proc/<pid>/maps, where <pid> is the pid shown at the start of each line of Addrcheck's output, and post that output here? Also, the output of just 'cat /proc/self/maps' might help too. Thanks.
kudlinl@placebo:~$ cat /proc/21498/maps
08048000-08049000 r-xp 00000000 08:02 771500 /home/kudlinl/a.out
08049000-0804a000 rw-p 00000000 08:02 771500 /home/kudlinl/a.out
3412a000-34140000 r-xp 00000000 08:02 164341 /lib/ld-2.3.2.so
34140000-34141000 rw-p 00015000 08:02 164341 /lib/ld-2.3.2.so
34142000-34143000 rw-p 00000000 00:00 0
34144000-34145000 r-xp 00000000 08:02 1065193 /usr/lib/valgrind/vg_inject.so
34145000-34146000 rw-p 00000000 08:02 1065193 /usr/lib/valgrind/vg_inject.so
34147000-34148000 rw-p 00000000 00:00 0
34149000-3414f000 r-xp 00000000 08:02 1065204 /usr/lib/valgrind/vgpreload_addrcheck.so
3414f000-34150000 rw-p 00005000 08:02 1065204 /usr/lib/valgrind/vgpreload_addrcheck.so
34168000-3419e000 r-xp 00000000 08:02 999514 /usr/lib/libstdc++-3-libc6.2-2-2.10.0.so
3419e000-341af000 rw-p 00036000 08:02 999514 /usr/lib/libstdc++-3-libc6.2-2-2.10.0.so
341af000-341b1000 rw-p 00000000 00:00 0
341b2000-341d3000 r-xp 00000000 08:02 164587 /lib/libm-2.3.2.so
341d3000-341d4000 rw-p 00020000 08:02 164587 /lib/libm-2.3.2.so
341d5000-342fd000 r-xp 00000000 08:02 164580 /lib/libc-2.3.2.so
342fd000-34305000 rw-p 00127000 08:02 164580 /lib/libc-2.3.2.so
34305000-34308000 rw-p 00000000 00:00 0
9c5fd000-9c5ff000 rwxp 00000000 00:00 0
9c5ff000-9c600000 r-xp 00001000 00:00 0
9c600000-9c700000 ---p 00000000 00:00 0
9c700000-9c742000 rw-p 00100000 00:00 0
9c742000-affc0000 ---p 00102000 00:00 0
b0000000-b009e000 r-xp 00000000 08:02 1065206 /usr/lib/valgrind/stage2
b009e000-b009f000 rw-p 0009d000 08:02 1065206 /usr/lib/valgrind/stage2
b009f000-b01f9000 rw-p 00000000 00:00 0
b01fa000-b02fa000 rwxp 00000000 00:00 0
b02fb000-b03fb000 rwxp 00000000 00:00 0
b03fc000-b03fd000 rwxp 00000000 00:00 0
b03fe000-b0406000 rwxp 00000000 00:00 0
b0407000-b0417000 rwxp 00000000 00:00 0
b0421000-b0521000 rwxp 00000000 00:00 0
b0522000-b076c000 rwxp 00000000 00:00 0
b076d000-b0bb1000 rwxp 00000000 00:00 0
b0bb2000-b0bc2000 rwxp 00000000 00:00 0
b1000000-b1016000 r-xp 00000000 08:02 164341 /lib/ld-2.3.2.so
b1016000-b1017000 rw-p 00015000 08:02 164341 /lib/ld-2.3.2.so
b1017000-b1018000 rw-p 00000000 00:00 0
b102e000-b1030000 r-xp 00000000 08:02 164586 /lib/libdl-2.3.2.so
b1030000-b1031000 rw-p 00002000 08:02 164586 /lib/libdl-2.3.2.so
b1031000-b1159000 r-xp 00000000 08:02 164580 /lib/libc-2.3.2.so
b1159000-b1161000 rw-p 00127000 08:02 164580 /lib/libc-2.3.2.so
b1161000-b1365000 rw-p 00000000 00:00 0
b1365000-b1373000 r-xp 00000000 08:02 1065203 /usr/lib/valgrind/vgskin_addrcheck.so
b1373000-b1374000 rw-p 0000d000 08:02 1065203 /usr/lib/valgrind/vgskin_addrcheck.so
b1374000-b1476000 rw-p 00000000 00:00 0
bfffd000-c0000000 rwxp ffffe000 00:00 0
kudlinl@placebo:~$ cat /proc/self/maps
08048000-0804c000 r-xp 00000000 08:02 1032390 /bin/cat
0804c000-0804d000 rw-p 00003000 08:02 1032390 /bin/cat
0804d000-0806e000 rwxp 00000000 00:00 0
40000000-40016000 r-xp 00000000 08:02 164341 /lib/ld-2.3.2.so
40016000-40017000 rw-p 00015000 08:02 164341 /lib/ld-2.3.2.so
40017000-40018000 rw-p 00000000 00:00 0
4002e000-40156000 r-xp 00000000 08:02 164580 /lib/libc-2.3.2.so
40156000-4015e000 rw-p 00127000 08:02 164580 /lib/libc-2.3.2.so
4015e000-40161000 rw-p 00000000 00:00 0
bfffe000-c0000000 rwxp fffff000 00:00 0
Thanks
I retried it with a self-compiled valgrind 2.2.0 (don't trust Debian). Memcheck still fails with:
VG_(get_memory_from_mmap): newSuperblock's request for 1048576 bytes failed.
VG_(get_memory_from_mmap): 251090735 bytes already allocated.
Addrcheck fails with:
VG_(get_memory_from_mmap): newSuperblock's request for 1048576 bytes failed.
VG_(get_memory_from_mmap): 28927109 bytes already allocated.
Is valgrind 2.2 known to work with apps using approx. 300-400 MB of RAM?
300-400MB should be fine -- Addrcheck should be ok for up to almost 2GB. Your /proc/<pid>/maps files look totally normal, so I'm perplexed as to what the problem is... how big is ABC.so?
Might this just be the kernel overcommit algorithm deciding not to let you have any more swap? There are a number of large dummy maps in place to control memory allocation, and these will count towards the possible swap usage. The CVS code might work better, as it uses MAP_NORESERVE to avoid reserving swap space for those mappings.
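For readers unfamiliar with the flag mentioned above, here is a minimal sketch of a placeholder mapping made with MAP_NORESERVE on Linux. It is not Valgrind's code; the helper names reserve_range and release_range are made up for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* expose MAP_ANONYMOUS and MAP_NORESERVE in strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve `len` bytes of address space as a
 * placeholder. PROT_NONE makes the range inaccessible, and
 * MAP_NORESERVE asks the kernel not to reserve swap space for it.
 * Returns NULL on failure. */
static void *reserve_range(size_t len)
{
    assert(len > 0);   /* mmap rejects zero-length mappings */
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: give the placeholder range back to the kernel.
 * Returns 0 on success, -1 on error (as munmap does). */
static int release_range(void *p, size_t len)
{
    return munmap(p, len);
}
```

Whether such a placeholder counts against the overcommit limit depends on the kernel's overcommit settings, so this only illustrates the call, not a guaranteed fix.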
It looks like it's past the point where the big anonymous mappings are set up. I wonder if it's simply that ABC.so's debug info is too large to load at once.
At least it's not small for sure. How large is too large?
What does size -A *.so say?
Created attachment 8477 [details] size -A *.so They are different library names; I just renamed them in order not to tell too much about the project.
Yeah, the 230MByte .so's with 165Mbytes of debug info is going to cause problems. We need to change the debug info reader to not try and mmap the whole debug section at once.
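As a rough illustration of not mapping everything at once, the sketch below walks a file through a bounded mmap window. It is a standalone example using plain POSIX calls, not Valgrind's debug info reader; the function name sum_file_windowed is hypothetical, and the byte sum merely stands in for real parsing.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Visit every byte of `path` while mapping at most `window` bytes at a
 * time. `window` must be a non-zero multiple of the page size, because
 * mmap file offsets have to be page-aligned. Stores a simple byte sum
 * in *sum_out. Returns 0 on success, -1 on error. */
static int sum_file_windowed(const char *path, size_t window,
                             unsigned long *sum_out)
{
    assert(sum_out != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || window == 0 || window % (size_t)page != 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    unsigned long sum = 0;
    off_t off = 0;
    while (off < st.st_size) {
        /* The last window may be shorter than `window`. */
        size_t len = window;
        if ((off_t)len > st.st_size - off)
            len = (size_t)(st.st_size - off);

        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)
            sum += p[i];
        munmap(p, len);   /* release this window before mapping the next */
        off += (off_t)len;
    }

    close(fd);
    *sum_out = sum;
    return 0;
}
```

The peak address space used is one window rather than the whole file. A real reader would also have to cope with records that straddle a window boundary, which this sketch ignores.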
I am seeing a similar issue when trying to check a big library with valgrind 2.2.0. Is there any workaround for this?
*** Bug 96228 has been marked as a duplicate of this bug. ***
I'm renaming this from "Addrcheck uses too much memory".
I have a problem right at startup when loading the debug info of a big executable (valgrind 3.0 RC1, on Red Hat AS release 3). What fails is a big mmap needed to map the executable in order to read the debug info. I have made some changes in symtab.c to add some tracing and, in case the mmap fails, to allocate memory using malloc instead, read the file, and free the memory at the end. With this, valgrind can produce stack traces with debug info (e.g. for the stack trace of the memory lost). So it seems a big mmap can fail while a big malloc of the same size succeeds. Find below first the trace output by the modified symtab.c, followed by a context diff of the changes I made to make it work. The code changes are trivial but not very nice and should be rewritten properly (I do not know much about valgrind and I have no clear idea how to e.g. call the read syscall "properly", etc.). But if you are interested in this change and you have no time to make it clean, I can try to make a cleaner fix. Basically, I have added a Bool mmapped, set to True if mmap was ok and set to False if mmap failed but malloc succeeded. Then at the end of the function, mmapped is used to decide whether the memory must be freed or unmapped.

==13545== Memcheck, a memory error detector.
==13545== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al.
==13545== Using LibVEX rev 1301, a library for dynamic binary translation.
==13545== Copyright (C) 2004-2005, and GNU GPL'd, by OpenWorks LLP.
==13545== Using valgrind-3.0.RC1, a dynamic binary instrumentation framework.
==13545== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
size of file is 186590057
mmap failed, error is: Cannot allocate memory
will try malloc + read
size of file is 106912
size of file is 2456905
==13545== For more details, rerun with: -v
==13545==
size of file is 7700
size of file is 45413
size of file is 213484
size of file is 14868
size of file is 2329604
size of file is 326800
size of file is 27904
size of file is 53392
size of file is 907664
size of file is 91040
size of file is 47024
size of file is 97712
size of file is 1571692
size of file is 85936
size of file is 31316
size of file is 80912
....

*** symtab.c Tue Jul 26 10:59:05 2005
--- ../../../../valgrind-3.0.RC1/coregrind/m_debuginfo/symtab.c Fri Aug 26 00:34:15 2005
***************
*** 1266,1275 ****
     Bool ok;
     Addr oimage;
     UInt n_oimage;
     Addr dimage = 0;
     UInt n_dimage = 0;
     struct vki_stat stat_buf;
!
     oimage = (Addr)NULL;
     if (VG_(clo_verbosity) > 1)
        VG_(message)(Vg_DebugMsg, "Reading syms from %s (%p)", si->filename, si->start );
--- 1266,1276 ----
     Bool ok;
     Addr oimage;
     UInt n_oimage;
+    UInt read_n_oimage;
     Addr dimage = 0;
     UInt n_dimage = 0;
     struct vki_stat stat_buf;
!    Bool mmapped;
     oimage = (Addr)NULL;
     if (VG_(clo_verbosity) > 1)
        VG_(message)(Vg_DebugMsg, "Reading syms from %s (%p)", si->filename, si->start );
***************
*** 1285,1290 ****
--- 1286,1293 ----
     }
     n_oimage = stat_buf.st_size;
+    printf ("size of file is %d\n", n_oimage);
+
     fd = VG_(open)(si->filename, VKI_O_RDONLY, 0);
     if (fd.isError) {
        ML_(symerr)("Can't open .so/.exe to read symbols?!");
***************
*** 1294,1306 ****
     oimage = (Addr)VG_(mmap)( NULL, n_oimage,
                               VKI_PROT_READ, VKI_MAP_PRIVATE|VKI_MAP_NOSYMS,
                               0, fd.val, 0 );
     VG_(close)(fd.val);
     if (oimage == ((Addr)(-1))) {
!       VG_(message)(Vg_UserMsg, "warning: mmap failed on %s", si->filename );
!       VG_(message)(Vg_UserMsg, " no symbols or debug info loaded" );
!       return False;
     }
     /* Ok, the object image is safely in oimage[0 .. n_oimage-1].
--- 1297,1327 ----
     oimage = (Addr)VG_(mmap)( NULL, n_oimage,
                               VKI_PROT_READ, VKI_MAP_PRIVATE|VKI_MAP_NOSYMS,
                               0, fd.val, 0 );
+    if (oimage == ((Addr)(-1))) {
+       perror ("mmap failed, error is");
+       printf ("will try malloc + read\n");
+       oimage = malloc (n_oimage);
+       if (oimage == NULL) {
+          perror ("malloc failed");
+          return False;
+       }
+       read_n_oimage = read (fd.val, oimage, n_oimage);
+       if (read_n_oimage < n_oimage) {
+          perror ("read failed");
+          printf ("read size %d\n", read_n_oimage);
+          return False;
+       }
+       mmapped = False;
+    }
+    else
+       mmapped = True;
     VG_(close)(fd.val);
     if (oimage == ((Addr)(-1))) {
!       VG_(message)(Vg_UserMsg, "warning: mmap failed on %s", si->filename );
!       VG_(message)(Vg_UserMsg, " no symbols or debug info loaded" );
!       return False;
     }
     /* Ok, the object image is safely in oimage[0 .. n_oimage-1].
***************
*** 1637,1643 ****
        m_res = VG_(munmap) ( (void*)dimage, n_dimage );
        vg_assert(0 == m_res);
     }
!    m_res = VG_(munmap) ( (void*)oimage, n_oimage );
     vg_assert(0 == m_res);
     return res;
  }
--- 1658,1670 ----
        m_res = VG_(munmap) ( (void*)dimage, n_dimage );
        vg_assert(0 == m_res);
     }
!    if (mmapped)
!       m_res = VG_(munmap) ( (void*)oimage, n_oimage );
!    else
!    {
!       free (oimage);
!       m_res = 0;
!    }
     vg_assert(0 == m_res);
     return res;
  }
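The same map-or-read idea can be sketched outside Valgrind as a small standalone program using plain libc calls. This is only an illustration of the technique in the patch above, not code suitable for Valgrind itself; the names struct image, read_all, image_load and image_unload, and the try_mmap switch, are made up for the example. Unlike the patch, it loops on short reads and closes the descriptor on every path.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct image {
    void  *data;      /* start of the file contents */
    size_t size;      /* number of bytes at data */
    int    mmapped;   /* 1 if data came from mmap, 0 if from malloc */
};

/* Read exactly n bytes, looping over short reads. Returns 0 on success,
 * -1 on error or premature end of file. */
static int read_all(int fd, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Load a whole file into memory. If try_mmap is non-zero, mmap is
 * attempted first; if it is skipped or fails, fall back to malloc plus
 * read. Empty files are rejected. Returns 0 on success, -1 on error. */
static int image_load(const char *path, int try_mmap, struct image *img)
{
    assert(img != NULL);

    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    img->size = (size_t)st.st_size;
    img->data = MAP_FAILED;
    if (try_mmap)
        img->data = mmap(NULL, img->size, PROT_READ, MAP_PRIVATE, fd, 0);
    img->mmapped = (img->data != MAP_FAILED);

    if (!img->mmapped) {
        img->data = malloc(img->size);
        if (img->data == NULL || read_all(fd, img->data, img->size) != 0) {
            free(img->data);   /* free(NULL) is a no-op */
            close(fd);
            return -1;
        }
    }
    close(fd);   /* a successful mapping stays valid after close */
    return 0;
}

/* Release the image with the call that matches how it was obtained. */
static void image_unload(struct image *img)
{
    if (img->mmapped)
        munmap(img->data, img->size);
    else
        free(img->data);
    img->data = NULL;
}
```

Recording how the buffer was obtained in the structure itself plays the role of the mmapped flag in the patch, so the cleanup code cannot pick the wrong release call.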
> size of file is 186590057
> mmap failed, error is: Cannot allocate memory
> [...]
> + if (oimage == ((Addr)(-1))) {
> + perror ("mmap failed, error is");
> + printf ("will try malloc + read\n");
That's not an unreasonable thing to try. But it hides the real problem. Why would mmap of a 186M file fail soon after startup (when not much memory is in use)? Something is screwy in the address space management.
> That's not an unreasonable thing to try. But it hides the
> real problem. Why would mmap of a 186M file fail soon after
> startup (when not much memory is in use) ? Something is screwy
> in the address space management.
It is not clear to me why mmap fails. I have written a very small executable that just opens this file and calls mmap on it, and it works ok. Is there anything I can do to help pinpoint what is screwy in the address space management?
The mmap fails because it is not a normal mmap, it is a VG_(mmap) which will constrain the allocation to the valgrind part of the address space and if that is exhausted it will fail. Using malloc is not a solution because it allows the mapping to be in the client address space which breaks the separation of client data and valgrind data. It also introduces a libc dependency and we're trying to get rid of those.
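To make the "constrained to one part of the address space" point concrete, here is a toy model rather than Valgrind's actual allocator: given the in-use ranges inside a bounded region, it finds the largest hole a single mapping could occupy. The region bounds 0xB0000000 to 0xC0000000 are an assumption read off the /proc/21498/maps output posted earlier in this thread, and the name largest_gap is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct seg { uint32_t start, end; };   /* half-open range [start, end) */

/* Return the size in bytes of the largest unused hole inside [lo, hi),
 * given n in-use segments that lie within the region, are sorted by
 * start address and do not overlap. */
static uint32_t largest_gap(uint32_t lo, uint32_t hi,
                            const struct seg *used, size_t n)
{
    assert(lo <= hi);

    uint32_t best = 0, cursor = lo;   /* cursor: end of the last used range */
    for (size_t i = 0; i < n; i++) {
        if (used[i].start > cursor && used[i].start - cursor > best)
            best = used[i].start - cursor;
        if (used[i].end > cursor)
            cursor = used[i].end;
    }
    if (hi > cursor && hi - cursor > best)
        best = hi - cursor;
    return best;
}
```

Feeding it the three coarse blocks from that listing (b0000000-b0bc2000, b1000000-b1476000 and bfffd000-c0000000) gives 0x0EB87000 bytes, roughly 235 MiB, for an otherwise idle Addrcheck run. Under that assumption, a single mapping of a file in the 200 MB range leaves very little headroom once anything else has been allocated, whereas an ordinary process can place the same mapping anywhere in its address space.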
> Using malloc is not a solution because it allows the mapping to be in the
> client address space which breaks the separation of client data and valgrind
> data. It also introduces a libc dependency and we're trying to get rid of
> those.
I imagine that when valgrind is busy reading symbol tables, the client is "blocked/idle/stopped". So, the malloc could be replaced by:
* use brk and sbrk to extend the memory available at the end of the heap
* read the file into this memory
* process the symbol table
* reduce the memory by calling brk/sbrk again
As I imagine that the client is not doing anything during that time, the client cannot corrupt valgrind memory. Disclaimer: as I do not know the real reason for separating valgrind data from client data and for avoiding dependencies on glibc, and as I know almost nothing about the valgrind implementation, the above is very probably stupid, but I will surely learn something from your reply :).
Well, valgrind doesn't use brk/sbrk at all - in fact it deliberately disables them so that mmap is used for all memory allocation ;-) You're right that it might well be safe to use an mmap that spans the entire address space here, given that we release it again before the client does anything. I'm not sure what happens with multiple threads though - whether, if one thread does a dlopen, we might have one thread still running on the simulated CPU while another one is reading a symbol table. The real solution is the reworked address space manager that Julian is looking at, I suspect.
> The real solution is the reworked address space manager that Julian is > looking at I suspect. Yup. At this very instant in fact.
> > The real solution is the reworked address space manager that Julian is
> > looking at I suspect.
> Yup. At this very instant in fact.
Between a horrible hack proposed by a valgrind ignorant and a reworked address space manager by a valgrind master, the choice is clear :). I just have one more question: I understand the interest of separating valgrind and client data (to avoid cross-corruption of data structures in case of a bug, I would guess). What is the reason to avoid a libc dependency?
> What is the reason to avoid a libc depedency ? The short answer is that Valgrind has to keep a very tight control on things, and the less external code it depends on, the fewer nasty surprises can occur.
Because you've effectively got two programs (valgrind and the client) running as part of a single process you have to be very careful. Originally there was only one libc so valgrind couldn't use it as it might conflict with the client. The current implementation means that two copies of libc are loaded, one for valgrind and one for the client, but as they will both ask the kernel to do things on behalf of the process there is still a need to be careful and it is better if valgrind relies on nothing and does everything itself so we know exactly what is happening and can avoid any sort of conflict with the client program.
> What is the reason to avoid a libc depedency ? The problem is, glibc behaves like it is in control of some of the basic aspects of any process it is part of, such as address space layout, dynamic linking, etc. But in this application (V), V itself has to control all those things in order to make the simulation of the program-to-be-debugged work reliably. So our strategy is to get rid of glibc completely and have our own replacements for the functionality which we need. Then everything can be under control of Valgrind :-)
Valgrind's address space management has been completely overhauled recently. Kudling, Phillipe, Dominik: could you try the current code in the subversion repository? See http://www.valgrind.org/devel/cvs_svn.html for instructions. Thanks.
We have downloaded valgrind from svn and compiled it. With this latest version, valgrind can properly load the debug info of a big executable (55 MB of text). A few side notes:
* I am not sure, but it looks to me as though valgrind now reports some errors that look like false positives when writing to a piece of shared memory.
* I have started to run all our tests, but the run did not reach the end (disk full). I will relaunch it now, but I have the impression that the new valgrind can run more tests (the previous version reported "not enough memory, use another tool" earlier, I think). It does take a very long time to run all our tests, though (it took 24 hours of CPU time on a 3.7 GHz CPU to fill the disk :).
So, in summary:
* the new address space manager solves the debug info problem
* there may be some new (false positive) errors in the area of shared memory
* the new valgrind may be able to run more tests (i.e. it needs less memory).
Whenever I have more definitive answers for the last two "maybe"s, I will get back to you. Thanks for the nice work ...
I'm marking this as closed. Thanks for your help. If the false positive errors continue for the shared memory, please open a new bug.