Bug 275852 - valgrind uses all swap space and is killed with (SIGKILL)
Summary: valgrind uses all swap space and is killed with (SIGKILL)
Status: RESOLVED DUPLICATE of bug 250101
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: 3.6 SVN
Platform: Compiled Sources
OS: Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-16 22:49 UTC by drforbin6
Modified: 2011-06-19 11:16 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
message.log of oom-killer invocation (9.14 KB, application/octet-stream)
2011-06-17 21:47 UTC, drforbin6
Details
profile-heap=yes run (2.38 KB, text/plain)
2011-06-18 13:31 UTC, drforbin6
Details
--trace-malloc=yes --profile-heap=yes (51.07 KB, text/plain)
2011-06-18 21:31 UTC, drforbin6
Details
test program with malloc pattern similar to sagan (54.68 KB, text/plain)
2011-06-19 09:20 UTC, Philippe Waroquiers
Details
reworked patch (32.42 KB, text/plain)
2011-06-19 10:50 UTC, drforbin6
Details

Description drforbin6 2011-06-16 22:49:04 UTC
Version:           3.6 SVN
OS:                Linux

Running valgrind on an application, it seems valgrind uses all swap space (and memory) and then gets killed by the kernel.

Reproducible: Sometimes

Steps to Reproduce:
Only happens with certain applications, e.g. when debugging sagan.
Comment 1 Tom Hughes 2011-06-16 23:38:18 UTC
When you run your application under valgrind, it will use considerably more memory (probably at least double) than it does normally.

Do you have any reason to think there is something going on here other than the normal increased memory usage that you should expect to see when running under valgrind?

How much memory do you have? How much swap? How much memory does your application use normally when not running under valgrind?
Comment 2 drforbin6 2011-06-17 02:17:00 UTC
I have 2 GB of memory and 2 GB of swap space.
The app has a small working set size, and a friend of mine is running
valgrind fine in a VM with 256 MB, where it runs to completion on the same app.

Thanks,

Merlyn.
Comment 3 Tom Hughes 2011-06-17 08:25:00 UTC
Well, unless you can provide a test case, or at the very least tell us where we can get this program you are running valgrind on and what command line you are using, there really isn't going to be very much we can do about a bug this vague.
Comment 4 drforbin6 2011-06-17 08:31:27 UTC
Here is the website (I'm helping with its development):
http://sagan.softwink.com/download/

The command line is: valgrind ./sagan

Do you want an strace or stdout dump?

Also, I tried compiling valgrind from a source release as well as from SVN, and I still
get the (SIGKILL) on memcheck and helgrind.
Comment 5 Philippe Waroquiers 2011-06-17 09:25:43 UTC
(In reply to comment #4)

> do you want a strace or stdout dump?
> 
> also..I tried compiling valgrind from source as well as svn and I still 
> get the (KILL) on memcheck and helgrind.

Depending on the allocation pattern of the application, you might
have encountered bug 250101.

You could try to run with --profile-heap=yes to get some more info
about who is using what memory.

If you detect high fragmentation, then it is worth trying the patch
in bug 250101.
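
For illustration, here is a minimal sketch (hypothetical, not the reporter's
code) of the kind of growing-allocation pattern that tends to trigger bug
250101: each iteration frees a block and immediately allocates a slightly
bigger one, so every freed hole is too small to reuse and the client arena
keeps growing. Running it under valgrind --profile-heap=yes on an unpatched
valgrind should show the mmap'd figure far exceeding the in-use figure.

/* Hypothetical demo of a fragmentation-prone alloc pattern;
 * not the reporter's program.
 * Run with: valgrind --profile-heap=yes ./frag */
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t size = 1024 * 1024;              /* start with 1 MB */
    char *buf = malloc(size);
    if (!buf)
        return 1;

    for (int i = 0; i < 200; i++) {
        size_t bigger_size = size + 64 * 1024;   /* grow by 64 KB */
        char *bigger = malloc(bigger_size);
        if (!bigger)
            return 1;
        memcpy(bigger, buf, size);
        free(buf);          /* the freed hole is smaller than the next
                               request, so it cannot be reused */
        buf = bigger;
        size = bigger_size;
    }
    free(buf);
    return 0;
}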
Comment 6 drforbin6 2011-06-17 21:45:29 UTC
Hope this helps.
Comment 7 drforbin6 2011-06-17 21:47:55 UTC
Created attachment 61098 [details]
message.log of oom-killer invocation
Comment 8 Philippe Waroquiers 2011-06-17 21:56:15 UTC
(In reply to comment #6)
> hope this helps
Not much :). The OOM log tells us the system is out of memory and is killing your
valgrind process, which is memory-hungry.

If you could run valgrind with --profile-heap=yes,
that might shed some more light on where valgrind uses all this memory.

Philippe
Comment 9 drforbin6 2011-06-18 13:31:19 UTC
Created attachment 61113 [details]
profile-heap=yes run

Here you go.

Hope this helps, and thanks again.
Comment 10 Julian Seward 2011-06-18 13:42:44 UTC
(In reply to comment #9)
> Created an attachment (id=61113) [details]
> profile-heap=yes run

This looks pretty bogus to me.  Firstly, the heap profile shows
nothing like N gigabytes of allocation.  Secondly, Memcheck is
showing errors in your code.  What happens if you fix those first?
Comment 11 Philippe Waroquiers 2011-06-18 14:06:14 UTC
(In reply to comment #9)
> Created an attachment (id=61113) [details]
> profile-heap=yes run
> 
> Here you go....
> 
> hope this helps and thanxs again.

The following line indicates that you are very probably triggering
bug 250101 (see comment 5 above).
In the client arena (i.e. the place where the memory for your program
is allocated by valgrind), you have used a maximum of about 34 MB.
But to give these 34 MB to your application, Valgrind
has asked the kernel for about 1254 MB.

-------- Arena "client": 1254490112 mmap'd, 33578880/33578880 max/curr --------
   33,578,880 in        66: replacemalloc.cm.1
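
(For scale: 1,254,490,112 bytes mmap'd is about 1196 MiB, while 33,578,880
bytes is about 32 MiB in use, i.e. roughly a 37x gap between what Valgrind
reserved from the kernel and what the program actually needs.)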


I suggest two things:
  1. You could try the patch fixing bug 250101,
and/or
  2. run with --trace-malloc=yes --profile-heap=yes
     for further information about what allocation pattern your program is doing.
Comment 12 Philippe Waroquiers 2011-06-18 14:18:30 UTC
(In reply to comment #11)
> The following line indicates that you are very probably triggering
> the bug 250101 (see comment 5 above).
Note that, as indicated by Julian, you should fix the errors reported by
memcheck.
E.g. if the strlcat or strlcpy bugs in your code are causing a "huge"
concatenation of strings, then this might be the cause of the "realloc" pattern
which triggers the quadratic Valgrind memory behaviour.

Or maybe the bug in strlcpy/strlcat simply causes gigabytes to be allocated.

Philippe
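
As a hypothetical sketch of that second possibility (assumed, not taken from
sagan's actual code): if a bounded copy fails to NUL-terminate, a later
strlen() can walk past the buffer and return a garbage length, which is then
used to size an enormous allocation. Something like:

/* Hypothetical sketch only; invokes undefined behaviour on purpose
 * to show how a missing NUL can snowball into huge allocations. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char fixed[16];
    const char *msg = "a message longer than the buffer";

    /* Bug: strncpy does not write a trailing NUL when src is
       longer than the buffer (a correct strlcpy call would). */
    strncpy(fixed, msg, sizeof fixed);

    /* Undefined behaviour: strlen may run past 'fixed' and return
       a huge length, which then sizes the next allocation. */
    size_t need = strlen(fixed) + 1;
    char *copy = malloc(need);
    if (copy) {
        memcpy(copy, fixed, need > sizeof fixed ? sizeof fixed : need);
        printf("allocated %zu bytes\n", need);
    }
    free(copy);
    return 0;
}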
Comment 13 drforbin6 2011-06-18 20:49:50 UTC
First, thanks for the help, all of you.

Philippe,

I tried the patch as soon as you suggested it; no go.
I will try with the other args and send the results.
As for fixing the bugs in my code, I'll try that as well.
Comment 14 drforbin6 2011-06-18 21:31:44 UTC
Created attachment 61126 [details]
--trace-malloc=yes --profile-heap=yes

New run with suggested args
Comment 15 Philippe Waroquiers 2011-06-19 09:15:35 UTC
(In reply to comment #14)
> Created an attachment (id=61126) [details]
> --trace-malloc=yes --profile-heap=yes
> 
> New run with suggested args

The trace-malloc output confirms that you are encountering the
"quadratic memory" bug 250101, and that the patch is supposed to fix it.

I have transformed the trace into a small program that just does the
same malloc/free pattern as the trace you attached.

I then ran this program with an unpatched valgrind
=> same behaviour as you encounter (i.e. a huge mmap'd client arena).

I then ran this program with the patched valgrind
=> expected behaviour after the patch (i.e. a reasonable mmap'd client arena).

I will attach 3 files:
* the small test program
* the output of valgrind --profile-heap=yes without the patch
* the output of valgrind --profile-heap=yes with the patch.

Can you retest your patched and unpatched valgrind with the small program
and check whether you obtain output similar to what is attached?

If the small program is ok (i.e. has a small client arena mmap)
with the patched valgrind, but sagan is still not ok (big arena),
can you run sagan with the patched valgrind and the following args:
   --profile-heap=yes --trace-malloc=yes -d -d -d -v -v

Thanks
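
(For reference, a minimal sketch of how such a replay program can be built,
assuming the usual "--pid-- malloc(n) = 0xADDR" / "--pid-- free(0xADDR)"
shape of --trace-malloc output; the attached program was produced by hand
from the actual trace and may differ.)

/* Sketch: replay a --trace-malloc log read from stdin, mapping
 * traced addresses to live pointers so frees match allocations.
 * Assumes the "--pid-- malloc(n) = 0xADDR" line format. */
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 4096

static unsigned long traced[MAX_LIVE];  /* addresses from the log */
static void *live[MAX_LIVE];            /* our matching pointers  */

int main(void)
{
    char line[512];
    while (fgets(line, sizeof line, stdin)) {
        unsigned long n, addr;
        if (sscanf(line, "--%*d-- malloc(%lu) = 0x%lX", &n, &addr) == 2) {
            for (int i = 0; i < MAX_LIVE; i++)
                if (!live[i]) {
                    traced[i] = addr;
                    live[i] = malloc(n);
                    break;
                }
        } else if (sscanf(line, "--%*d-- free(0x%lX)", &addr) == 1) {
            for (int i = 0; i < MAX_LIVE; i++)
                if (live[i] && traced[i] == addr) {
                    free(live[i]);
                    live[i] = NULL;
                    break;
                }
        }
    }
    return 0;
}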
Comment 16 Philippe Waroquiers 2011-06-19 09:20:58 UTC
Created attachment 61134 [details]
test program with malloc pattern similar to sagan

*************** output without patch (followed by the output with patch)

==17119== Memcheck, a memory error detector
==17119== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==17119== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==17119== Command: ./dforbin_fragment
==17119== 
-------- Arena "client": 50331648 mmap'd, 25094552/25094552 max/curr --------
   25,094,552 in        67: replacemalloc.cm.1

-------- Arena "client": 50331648 mmap'd, 27675032/27675032 max/curr --------
   27,675,032 in        63: replacemalloc.cm.1

-------- Arena "client": 390643712 mmap'd, 30502808/30502808 max/curr --------
   30,502,808 in        62: replacemalloc.cm.1

-------- Arena "client": 1254490112 mmap'd, 33578440/33578440 max/curr --------
   33,578,440 in        66: replacemalloc.cm.1

the end
-1921503232 int arena;    /* non-mmapped space allocated from system */
       410 int ordblks;  /* number of free chunks */
         0 int smblks;   /* number of fastbin blocks */
         0 int hblks;    /* number of mmapped regions */
         0 int hblkhd;   /* space in mmapped regions */
         0 int usmblks;  /* maximum total allocated space */
         0 int fsmblks;  /* space available in freed fastbin blocks */
   8949448 int uordblks; /* total allocated space */
-1930486688 int fordblks; /* total free space */
         0 int keepcost; /* top-most, releasable (via malloc_trim) space */

==17119== 
==17119== HEAP SUMMARY:
==17119==     in use at exit: 8,949,180 bytes in 87 blocks
==17119==   total heap usage: 972 allocs, 885 frees, 3,143,932,480 bytes allocated
==17119== 
==17119== LEAK SUMMARY:
==17119==    definitely lost: 8,293,820 bytes in 86 blocks
==17119==    indirectly lost: 0 bytes in 0 blocks
==17119==      possibly lost: 655,360 bytes in 1 blocks
==17119==    still reachable: 0 bytes in 0 blocks
==17119==         suppressed: 0 bytes in 0 blocks
==17119== Rerun with --leak-check=full to see details of leaked memory
==17119== 
==17119== For counts of detected and suppressed errors, rerun with: -v
==17119== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 12 from 8)
-------- Arena "core": 1048576 mmap'd, 117448/117448 max/curr --------
           16 in         1: stacks.rs.1
           64 in         1: main.mpclo.3
        1,992 in        81: errormgr.losf.4
        4,768 in       149: errormgr.losf.1
        5,392 in       148: errormgr.losf.2
        6,712 in       249: errormgr.losf.3
       32,968 in         6: gdbsrv
       65,536 in         1: di.syswrap-x86.azxG.1

-------- Arena "tool": 4194304 mmap'd, 23224/21928 max/curr --------
           40 in         1: commandline.sua.3
           48 in         2: commandline.sua.2
          136 in         4: hashtable.Hc.1
          168 in         1: initimg-linux.sce.5
          288 in         1: mc.cSVT.1 (sec VBit table)
          288 in         1: mc.iaLL.1
          352 in         1: mc.pr.2
          512 in        24: mc.resi.1
        1,216 in         8: errormgr.mre.2
        1,424 in        89: mc.cMC.1 (a MC_Chunk)
        2,048 in         1: main.gss.1
        3,088 in        88: mc.pr.1
       12,320 in         4: hashtable.Hc.2

-------- Arena "dinfo": 13549568 mmap'd, 11845192/11150416 max/curr --------
           16 in         1: redir.ahs.2
           32 in         2: di.redi.1
           64 in         2: redir.ahs.1
           96 in         6: redir.rnnD.4
          352 in         6: di.debuginfo.aDI.2
          424 in         2: di.ccCt.1
        1,528 in       113: redir.rnnD.3
        1,816 in       113: redir.rnnD.2
        1,968 in        53: redir.ri.1
        2,304 in         6: di.debuginfo.aDI.1
        3,616 in       113: redir.rnnD.1
      194,184 in         2: di.ccCt.2
      520,000 in         6: di.storage.addSym.1
    1,573,056 in        24: di.storage.addStr.1
    2,402,960 in         6: di.storage.addDiCfSI.1
    6,448,000 in         6: di.storage.addLoc.1

-------- Arena "client": -1921503232 mmap'd, 34116040/25346248 max/curr --------
   25,346,248 in        89: replacemalloc.cm.1

-------- Arena "demangle": 0 mmap'd, 0/0 max/curr --------

-------- Arena "exectxt": 1048576 mmap'd, 42624/42624 max/curr --------
        6,176 in         1: execontext.reh1
       36,448 in     1,234: execontext.rEw2.2

-------- Arena "errors": 65536 mmap'd, 320/320 max/curr --------
          320 in         8: errormgr.mre.1

-------- Arena "ttaux": 65536 mmap'd, 10968/10904 max/curr --------
       10,904 in        95: transtab.aECN.1




********* output with patch
==17115== Memcheck, a memory error detector
==17115== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==17115== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==17115== Command: ./dforbin_fragment
==17115== 
-------- Arena "client": 50331648/50331648 max/curr mmap'd, 25094552/25094552 max/curr on_loan --------
   25,094,552 in        67: replacemalloc.cm.1

-------- Arena "client": 50331648/50331648 max/curr mmap'd, 27675032/27675032 max/curr on_loan --------
   27,675,032 in        63: replacemalloc.cm.1

-------- Arena "client": 80052224/80052224 max/curr mmap'd, 30450664/30450664 max/curr on_loan --------
   30,450,664 in        62: replacemalloc.cm.1

-------- Arena "client": 83136512/83136512 max/curr mmap'd, 33535576/33535576 max/curr on_loan --------
   33,535,576 in        66: replacemalloc.cm.1

the end
  74948608 int arena;    /* non-mmapped space allocated from system */
        32 int ordblks;  /* number of free chunks */
         0 int smblks;   /* number of fastbin blocks */
         0 int hblks;    /* number of mmapped regions */
         0 int hblkhd;   /* space in mmapped regions */
         0 int usmblks;  /* maximum total allocated space */
         0 int fsmblks;  /* space available in freed fastbin blocks */
   8954840 int uordblks; /* total allocated space */
  65986864 int fordblks; /* total free space */
         0 int keepcost; /* top-most, releasable (via malloc_trim) space */

==17115== 
==17115== HEAP SUMMARY:
==17115==     in use at exit: 8,949,180 bytes in 87 blocks
==17115==   total heap usage: 972 allocs, 885 frees, 3,143,932,480 bytes allocated
==17115== 
==17115== LEAK SUMMARY:
==17115==    definitely lost: 79,292 bytes in 85 blocks
==17115==    indirectly lost: 0 bytes in 0 blocks
==17115==      possibly lost: 8,869,888 bytes in 2 blocks
==17115==    still reachable: 0 bytes in 0 blocks
==17115==         suppressed: 0 bytes in 0 blocks
==17115== Rerun with --leak-check=full to see details of leaked memory
==17115== 
==17115== For counts of detected and suppressed errors, rerun with: -v
==17115== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 12 from 8)
-------- Arena "core": 1048576/1048576 max/curr mmap'd, 117464/117464 max/curr on_loan --------
           16 in         1: stacks.rs.1
           80 in         1: main.mpclo.3
        1,992 in        81: errormgr.losf.4
        4,768 in       149: errormgr.losf.1
        5,392 in       148: errormgr.losf.2
        6,712 in       249: errormgr.losf.3
       32,968 in         6: gdbsrv
       65,536 in         1: di.syswrap-x86.azxG.1

-------- Arena "tool": 4194304/4194304 max/curr mmap'd, 23264/20424 max/curr on_loan --------
           40 in         1: commandline.sua.3
           48 in         2: commandline.sua.2
          136 in         4: hashtable.Hc.1
          208 in         1: initimg-linux.sce.5
          288 in         1: mc.cSVT.1 (sec VBit table)
          288 in         1: mc.iaLL.1
          352 in         1: mc.pr.2
          512 in        24: mc.resi.1
          512 in         1: main.gss.1
        1,216 in         8: errormgr.mre.2
        1,424 in        89: mc.cMC.1 (a MC_Chunk)
        3,080 in        88: mc.pr.1
       12,320 in         4: hashtable.Hc.2

-------- Arena "dinfo": 13549568/13549568 max/curr mmap'd, 11845264/11150464 max/curr on_loan --------
           16 in         1: redir.ahs.2
           32 in         2: di.redi.1
           64 in         2: redir.ahs.1
          104 in         6: redir.rnnD.4
          400 in         6: di.debuginfo.aDI.2
          424 in         2: di.ccCt.1
        1,512 in       113: redir.rnnD.3
        1,840 in       113: redir.rnnD.2
        1,952 in        53: redir.ri.1
        2,304 in         6: di.debuginfo.aDI.1
        3,616 in       113: redir.rnnD.1
      194,184 in         2: di.ccCt.2
      520,000 in         6: di.storage.addSym.1
    1,573,056 in        24: di.storage.addStr.1
    2,402,960 in         6: di.storage.addDiCfSI.1
    6,448,000 in         6: di.storage.addLoc.1

-------- Arena "client": 83726336/74948608 max/curr mmap'd, 34125400/25351640 max/curr on_loan --------
   25,351,640 in        89: replacemalloc.cm.1

-------- Arena "demangle": 0/0 max/curr mmap'd, 0/0 max/curr on_loan --------

-------- Arena "exectxt": 1048576/1048576 max/curr mmap'd, 42624/42624 max/curr on_loan --------
        6,176 in         1: execontext.reh1
       36,448 in     1,234: execontext.rEw2.2

-------- Arena "errors": 65536/65536 max/curr mmap'd, 320/320 max/curr on_loan --------
          320 in         8: errormgr.mre.1

-------- Arena "ttaux": 65536/65536 max/curr mmap'd, 10968/10904 max/curr on_loan --------
       10,904 in        95: transtab.aECN.1
Comment 17 drforbin6 2011-06-19 10:49:11 UTC
Ok Philippe,

I got it working, it seems you were correct all along...
it was quadratic memory issue. When I installed patch last time I 
deleted big-alloc.post.exp-32bit
thinking it was for 32bit platforms. well I had to rework the patch to 
get patch to appy hunk because big-alloc.post.exp-32bit does not exist 
in current source.

find reworked patch









Comment 18 drforbin6 2011-06-19 10:50:02 UTC
Created attachment 61136 [details]
reworked patch
Comment 19 drforbin6 2011-06-19 11:16:32 UTC

*** This bug has been marked as a duplicate of bug 250101 ***