Bug 112731 - Valgrind 2.2 and 3.0.1 fail to catch a memory leak
Status: RESOLVED WORKSFORME
Product: valgrind
Classification: Developer tools
Component: memcheck (other bugs)
Version First Reported In: 3.0.1
Platform: Fedora RPMs Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
Keywords: investigated, triaged
Reported: 2005-09-16 15:15 UTC by ken stanley
Modified: 2018-11-12 16:01 UTC (History)
Description ken stanley 2005-09-16 15:15:46 UTC
Background:

  Amesos, a Trilinos (http://software.sandia.gov/trilinos/) package, provides an
interface to a number of third-party codes, some of which are written in C.
Amesos can be run in either serial or parallel (using MPI) and is tested nightly
on numerous platforms.
  We use valgrind to make Amesos more robust, and we run it as part of our
nightly testing.

The bug:
  I can add a memory leak to an Amesos package which is not caught by valgrind
2.2 or 3.0.1 on my linux laptop running Fedora Core.  uname -a returns: "Linux
dhcppc7 2.6.11.7-lc1-smp #1 SMP Fri Apr 15 12:02:37 PDT 2005 i686 i686 i386
GNU/Linux"

  Valgrind catches this leak if Amesos is built in serial mode.

  Valgrind does not catch this leak if Amesos is built in parallel mode (i.e.
with MPI linked in), even if no calls to MPI are made in the code and the code
is run on only one process.

  Amesos provides both a direct interface
(http://software.sandia.gov/trilinos/packages/docs/r6.0/packages/amesos/doc/html/classAmesos__Klu.html)
and a factory interface 
(http://software.sandia.gov/trilinos/packages/docs/r6.0/packages/amesos/doc/html/classAmesos.html)
which adds another layer over the direct interface.

  Valgrind 2.2 and 3.0.1 catch the memory leak if called through the direct
interface.

  Valgrind does not catch the memory leak if called through the factory interface.

  I have been unable to reduce this problem to a self-contained test case.  I
have, however, learned that commenting out a single line of executable code deep
within the third-party code (written in C) allows valgrind 2.2 on my laptop to
find this memory leak.  Commenting out that line of code causes the third-party
code to return the wrong result.  However, other changes which also cause the
third-party code to fail do not allow valgrind 2.2 to find the leak
(hence it appears that it is that specific code change that makes the
difference, not changes in conditional code execution down the line).



  Valgrind 2.0 running on an older linux catches this memory leak in all
situations that I have tried.  uname -a returns:  "2.0 Linux h... 2.4.20-8smp #1
SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux" on this older linux box.

  I don't have root access to this older linux box, so I installed valgrind
3.0.1, compiled from source, in my own directory.  Valgrind 3.0.1 run from
this directory on this older linux box also catches this memory leak.


  Valgrind 2.2 running on a third linux box does not catch this memory leak.
uname -a returns "Linux b... #1 Fri Aug 12 15:45:00 CDT 2005 i686 AMD Athlon(tm)
 AuthenticAMD GNU/Linux" on this third box.   To the extent that I have tested
it, valgrind 3.0.1 acts the same as valgrind 2.2 with respect to this bug.


  I have been unable to run valgrind 2.0 on either of the newer linux machines
on which valgrind does not catch this memory leak.  valgrind 2.0 does not
compile on my laptop.  I get this error (and other similar ones) using gcc 3.4.3
on my laptop:

mc_main.c:1210: error: conflicting types for 'vgMemCheck_fpu_write_check'
mc_include.h:132: error: previous declaration of 'vgMemCheck_fpu_write_check'
was here



I have been unable to run valgrind 2.0 on the third linux box (on which valgrind
2.2 also fails to catch this leak).  On that linux box, on which I do not have
root access and hence am running it out of my own directory, I get the following
error when I run valgrind 2.0:
  compare_solvers.exe: symbol lookup error:
/home/kstanley/bin2.0//lib/valgrind/libpthread.so.0: undefined symbol:
__libc_connect
Exit 127



I realize that there is quite a lot of information to absorb in this bug report.
I hope that there is enough information to allow you to make suggestions on how
to attack this further.


I can give you instructions on how to recreate this bug, though they would
involve downloading all of Trilinos 6.0, changing a couple of lines, then
running configure and make and running the test.  This requires LAM/MPI, which
is standard on many, but not all, linux distributions.

Please let me know how you would like to proceed on this, if you are interested. 

Thanks,
  Ken
Comment 1 Nicholas Nethercote 2009-07-01 09:17:34 UTC
I'm closing crashing and similar bugs that are more than two years old.  If
you still see this problem with Valgrind 3.4.1 please reopen the bug report.
Thanks.

(Although in this case, it's almost impossible to reproduce and therefore impossible to fix...)
Comment 2 Andrew Crouthamel 2018-09-19 04:31:32 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information.

For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please mark the bug as REPORTED so that the KDE team knows that the bug is ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 3 Bug Janitor Service 2018-11-12 16:01:56 UTC
This bug has been in NEEDSINFO status with no change for at least
30 days. The bug is now closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

Thank you for helping us make KDE software even better for everyone!