Bug 392855

Summary:	valgrind reports 1 additional allocation and 1024 additional bytes allocated (again...)
Product:	[Developer tools] valgrind	Reporter:	David Rankin <drankinatty>
Component:	general	Assignee:	Julian Seward <jseward>
Status:	RESOLVED NOT A BUG
Severity:	normal	CC:	821831728, philippe.waroquiers, tom
Priority:	NOR
Version First Reported In:	3.13.0
Target Milestone:	---
Platform:	Arch Linux
OS:	Linux
Latest Commit:		Version Fixed/Implemented In:
Sentry Crash Report:
Attachments:	xtmemory.kcg.16050

Description David Rankin 2018-04-08 04:13:02 UTC

Devs,

  This is an issue that appears to be a regression of a bug reported to Arch in June 2016 and has stayed fixed, but now it back again. The arch bug is:

https://bugs.archlinux.org/task/49681

Basically valgrind started reporting an additional allocation and an additional 1024 bytes of memory allocated (for just about every allocation). Within the last couple of days (I suspect related to changes in gcc-glibc) it now reports the correct number of allocations, but still reports at least 1024 additional bytes, e.g.

    #include <stdio.h>
    #include <stdlib.h>
    
    int main (void) {
    
        int *a = malloc (sizeof *a);
        *a = 5;
        printf ("a: %d\n", *a);
        free (a);
    }

valgrind reports:

valgrind ./bin/vgtest2
==15274== Memcheck, a memory error detector
==15274== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==15274== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==15274== Command: ./bin/vgtest2
==15274==
a: 5
==15274==
==15274== HEAP SUMMARY:
==15274== in use at exit: 0 bytes in 0 blocks
==15274== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==15274==
==15274== All heap blocks were freed -- no leaks are possible
==15274==
==15274== For counts of detected and suppressed errors, rerun with: -v
==15274== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

So instead of correctly reporting 4-bytes, valgrind now reports 1024 bytes. There appears to be a regression not properly handled by the exclusions.

Comment 1 Tom Hughes 2018-04-08 07:47:15 UTC

We just report what happens so I doubt this is a valgrind bug, especially if you say it changed without the version of valgrind changing?

Equally it can't really be a glibc bug, unless the malloc has become a macro that wraps the real function call, as we intercept the function call.

Most likely the compiler has decided to round up the allocation or something, but I think the compiled is the most likely place to look. Really given how small the program is I would just look at the generated assembly to see what is going on.

Comment 2 David Rankin 2018-04-30 19:55:34 UTC

There is definitely a problem with either valgrind or libc. For Example,

OpenSuSE Leap 42.3 correctly reports the number of allocations and the amount of memory allocated.

$ valgrind ./bin/vgtest2
==4142== Memcheck, a memory error detector
==4142== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==4142== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==4142== Command: ./bin/vgtest2
==4142==
a: 5
==4142==
==4142== HEAP SUMMARY:
==4142==     in use at exit: 0 bytes in 0 blocks
==4142==   total heap usage: 1 allocs, 1 frees, 4 bytes allocated
==4142==
==4142== All heap blocks were freed -- no leaks are possible
==4142==
==4142== For counts of detected and suppressed errors, rerun with: -v
==4142== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

That is with:
gcc --version
gcc (SUSE Linux) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


The assembly generated is:

$ gcc -S -Ofast -o vgtest2.asm vgtest2.c

$ cat vgtest2.asm
        .file   "vgtest2.c"
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "a: %d\n"
        .section        .text.startup,"ax",@progbits
        .p2align 4,,15
        .globl  main
        .type   main, @function
main:
.LFB21:
        .cfi_startproc
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movl    $4, %edi
        call    malloc
        movl    $5, %esi
        movq    %rax, %rbx
        movl    $5, (%rax)
        movl    $.LC0, %edi
        xorl    %eax, %eax
        call    printf
        movq    %rbx, %rdi
        call    free
        popq    %rbx
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE21:
        .size   main, .-main
        .ident  "GCC: (SUSE Linux) 4.8.5"
        .section        .note.GNU-stack,"",@progbits

Now on Archlinux, valgrind incorrectly reports the amount of memory allocated:

$ gcc --version
gcc (GCC) 7.3.1 20180406
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ valgrind ./bin/vgtest2
==8639== Memcheck, a memory error detector
==8639== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==8639== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==8639== Command: ./bin/vgtest2
==8639==
a: 5
==8639==
==8639== HEAP SUMMARY:
==8639==     in use at exit: 0 bytes in 0 blocks
==8639==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==8639==
==8639== All heap blocks were freed -- no leaks are possible
==8639==
==8639== For counts of detected and suppressed errors, rerun with: -v
==8639== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

The generated assembly is different:
gcc -S -Ofast -o vgtest2.asm vgtest2.c

$ cat vgtest2.asm
        .file   "vgtest2.c"
        .text
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "a: %d\n"
        .section        .text.startup,"ax",@progbits
        .p2align 4,,15
        .globl  main
        .type   main, @function
main:
.LFB24:
        .cfi_startproc
        leaq    .LC0(%rip), %rdi
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    $5, %esi
        xorl    %eax, %eax
        call    printf@PLT
        xorl    %eax, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE24:
        .size   main, .-main
        .ident  "GCC: (GNU) 7.3.1 20180406"
        .section        .note.GNU-stack,"",@progbits


Now something is broken in the valgrind reporting of memory allocated. It can't work correctly -- and then not work correctly -- and ho, hum, something must have change.... Yes, there is a bug somewhere.

If this isn't a valgrind bug, then I would appreciate suggestions on who the proper owner for the bug is.

Comment 3 David Rankin 2018-04-30 20:08:48 UTC

Tom,

  I do think you were on to the cause, but this appears to be an issue that will need to be fixed between valgrind and libc. The assembly generated on openSuSE Leap 42.3 with gcc 4.8.5 shows a malloc, so I suspect this is where valgrind tabulates the amount of memory allocated.

  However on the new gcc 7.3, there is no 'call malloc' in the generated assembly which leaves valgrind to report some default allocation block size.

  Wherever this bug belongs, it need to be fixed. It makes no sense from a teaching standpoint to have to say: 

"Look at the valgrind output, but only look at the number of allocations and frees to validate you have freed the memory you allocated. You can't really look at the amount allocated because valgrind doesn't correctly report the amount of memory allocated anymore..."

  This tool needs to work correctly from that standpoint as well.

Comment 4 Tom Hughes 2018-04-30 20:29:01 UTC

All that's happened in the second case is that gcc has optimised out the whole thing as the nonsense that it is - it just puts 5 straight in the argument register and passes it to printf. That's what happens when you compile with optimisation.

So in the first case what you are seeing is your malloc, and in the second your malloc never happens but something else (likely the dynamic linker or C library run time) does do a 1024 byte malloc.

Comment 5 David Rankin 2018-05-02 02:55:10 UTC

But that is what I'm saying, with gcc 7.3 it is still broken with -O0, e.g.

$ gcc -S -O0 -o vgtest_noopt.asm vgtest2.c

$ cat vgtest_noopt.asm
        .file   "vgtest2.c"
        .text
        .section        .rodata
.LC0:
        .string "a: %d\n"
        .text
        .globl  main
        .type   main, @function
main:
.LFB5:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $16, %rsp
        movl    $4, %edi
        call    malloc@PLT
        movq    %rax, -8(%rbp)
        movq    -8(%rbp), %rax
        movl    $5, (%rax)
        movq    -8(%rbp), %rax
        movl    (%rax), %eax
        movl    %eax, %esi
        leaq    .LC0(%rip), %rdi
        movl    $0, %eax
        call    printf@PLT
        movq    -8(%rbp), %rax
        movq    %rax, %rdi
        call    free@PLT
        movl    $0, %eax
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE5:
        .size   main, .-main
        .ident  "GCC: (GNU) 7.3.1 20180406"
        .section        .note.GNU-stack,"",@progbits

valgrind still reports the wrong memory allocation. This time 1028-bytes (showing the 4-bytes allocated for the int, but a magic 1024 more)

$ valgrind ./bin/vgtest_noopt
==17950== Memcheck, a memory error detector
==17950== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==17950== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==17950== Command: ./bin/vgtest_noopt
==17950==
a: 5
==17950==
==17950== HEAP SUMMARY:
==17950==     in use at exit: 0 bytes in 0 blocks
==17950==   total heap usage: 2 allocs, 2 frees, 1,028 bytes allocated
==17950==
==17950== All heap blocks were freed -- no leaks are possible
==17950==
==17950== For counts of detected and suppressed errors, rerun with: -v
==17950== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Thank you for looking at this.

Comment 6 Philippe Waroquiers 2018-05-06 21:40:53 UTC

Run with
--xtree-memory=full

This will produce a file xtmemory.kcg.<PID>
containing a.o. the information about all stack traces that
have lead to an allocation.

You should then be able to explain the report:
2 allocs, 2 frees, 1,028 bytes allocated

As said by Tom, the second blocks is very likely allocated by some runtime.
And then valgrind properly shows that.

Philippe

Comment 7 David Rankin 2018-09-03 06:19:28 UTC

Created attachment 114754 [details]
xtmemory.kcg.16050

Attached is the xtmemory.kcg.16050. The same problems continue in gcc (GCC) 8.2.0 with Valgrind-3.14.0.GIT and LibVEX. The problem appears to be the allocations for the printf family. This is running on Linux 4.18.5 (Archlinux) on AMD x86_64.

Comment 8 Philippe Waroquiers 2018-09-09 12:55:32 UTC

(In reply to David Rankin from comment #7)
> Created attachment 114754 [details]
> xtmemory.kcg.16050
> 
> Attached is the xtmemory.kcg.16050. The same problems continue in gcc (GCC)
> 8.2.0 with Valgrind-3.14.0.GIT and LibVEX. The problem appears to be the
> allocations for the printf family. This is running on Linux 4.18.5
> (Archlinux) on AMD x86_64.

Re-reading this bug, I am not sure to understand the problem you have.
If your program is using a library function, and this library function
allocates some memory (and this memory is freed at exit), then all
looks normal to me, and that is the expected behaviour of Valgrind/memcheck.

As I understand, the initial program compiled with optimisation did
only one single malloc (and free), due to printf.

When compiled without optimisation, it does 2 malloc/free:
  * one malloc/free (needed for internal printf work)
  * the malloc/free in your program.

So, in summary, all looks ok to me.
If glibc or whatever would not release the memory (which it does),
then that might give a memory leak, but again, this memory is all freed
(via glibc cleanup routine called by Valgrind).

So, I do not know what is supposed to be fixed.

Comment 9 David Rankin 2018-10-29 07:47:35 UTC

Phillip, after further discussion on StackOverflow about this bug and the problems with valgrind improperly reporting more memory allocated than the user allocated in his code, this is the best rundown I can give you:

For all time, on Linux/gcc, valgrind accurately reported the number of bytes allocated by the user and provided proper exclusion files to mask any additional system memory allocated by the system not explicitly allocated in a call to `malloc, calloc or realloc` by the user.

It properly showed this behavior on Archlinux, openSuSE, Ubuntu, Debian, Slack and any other distribution I ever ran valgrind on up through the 3.12 release. Beginning with the 3.13 release, the system allocated memory (not required by the user) was no longer properly excluded in the memory reported in-use by valgrind. This dramatically decreased the usefulness of valgrind as a tool for new users as the questions always surfaced, "What is this additional memory I didn't allocate", e.g. https://stackoverflow.com/questions/53026992/extra-allocs-valgrind?noredirect=1#comment92967899_53026992

All versions up through 3.12, for example reported the following for a simple example:

#include <stdio.h>
#include <stdlib.h>

int main (void) {

    int *a = malloc (sizeof *a);
    *a = 5;
    printf ("a: %d\n", *a);
    free (a);
}

Compile:

gcc -Wall -Wextra -pedantic -Wshadow -std=gnu11 -Ofast -o /tmp/bin/vgtest2 vgtest2.c

Version Info (openSuSE Leap 42.3)

gcc --version
gcc (SUSE Linux) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Valgrind Output:

valgrind ./bin/vgtest2
==6686== Memcheck, a memory error detector
==6686== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6686== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==6686== Command: ./bin/vgtest2
==6686==
a: 5
==6686==
==6686== HEAP SUMMARY:
==6686==     in use at exit: 0 bytes in 0 blocks
==6686==   total heap usage: 1 allocs, 1 frees, 4 bytes allocated
==6686==
==6686== All heap blocks were freed -- no leaks are possible
==6686==
==6686== For counts of detected and suppressed errors, rerun with: -v
==6686== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Perfect, 4-bytes request, 4-bytes allocated, 4-bytes reported by valgrind.

Now let's look at openSuSE Leap 15 which updated to valgrind 3.13, with

gcc --version
gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

(using the same source file compiled with the same options)

valgrind ./bin/vgtest2
==9642== Memcheck, a memory error detector
==9642== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9642== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==9642== Command: ./bin/vgtest2
==9642==
a: 5
==9642==
==9642== HEAP SUMMARY:
==9642==     in use at exit: 0 bytes in 0 blocks
==9642==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==9642==
==9642== All heap blocks were freed -- no leaks are possible
==9642==
==9642== For counts of detected and suppressed errors, rerun with: -v
==9642== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

That's not right. The user only allocated 4-bytes of memory, not 1024-bytes. Whatever exclusion files used to correctly mask the 1020-bytes not requested by the user are no longer doing their job. And Note: this is a very rudimentary example, when checking allocations on a normal sized example program, it's not uncommon to have 10,000 additional bytes reported as allocated when the user may have only required 300-bytes.

Not let's turn to ArchLinux with valgrind version 3.14, the exact same problem of excess memory reporting by valgrind continues, e.g.

gcc --version
gcc (GCC) 8.2.1 20180831
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

(same code same compiler options)

valgrind ./bin/vgtest2
==23174== Memcheck, a memory error detector
==23174== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==23174== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==23174== Command: ./bin/vgtest2
==23174==
a: 5
==23174==
==23174== HEAP SUMMARY:
==23174==     in use at exit: 0 bytes in 0 blocks
==23174==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==23174==
==23174== All heap blocks were freed -- no leaks are possible
==23174==
==23174== For counts of detected and suppressed errors, rerun with: -v
==23174== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Again the reported memory is not the 4-bytes allocated in the program, but an additional 1020-bytes the user did not allocated.

And now we are seeing this problem manifest itself on programming site with "Where are the extra allocations coming from?" (see prior link above)

It all boils down to either:

==6686==   total heap usage: 1 allocs, 1 frees, 4 bytes allocated

The correct memory reporting for the code shown has always been shown as this up until 3.13. 

Beginning with 3.13, valgrind begin reporting far more memory than the user requested, e.g.

==23174==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated

The prior behavior is the desired and expected behavior for valgrind. That was the whole purpose behind the exclusion files -- show the user what memory he (or she) allocated and whether or not that memory was freed or still reachable.

In its current state 3.13 and 3.14 provide no correlation between the bytes allocated by the user and the bytes reported as in use.

I don't know how much more clear that can be made. Above are 3 examples, where 3.12 provides accurate accounting for the users allocation, and beginning 3.13 valgrind no longer accurately accounts for the memory allocated by the user. That's the bug.

Comment 10 Tom Hughes 2018-10-29 08:45:44 UTC

I don't believe we have ever attempted to suppress memory allocations done by the system libraries - you may have been lucky in not seeing any but I don't believe it has ever been our policy to suppress them.

In fact I don't think we've ever even had a suppression type that would stop memory allocations being counted in those statistics - leaks can be suppressed by they are still counted in the statistics of the amount of memory allocated.

The counts are the counts of allocations done by the whole of the program including all libraries that it uses and I'm not sure why anybody would find that surprising, or find the idea that the C library does some allocations internally hard to understand.

Comment 11 Tom Hughes 2018-10-29 08:52:49 UTC

For the record here is the backtrace for the 1024 byte malloc:

#0  __GI___libc_malloc (bytes=1024) at malloc.c:3012
#1  0x00007ffff7e2e8f0 in __GI__IO_file_doallocate (fp=0x7ffff7f7e780 <_IO_2_1_stdout_>) at filedoalloc.c:101
#2  0x00007ffff7e3ccd0 in __GI__IO_doallocbuf (fp=fp@entry=0x7ffff7f7e780 <_IO_2_1_stdout_>) at libioP.h:839
#3  0x00007ffff7e3bf58 in _IO_new_file_overflow (f=0x7ffff7f7e780 <_IO_2_1_stdout_>, ch=-1) at fileops.c:752
#4  0x00007ffff7e3b05f in _IO_new_file_xsputn (f=0x7ffff7f7e780 <_IO_2_1_stdout_>, data=<optimized out>, n=3) at libioP.h:839
#5  0x00007ffff7e0e345 in _IO_vfprintf_internal (s=0x7ffff7f7e780 <_IO_2_1_stdout_>, format=0x402010 "a: %d\n", ap=ap@entry=0x7fffffffcc80) at ../libio/libioP.h:839
#6  0x00007ffff7e1715a in __printf (format=<optimized out>) 

Which, as expected, is glibc allocating a I/O buffer for the stdout stream. If you unbuffer that stream before the print with:

setvbuf(stdout, NULL, _IONBF, 0);

then it goes away.

Maybe previously the stdout stream had it's buffer allocated during startup and valgrind didn't notice it and now it's done on demand when stdout is first used? You would still have seen the same effect for any other file you opened though.

Comment 12 Philippe Waroquiers 2018-10-30 21:21:33 UTC

(In reply to Tom Hughes from comment #10)
> I don't believe we have ever attempted to suppress memory allocations done
> by the system libraries - you may have been lucky in not seeing any but I
> don't believe it has ever been our policy to suppress them.
> 
> In fact I don't think we've ever even had a suppression type that would stop
> memory allocations being counted in those statistics - leaks can be
> suppressed by they are still counted in the statistics of the amount of
> memory allocated.
> 
> The counts are the counts of allocations done by the whole of the program
> including all libraries that it uses and I'm not sure why anybody would find
> that surprising, or find the idea that the C library does some allocations
> internally hard to understand.

I completely agree with Tom.

In fact, if we would want to fix this case, we would face 2 problems:
1. define for which libraries we need to 'hide' the malloc
  This report speaks about 'printf', but we have plenty of other
  'system libs' we might want to consider. For example, gethostbyname
   will maybe also allocate some memory and we could ask to hide it.
   Similarly, maybe pthread_attr_init could allocate some memory, and
   we could similarly have to hide it.
   And then we could do a similar reasoning for gtk library, Qt, gnome, kde,
   ...
   We will then have zillions of malloc-ed objects that will be 'hidden'
   in big programs.
     
2. what to do when a library does a malloc and one way or another, we need
   to mention this allocated memory e.g. in an error msg.
   Or we just have to report the amount of memory currently allocated
   by the program ?

I do not see how we could properly define a reasonable semantic for the above.

Users must be aware that even in simple programs, a lot of things are
happening inside explicitly or implicitly linked libraries.

I understand the wish to hide this for 'hello world' programs,
but that will give major difficulties for non trivial programs.

So, in summary: this is not a bug, as there is no way to improve this
without introducing significant implementation and conceptual difficulties,
a lot worse than the initial difficulty.

Comment 13 Philippe Waroquiers 2018-11-02 10:25:51 UTC

As discussed, we see no way to define a clean/implementable semantic to improve this, and the current behavior is the most appropriate.

We might add a section in the FAQ explaining something like: "Why is valgrind/memcheck showing allocations not done by my code."