Created attachment 52581 [details] patch to add mempool annotation to glib/gslice See the first and last lines of this snippet of valgrind output: ==21187== mempool_free(0x4f10900, 0x4f11000) ==21187== at 0x4C6A04D: g_slice_free1 (gslice.c:903) ==21187== by 0x4005E0: main (slice.c:10) ==21187== mempool_free(0x4f10900, 0x4f11000) freed chunk of 20 bytes client request: code 4d430001, addr 0x4F11000, len 16 client request: code 1306, addr 0x4F10900, len 82907136 ==21187== mempool_free(0x4f10900, 0x4f11000) ==21187== at 0x4C6A04D: g_slice_free1 (gslice.c:903) ==21187== by 0x4005F1: main (slice.c:11) ==21187== Invalid free() / delete / delete[] ==21187== at 0x4C6A04D: g_slice_free1 (gslice.c:903) ==21187== by 0x4005F1: main (slice.c:11) ==21187== Address 0x4f11000 is not stack'd, malloc'd or (recently) free'd First it reports that a chunk of 20 bytes at 0x4f11000 is freed, and then *immediately* afterwards it tells us that the address 0x4f11000 is not recently freed. It lies. (It's a valid complaint about a double-free, but the block tracking is wrong). I have similar issues when I deliberately write off the end of an allocated block: ==21187== mempool_alloc(0x4f10900, 0x4f11000, 20) ==21187== at 0x4C69652: g_slice_alloc (gslice.c:841) ==21187== by 0x4005B5: main (slice.c:7) ==21187== Invalid write of size 1 ==21187== at 0x4005CD: main (slice.c:9) ==21187== Address 0x4f11014 is not stack'd, malloc'd or (recently) free'd Surely if I allocate a 20-byte block at 0x4f11000 and then write to the immediately subsequent byte, I should be told that the invalid write was one byte after the allocation I just made -- I shouldn't be told that it is "not stack'd, malloc'd or (recently) free'd". I have patched glib to add Valgrind annotations for its 'gslice' slab allocator. Perhaps I've done something wrong, but I'm not sure what would cause this kind of misbehaviour from Valgrind. http://www.mail-archive.com/valgrind-users@lists.sourceforge.net/msg02045.html http://www.mail-archive.com/valgrind-users@lists.sourceforge.net/msg02051.html
IIUC, the problem is that, in both cases, the description of the invalid address .. Address 0x4f11000 is not stack'd, malloc'd or (recently) free'd which is a polite way to say "I haven't got a clue what this is", is, well, at best misleading, and in short completely wrong. Similarly in the mail-archive.com links, the messages you're getting identify the failing address as part of the superblock, which is unhelpful: you want to relate it to the individual blocks. I think these are both easily fixed. Basically the error-message-maker-upper looks through the allocated and recently freed blocks it knows about, to see if the failing address is in or anywhere near them. Unfortunately it doesn't bother to look through set of mempool blocks when doing this. That explains why 1. if the superblocks are allocated by malloc, then it incorrectly identifies a superblock as containing the address, and 2. if the superblocks are allocated by mmap, it fails to identify anything as containing the address I'll try and hack up a fix. First thing to do is concoct a small self contained test case, so it can be regtested without having to drag around glib goop. You don't happen to have one lying around, do you?
Created attachment 56319 [details] proposed patch
Dave, can you try the attached patch? It's not perfect but it is better than it was. I think it's as good as I can get it without introducing extra bookkeeping data structures. What it can do is * describe block overruns in terms of the blocks themselves both for malloc and mmap-backed pools * describe accesses after free and double frees in terms of the backing block for malloc-backed pools Unfortunately for mmap backed pools, accesses after free and double frees are not related to anything, because Memcheck doesn't keep track of who (which stack trace) mmap'd which pages. Some example error messages, from an as-yet uncommitted test program, follow. ------ out of range reads in malloc-backed pool ------ ==4698== Invalid read of size 1 ==4698== at 0x400FAE: test (mempool2.c:135) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== Address 0x51b10a7 is 1 bytes before a block of size 10 client-defined ==4698== at 0x400EF7: allocate (mempool2.c:108) ==4698== by 0x400F68: test (mempool2.c:130) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== ==4698== Invalid read of size 1 ==4698== at 0x400FC3: test (mempool2.c:136) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== Address 0x51b10b2 is 0 bytes after a block of size 10 client-defined ==4698== at 0x400EF7: allocate (mempool2.c:108) ==4698== by 0x400F68: test (mempool2.c:130) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== ------ out of range reads in mmap-backed pool ------ ==4698== Invalid read of size 1 ==4698== at 0x400FFC: test (mempool2.c:140) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== Address 0x4023007 is 1 bytes before a block of size 20 client-defined ==4698== at 0x400EF7: allocate (mempool2.c:108) ==4698== by 0x400F7D: test (mempool2.c:131) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== ==4698== Invalid read of size 1 ==4698== at 0x401013: test (mempool2.c:141) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== Address 0x402301c is 0 bytes after a block of size 20 client-defined ==4698== at 0x400EF7: allocate (mempool2.c:108) ==4698== by 0x400F7D: test (mempool2.c:131) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== ------ read free in malloc-backed pool ------ ==4698== Illegal memory pool address ==4698== at 0x4010A6: test (mempool2.c:145) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== Address 0x51b1040 is 0 bytes inside a block of size 32 alloc'd ==4698== at 0x4C27878: malloc (vg_replace_malloc.c:236) ==4698== by 0x400AFA: make_pool (mempool2.c:46) ==4698== by 0x400F23: test (mempool2.c:122) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== ------ read free in mmap-backed pool ------ ==4698== Illegal memory pool address ==4698== at 0x40114F: test (mempool2.c:150) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== Address 0x4022000 is not stack'd, malloc'd or (recently) free'd ==4698== ------ double free in malloc-backed pool ------ ==4698== Illegal memory pool address ==4698== at 0x4011F8: test (mempool2.c:155) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== Address 0x51b1040 is 0 bytes inside a block of size 32 alloc'd ==4698== at 0x4C27878: malloc (vg_replace_malloc.c:236) ==4698== by 0x400AFA: make_pool (mempool2.c:46) ==4698== by 0x400F23: test (mempool2.c:122) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== ------ double free in mmap-backed pool ------ ==4698== Illegal memory pool address ==4698== at 0x40128A: test (mempool2.c:159) ==4698== by 0x40130E: main (mempool2.c:174) ==4698== Address 0x4022000 is not stack'd, malloc'd or (recently) free'd ==4698== ------ done ------
Committed: fix r11509, regtest r11510.