| Summary: | valgrind throws std::bad_alloc on memory allocations larger than 34255421416 bytes | | |
|---|---|---|---|
| Product: | [Developer tools] valgrind | Reporter: | tgray26 |
| Component: | general | Assignee: | Julian Seward <jseward> |
| Status: | RESOLVED FIXED | | |
| Severity: | crash | CC: | ivosh, rhyskidd, tom |
| Priority: | NOR | | |
| Version: | 3.10.0 | | |
| Target Milestone: | --- | | |
| Platform: | RedHat Enterprise Linux | | |
| OS: | Linux | | |
| Latest Commit: | | Version Fixed In: | |
| Attachments: | test case (allocgig.c); Proposed fix (so far, Linux only); Proposed fix (so far, Linux and Solaris only) | | |

Description
tgray26
2016-09-09 15:44:09 UTC
Yes, there is a compiled-in memory limit imposed by the addressing scheme valgrind uses for its shadow memory. See https://stackoverflow.com/questions/8644234/why-is-valgrind-limited-to-32-gb-on-64-bit-architectures for more information and how to patch valgrind to allow larger address spaces.

In the trunk right now we have N_PRIMARY_BITS = 20, which according to the svn log makes the maximum usable memory amount be 64G. That was done at end-Jan 2013 and should surely be in 3.10 and later.

Maybe we should bump this up to 21 bits, hence giving 128G usable memory on 64 bit targets? It would slow down startup a bit because that array needs to be zeroed out, and would soak up a bit more memory, but otherwise seems harmless. Presumably at some point we can outrun (the ever decelerating) Moore's law with this game ;-)

The primary_map array, I mean. I didn't mean the whole 128GB needs to be zeroed out at startup.

I fully agree. Server systems these days have even TBs of memory to play with. In addition to initializing the primary map, N_PRIMARY_BITS also comes into play in mc_expensive_sanity_check(). Hopefully it won't be a big deal.

I had hoped to do this for 3.12.0, but after looking at the #ifdef swamp in VG_(am_startup) that sets aspacem_maxAddr, I think it is too risky, because of the number of different cases that need to be verified. So I'd propose to leave it till after the release. The number of users that this will affect is tiny and those that really need it in 3.12.x can cherry pick the trunk commit into their own custom 3.12.x build, once we fix it on the trunk.

Created attachment 105548
test case (allocgig.c)
Created attachment 105549
Proposed fix (so far, Linux only)
Fails at 32Gb without patch:

    trying for 31 GB ..
    ==12078== Warning: set address range perms: large range [0x3960c040, 0x7f960c040) (defined)
    .. OK
    ==12078== Warning: set address range perms: large range [0x3960c028, 0x7f960c058) (noaccess)
    trying for 32 GB ..
    allocgig: allocgig.c:15: main: Assertion `p' failed.
    ==12078==
    ==12078== Process terminating with default action of signal 6 (SIGABRT)
    ==12078==    at 0x4E6E428: raise (raise.c:54)
    ==12078==    by 0x4E70029: abort (abort.c:89)
    ==12078==    by 0x4E66BD6: __assert_fail_base (assert.c:92)
    ==12078==    by 0x4E66C81: __assert_fail (assert.c:101)
    ==12078==    by 0x400700: main (in /home/tomh/allocgig)

and at 64Gb with the patch:

    trying for 63 GB ..
    ==14020== Warning: set address range perms: large range [0x10060c4040, 0x1fc60c4040) (defined)
    .. OK
    ==14020== Warning: set address range perms: large range [0x10060c4028, 0x1fc60c4058) (noaccess)
    trying for 64 GB ..
    allocgig: allocgig.c:15: main: Assertion `p' failed.
    ==14020==
    ==14020== Process terminating with default action of signal 6 (SIGABRT)
    ==14020==    at 0x4E6E428: raise (raise.c:54)
    ==14020==    by 0x4E70029: abort (abort.c:89)
    ==14020==    by 0x4E66BD6: __assert_fail_base (assert.c:92)
    ==14020==    by 0x4E66C81: __assert_fail (assert.c:101)
    ==14020==    by 0x400700: main (in /home/tomh/allocgig)

Created attachment 105561
Proposed fix (so far, Linux and Solaris only)
Solaris changes.
Unfortunately I do not have a machine with >32 GB of physical memory where I can install Solaris and try it out. Solaris does not overcommit when allocating memory. Regression tests passed ok.

Solaris and Linux limit increased to 128GB in r16381. OSX is so far unchanged. Rhys, do you want to change OSX too? I think nothing will break if OSX isn't changed. So, as you like.

Closing, for now. If we need to change the OSX limits later then, well, fine.

Any changes for macOS can come later.