Bug 392373

Summary: Valgrind could have an option to search for pointers at unaligned addresses
Product: [Developer tools] valgrind Reporter: Salvatore Sanfilippo <antirez>
Component: memcheckAssignee: Julian Seward <jseward>
Status: REPORTED ---    
Severity: normal CC: philippe.waroquiers
Priority: NOR    
Version First Reported In: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:

Description Salvatore Sanfilippo 2018-03-26 16:51:22 UTC
Memcheck determines if the block is reachable from pointers within the root-set, checking for pointed-sized memory locations that are stored into an *aligned* address. This may leads to false positives in case some code, for the sake of memory efficiency or representation locality, packs pointers with other data, effectively storing pointers at unaligned addresses. Note that using memcpy() instead of deferencing the pointer, such a memory layout is perfectly safe even in architectures that do not have support for accessing unaligned memory locations: anyway when this is possible, compilers will translate memcpy with a fixed pointer-size "count" to a load/store operation instead of actually calling the function, so there is not lack of performances most of the times.
An example of code using this trick is an hash table where, instead of storing at each dictionary bucket a single value, we store a blob of prefixed-len keys followed by the respective pointer in a single allocated memory blob: "[3]foo<ptr-for-foo-value>[5]apple<ptr-for-apple-value>". This is similar to chaining but without paying the pointers and new bucket memory cost. In this case, Valgrind will not be able to look for the pointers when they are stored at unaligned offsets, reporting the leak. I imagine that scanning for aligned addresses is faster, because the search space is reduced to 1/8 of what would be otherwise. However this feature could just be opt-in, so that normally Valgrind users will not incur in any penalty. I have no expertise on the internals of Valgrind, but I can imagine that this feature could also be quite simple to add, if there is to change just the step of the search. Thanks for checking this issue. Regards, Salvatore.
Comment 1 Philippe Waroquiers 2018-03-26 19:36:15 UTC
I think it would not be very difficult to change this, it is not just
the step to change, but the work should be limited (one day of work maybe?
and then some tests :).

However, there are some consequences: the leak search will be significantly
slower when using the option, and will probably slowdown somewhat
the leak search even when not using the option.
It will also increase the false negative leaks, as there will be more chances
to find a sequence of bytes matching a leaked block (or pointing inside
a leaked blocked, giving a possibly leaked).
Comment 2 Salvatore Sanfilippo 2018-03-26 20:26:11 UTC
Thanks for your comments Philippe! Glad that implementing this feature would not require too much work. About the false negatives, indeed that's possible, but fortunately only projects with this requirements would pay such a price, and with 64 bit pointers the probability of a collision is pretty low (but not as low as finding a random 8 bytes string, which is a negligible probability. Because most pointers in normal systems will use common byte values for the most significant part, like 0xff). I'm more worried about the overhead when the feature is not used, I hope it is kinda non measurable or extremely low so that the option can be considered for inclusion. Cheers.