Bug 280083 - mempolicy syscall check errors
Summary: mempolicy syscall check errors
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.7 SVN
Platform: Debian unstable Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-14 17:43 UTC by Brice Goglin
Modified: 2011-08-15 07:57 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
patch fixing mempolicy syscall checks (1.47 KB, patch)
2011-08-14 17:43 UTC, Brice Goglin
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Brice Goglin 2011-08-14 17:43:41 UTC
Created attachment 62827 [details]
patch fixing mempolicy syscall checks

Version:           3.7 SVN (using Devel) 
OS:                Linux

Hello,

While debugging hwloc, I got the following valgrind report:
==7528== Syscall param mbind(nodemask) points to unaddressable byte(s)
==7528==    at 0x6623E3A: ??? (in /usr/lib/libnuma.so.1)
==7528==    by 0x6623EA6: mbind (in /usr/lib/libnuma.so.1)
==7528==    by 0x5048A77: hwloc_linux_set_area_membind (in /home/bgoglin/SOFT/hwloc/trunk/build/src/.libs/libhwloc.so.0.0.0)
==7528==    by 0x50402C5: hwloc_set_area_membind_nodeset (in /home/bgoglin/SOFT/hwloc/trunk/build/src/.libs/libhwloc.so.0.0.0)
==7528==    by 0x4059ED: main (in /home/bgoglin/SOFT/hwloc/trunk/build/utils/.libs/lt-lstopo)

A careful review of the mbind manpage and hwloc code confirms that hwloc is right. Valgrind is however doing an invalid check on nodemask. And there are actually 3 problems in the mempolicy section of coregrind/m_syswrap/syswrap-linux.c:

1) Compare the checks in
PRE(sys_mbind):
      PRE_MEM_READ( "mbind(nodemask)", ARG4,
                    VG_ROUNDUP( ARG5, sizeof(UWord) ) / sizeof(UWord) );
PRE(sys_set_mempolicy):
   PRE_MEM_READ( "set_mempolicy(nodemask)", ARG2,
                 VG_ROUNDUP( ARG3, sizeof(UWord) ) / sizeof(UWord) );
with PRE(sys_get_mempolicy):
      PRE_MEM_WRITE( "get_mempolicy(nodemask)", ARG2,
                     VG_ROUNDUP( ARG3, sizeof(UWord) * 8 ) / sizeof(UWord) );

The VG_ROUNDUP is wrong in mbind/set_mempolicy while it's OK in get_mempolicy. maxnode is the max index of a bit in nodemask, it's not the allocated nodemask size. So if you want to round up to the nearest multiple of bits per long, you want VG_ROUNDUP(..., sizeof(UWord)*8) instead of VG_ROUNDUP(..., sizeof(UWord)).
This affects PRE(sys_bind) and PRE(sys_set_mempolicy)

2) Dividing the result of VG_ROUNDUP by sizeof(UWord) is wrong too. You want to convert bits into bytes there. So you want to divide by 8 (or BITS_PER_BYTE if it exists). This works on 64bits by chance but it should break on 32bits (I don't have any machine to test this). This affects all PRE/POST in the mempolicy section.

3) The tricky one. maxnode should be X when nodemask may contain up to X bits (from 0 to X-1). That's what the manpages say (even if it's not very clear). However, the actual implementation in the kernel is different. It wants maxnode to be increased by 1 again. If you look at mm/mempolicy.c in the kernel (from 2.6.12 to 3.1rc1), you'll see that the maxnode/nodemask given to mbind and set_mempolicy are checked in get_nodes(). This function starts by doing maxnode-- and then does nothing if maxnode is 0. This means that if you want to set the first bit in nodemask, you need maxnode to be 2 or more. So the aforementioned checks should be updated again by decreasing maxnode by 1 in VG_ROUNDUP.

I am attaching a patch including all these fixes. Should apply to 3.6.1 (Debian testing) and 3.6.99svn11761 (Debian experimental).


Reproducible: Didn't try

Steps to Reproduce:
see above


Expected Results:  
see above
Comment 1 Tom Hughes 2011-08-15 07:57:38 UTC
Committed (with the extra -1 added to get_mempolicy) as r11977.