Version: 3.7.0 (using KDE 1.2)
OS: Linux

The MPI_IN_PLACE symbol is not allowed in place of a send buffer for the MPI collectives in libmpiwrap.so. For example, the following call:

   int x = 1;
   MPI_Allreduce(MPI_IN_PLACE, &x, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

produces the error:

==672== Unaddressable byte(s) found during client check request
==672==    at 0x4C153FD: check_mem_is_defined_untyped (libmpiwrap.c:950)
==672==    by 0x4C15E98: walk_type (libmpiwrap.c:688)
==672==    by 0x4C1CD8C: PMPI_Allreduce (libmpiwrap.c:921)
==672==    by 0x468998: main (bfs.cpp:346)
==672==  Address 0x1 is not stack'd, malloc'd or (recently) free'd

MPI treats the pointer value 1 as MPI_IN_PLACE and does not try to dereference it.

Reproducible: Always

Steps to Reproduce:
See details.

Actual Results:
See details.

Expected Results:
Treating the receive buffer in the call as both the send and receive buffers for the collective operation.

I am using gcc 4.6.0 and Open MPI 1.5 on x64.
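For convenience, a minimal self-contained reproducer along these lines (file name and build/run commands are mine; preload libmpiwrap.so as described in the Valgrind manual):

   /* in_place_repro.c -- reproduces the false positive with MPI_IN_PLACE.
      Build: mpicc -g in_place_repro.c -o in_place_repro
      Run:   mpirun -np 2 valgrind ./in_place_repro
             (with libmpiwrap.so preloaded, per the Valgrind MPI docs)     */
   #include <mpi.h>
   #include <stdio.h>

   int main(int argc, char **argv)
   {
      MPI_Init(&argc, &argv);

      int x = 1;
      /* MPI_IN_PLACE is a sentinel (pointer value 1 in Open MPI), not a
         real buffer; the wrapper must not validate it as a send buffer. */
      MPI_Allreduce(MPI_IN_PLACE, &x, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

      printf("result: %d\n", x);
      MPI_Finalize();
      return 0;
   }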
Hmm, it might be fastest if you fixed this, since it's not hard to do and you obviously have a test rig close to hand. In function WRAPPER_FOR(PMPI_Allreduce) I would *guess* you need to change

   check_mem_is_defined(sendbuf, count, datatype);

to

   if (sendbuf != MPI_IN_PLACE)
      check_mem_is_defined(sendbuf, count, datatype);

(I am saying this on the basis of zero knowledge of the meaning of MPI_IN_PLACE, so ymmv.) One concern is: does MPI_IN_PLACE exist for older MPIs (v1.1)? If not it will be necessary to conditionalise it somehow.
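Something like the following is what I mean by conditionalising it; treat this as a sketch, not the verbatim libmpiwrap.c code, and HAVE_MPI_IN_PLACE as a placeholder name for whatever configure-time or version check we would actually use:

   /* Sketch only: skip the send-buffer definedness check when the caller
      passed MPI_IN_PLACE; fall back to the old behaviour when the
      installed MPI does not provide the symbol at all.                   */
   #if defined(HAVE_MPI_IN_PLACE)
      if (sendbuf != MPI_IN_PLACE)
         check_mem_is_defined(sendbuf, count, datatype);
   #else
      check_mem_is_defined(sendbuf, count, datatype);
   #endif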
Hi, Digging up an old bug (which is still present and just hit me): I can fix this and provide test cases, hopefully for all collectives in chapter 5 of the mpi-3.[01] document. One question I have, though, is how to proceed with the non-blocking collectives introduced in MPI-3.0. The validity checks for the input can of course be made before posting the non-blocking collective call (e.g. MPI_Ibcast). However, the qualification of the output values would ideally have to wait for the MPI_Wait(|all|any) call to complete, which then opens a can of worms. I don't see an easy way to do that in a binary compatible way. Ideas? E.
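To make the timing issue concrete, here is the kind of user code I have in mind (just an illustration of where the checks would have to happen, not wrapper code):

   #include <mpi.h>

   int main(int argc, char **argv)
   {
      MPI_Init(&argc, &argv);

      int rank, buf[4];
      MPI_Request req;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      if (rank == 0)
         buf[0] = buf[1] = buf[2] = buf[3] = 42;   /* defined on the root only */

      /* At post time the wrapper can check that buf is addressable, and on
         the root that it is defined ...                                    */
      MPI_Ibcast(buf, 4, MPI_INT, 0, MPI_COMM_WORLD, &req);

      /* ... but on the other ranks buf only becomes defined once the
         operation completes here, so marking it defined would have to be
         deferred until this call (or MPI_Waitall/Waitany/Test).            */
      MPI_Wait(&req, MPI_STATUS_IGNORE);

      MPI_Finalize();
      return 0;
   }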
I'm on it right now. I've got all collectives with and without MPI_IN_PLACE covered. I see the shadow_request mechanism, which is about all I need to address the issue I raised about the non-blocking collectives. The only catch is with MPI_Igatherv, for instance, where the completion of a single request is going to paint several memory areas. I'm considering changing a bit how the shadow_request table works, as follows (see the sketch below):

- forbid replacing an entry which is still marked in use. Such an event looks to me more like a bug than a desirable thing.
- allow a 1-to-n mapping in there.

OK? E.
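Roughly what I have in mind for the 1-to-n part, with made-up type, field and constant names (the actual shadow_request code is organised differently, so this is purely a sketch of the idea):

   /* Sketch with invented names: each outstanding request maps to the set
      of memory areas to be marked defined when the request completes.    */
   typedef struct {
      void*   base;   /* start of the area to paint on completion */
      size_t  len;    /* length of that area in bytes              */
   } PaintArea;

   typedef struct {
      MPI_Request orig;      /* the request handed back to the user       */
      int         in_use;    /* never silently recycled while set         */
      int         n_areas;   /* 1-to-n: e.g. MPI_Igatherv on the root     */
      PaintArea   areas[MAX_AREAS_PER_REQ];   /* MAX_AREAS_PER_REQ: TBD   */
   } ShadowRequest;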
Created attachment 95071 [details] patch against valgrind-3.11.0/mpi/libmpiwrap.c
Created attachment 95072 [details] test case for MPI wrappers
Ok, here's a proposed patch against valgrind-3.11.0/mpi/libmpiwrap.c, as well as a test case. All collectives from chapter 5 are covered, but I haven't covered the whole MPI-3 document. There's a slight catch which may be either a gap in the MPI-3 document or a misunderstanding on my part (completion of some non-blocking collectives and the use of MPI_Get_count).
Is there a reason nothing was ever done about Emmanuel's patch? We observed these errors in our LibMesh project where we make use of MPI_IN_PLACE.
(In reply to Alex from comment #7)
> Is there a reason nothing was ever done about Emmanuel's patch? We observed
> these errors in our LibMesh project where we make use of MPI_IN_PLACE.

I guess that this is due to a combination of:
* there are not a lot of valgrind developers, so not much free time
* not much (or zero?) knowledge of mpi in the valgrind developers (personally, my knowledge is zero).

This is not a minor patch, so it would be better if someone with mpi knowledge reviewed the code and the test case, and gave feedback on a real application. That might make it ok to push in 3.15 ...
I got hit by this bug as well. Emmanuel seems to have done most of the heavy lifting, but I understand it takes time to review. That being said, it could at least be mentioned in the documentation, especially since there is already a paragraph about false positives with reduce operations.
I don't have enough experience with MPI to do a proper review, but I'm commenting to make sure this isn't forgotten. The patch still applies almost as-is, with just one small conflict against commit 7b1a2b1edd99f15e23be0d259498247367f1e457 ("Fix printf warning in libmpiwrap.c").
Hi. It's slightly funny to notice that this patch is now ten years old. If it still applies (I take your word for it), then it probably still has some value. What can be envisioned as a path forward? E.