Bug 303963 - strstr() function produces wrong results under valgrind callgrind
Summary: strstr() function produces wrong results under valgrind callgrind
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.7.0
Platform: Ubuntu Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-23 14:56 UTC by Moritz Hassert
Modified: 2012-07-27 16:49 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
minimal test case (1.14 KB, text/x-csrc)
2012-07-23 14:56 UTC, Moritz Hassert
Details
Patch for VEX (1.96 KB, patch)
2012-07-24 16:23 UTC, Josef Weidendorfer
Details
Patch for VG (test case) (4.12 KB, patch)
2012-07-24 16:24 UTC, Josef Weidendorfer
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Moritz Hassert 2012-07-23 14:56:22 UTC
Created attachment 72705 [details]
minimal test case

$valgrind --version
valgrind-3.7.0

When an application that uses the strstr() function from the C standard library is profiled with valgrind --tool=callgrind, the strstr() function produces false results (at least) under the following conditions:
* the string s1 to search in and the string s2 to search for are exact duplicates, that is strcmp(s1,s2)==0. s1 and s2 don't need to be pointing to the same memory object.
* the string length (excluding terminating zero) is a multiple of 16

Expected result: strstr(s1,s2) returns s1, indicating a match at the first charactor of s1
What happens: strstr(s1,s2) returns NULL, indicating no matching substring was found.

See attached minimal testcase for an example. Reproduce under Ubuntu 12.04 with the following steps:
$gcc strstrtest.c -o strstrtest
$./ strstrtest # <-- should report no errors
$valgrind --tool=callgrind ./ strstrtest # <-- should report errors for lengths multiple of 16

- The Problem does not show up under valgrind-3.6.0.SVN-Debian from Ubuntu 10.04 Lucid Lynx
- The Problem does not show up under tool=memcheck.

Some more info:
OS: Ubuntu 12.04 Precise Pangolin
$uname -a
Linux mhassert 3.2.0-26-generic #41-Ubuntu SMP Thu Jun 14 17:49:24 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Comment 1 Josef Weidendorfer 2012-07-23 15:47:13 UTC
I actually can reproduce this, both with VG 3.7.0 and VG 3.8.0.SVN.
This looks scary.

The strange thing is that callgrind just runs the original code, and does not
do any tricky redirections similar to memcheck. Therefore this bug also
happens with the tools none and cachegrind (just checked). It at least
looks like something bad is going on translating strstr machine code
in VEX.

/usr/bin/valgrind --tool=none ./strstrtest
==21970== Nulgrind, the minimal Valgrind tool
==21970== Copyright (C) 2002-2011, and GNU GPL'd, by Nicholas Nethercote.
==21970== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==21970== Command: ./strstrtest
==21970== 
TEST1: strstr failed
TEST2: length 16: failed
...

Changing to component "general".
Comment 2 Josef Weidendorfer 2012-07-23 16:03:27 UTC
The bug obviously happens in __strstr_sse42 (on 64bit).
Comment 3 Josef Weidendorfer 2012-07-23 17:25:12 UTC
I just compared the control flow inside __strstr_sse42 on a real machine
(using gdb) and within valgrind (just running callgrind with "--collect-jumps=yes").

It looks like the VEX emulation of "pcmpistri" (used with mode "equal ordered")
is buggy.
Comment 4 Julian Seward 2012-07-23 20:30:40 UTC
(In reply to comment #3)
 > It looks like the VEX emulation of "pcmpistri" (used with mode "equal
> ordered") is buggy.

Josef, I am not surprised to hear that (+ thanks for chasing it).  Which pcmpistri case
is it (iow, which immediate byte) ?
Comment 5 Josef Weidendorfer 2012-07-23 20:47:43 UTC
(In reply to comment #4)
> Josef, I am not surprised to hear that (+ thanks for chasing it).  Which
> pcmpistri case
> is it (iow, which immediate byte) ?

It's imm8 = 0xc. I think I found the problem:

There is a discrepancy between real hardware and emulation if the needle is an empty string, ie. starts with "\0". This happens on the last call to pcmpistri in __strstr_sse42 when the needle length is a multiple of 16.

For all positions in the haystack, VEX stops search whenever the end of the needle is found. As it starts with assuming "no hit found", all search results will be false.
In contrast to that, according to table 4-3 in the SDM, the real pcmpistri
starts with the assumption "hit found".

Patch, which makes the test case of this bug report work
(needs also be changed for the _wide variant):


--- a/VEX/priv/guest_generic_x87.c
+++ b/VEX/priv/guest_generic_x87.c
@@ -891,7 +891,7 @@ Bool compute_PCMPxSTRx ( /*OUT*/V128* resV,
       UInt   ni, hi;
       UChar* argL    = (UChar*)argLV;
       UChar* argR    = (UChar*)argRV;
-      UInt   boolRes = 0;
+      UInt   boolRes = 0xFFFF;
       UInt   validL  = ~(zmaskL | -zmaskL);  // not(left(zmaskL))
       UInt   validR  = ~(zmaskR | -zmaskR);  // not(left(zmaskR))
       for (hi = 0; hi < 16; hi++) {
@@ -905,7 +905,7 @@ Bool compute_PCMPxSTRx ( /*OUT*/V128* resV,
             if (i >= 16) break;
             if (argL[i] != argR[ni]) { m = 0; break; }
          }
-         boolRes |= (m << hi);
+         if (m==0) boolRes &= ~(1 << hi);
       }

Of course it would be nice to add some test cases.
Comment 6 Julian Seward 2012-07-24 09:18:43 UTC
The diff below adds test cases. 


Index: none/tests/amd64/pcmpstr64.c
===================================================================
--- none/tests/amd64/pcmpstr64.c	(revision 12776)
+++ none/tests/amd64/pcmpstr64.c	(working copy)
@@ -639,6 +639,11 @@
    try_istri(wot,h,s, "1111111111111234", "1111111111111234"); 
    try_istri(wot,h,s, "a111111111111111", "000000000000000a"); 
    try_istri(wot,h,s, "b111111111111111", "000000000000000a"); 
+
+   try_istri(wot,h,s, "b111111111111111", "0000000000000000"); 
+   try_istri(wot,h,s, "0000000000000000", "0000000000000000"); 
+   try_istri(wot,h,s, "123456789abcdef1", "0000000000000000"); 
+   try_istri(wot,h,s, "0000000000000000", "123456789abcdef1"); 
 }
 
 
Index: none/tests/amd64/pcmpstr64w.c
===================================================================
--- none/tests/amd64/pcmpstr64w.c	(revision 12776)
+++ none/tests/amd64/pcmpstr64w.c	(working copy)
@@ -638,6 +638,11 @@
    try_istri(wot,h,s, "1111111111111234", "1111111111111234");
    try_istri(wot,h,s, "0a11111111111111", "000000000000000a");
    try_istri(wot,h,s, "0b11111111111111", "000000000000000a");
+
+   try_istri(wot,h,s, "b111111111111111", "0000000000000000"); 
+   try_istri(wot,h,s, "0000000000000000", "0000000000000000"); 
+   try_istri(wot,h,s, "123456789abcdef1", "0000000000000000"); 
+   try_istri(wot,h,s, "0000000000000000", "123456789abcdef1"); 
 }
Comment 7 Julian Seward 2012-07-24 09:23:01 UTC
(In reply to comment #5)

Josef, thanks for chasing this:

> Patch, which makes the test case of this bug report work
> (needs also be changed for the _wide variant):

That unfortunately causes other test cases to fail.

The test programs (none/tests/amd64/pcmpstr64{,w}.c) can be used to
test the C logic without using Valgrind, because they contain a copy
of same code that is in guest_generic_x87.c and they compare the 
output against the real instruction.  Hence you can do this:

gcc -Wall -g -O -o foo none/tests/amd64/pcmpstr64.c && ./foo | grep "\!\!\!"

and it will show any lines where the C simulation is wrong.  With
the test case additions in the previous comment, I get:

sewardj@phoenix:~/VgTRUNK/trunk$ gcc -Wall -g -O -o foo none/tests/amd64/pcmpstr64.c && ./foo | grep "\!\!\!"
istri 0C  00abcde100bcde11 00000000000abcde -> 00c00010 00c10006 !!!!
istri 0C  0000000000000000 123456789abcdef1 -> 00400010 08410000 !!!!
Comment 8 Josef Weidendorfer 2012-07-24 15:39:20 UTC
Forget the previous patch.
Actually, it is enough to move the break condition checking the end of
the haystack to the end of the loop. This allows an empty needle to match
an empty haystack. Passes all tests, including the new ones from
comment 6.

--- a/priv/guest_generic_x87.c
+++ b/priv/guest_generic_x87.c
@@ -895,9 +895,6 @@ Bool compute_PCMPxSTRx ( /*OUT*/V128* resV,
       UInt   validL  = ~(zmaskL | -zmaskL);  // not(left(zmaskL))
       UInt   validR  = ~(zmaskR | -zmaskR);  // not(left(zmaskR))
       for (hi = 0; hi < 16; hi++) {
-         if ((validL & (1 << hi)) == 0)
-            // run off the end of the haystack
-            break;
          UInt m = 1;
          for (ni = 0; ni < 16; ni++) {
             if ((validR & (1 << ni)) == 0) break;
@@ -906,6 +903,9 @@ Bool compute_PCMPxSTRx ( /*OUT*/V128* resV,
             if (argL[i] != argR[ni]) { m = 0; break; }
          }
          boolRes |= (m << hi);
+         if ((validL & (1 << hi)) == 0)
+            // run off the end of the haystack
+            break;
       }
Comment 9 Josef Weidendorfer 2012-07-24 16:23:53 UTC
Created attachment 72727 [details]
Patch for VEX
Comment 10 Josef Weidendorfer 2012-07-24 16:24:39 UTC
Created attachment 72728 [details]
Patch for VG (test case)
Comment 11 Julian Seward 2012-07-25 09:23:17 UTC
> Created attachment 72727 [details]
> Patch for VEX
> Created attachment 72728 [details]
> Patch for VG (test case)

Looks fine, please commit!
Comment 12 Josef Weidendorfer 2012-07-25 09:51:20 UTC
Fixed in VEX r2447
Comment 13 Moritz Hassert 2012-07-27 10:03:58 UTC
Hi Josef and Julian. Thanks for fixing the bug so fast!
As far as I understand the patches are for now only in cvs head? I'd really love to get this fixed in the debian/ubuntu package as fast as possible. Do you plan on releasing a patched version of valgrind-3.7.0 in the near future? Or should I contact the debian/ubuntu package maintainers to include your attached patches? In the latter case, do you see any problems applying them to the 3.7.0 release?
Comment 14 Josef Weidendorfer 2012-07-27 10:14:32 UTC
I do not know about plans for 3.7.1.
I would simply take the sources, patch them with the fix attached to this bug report, and compile it yourself. Or is there a reason to care about other Ubuntu users?
Comment 15 Moritz Hassert 2012-07-27 16:49:17 UTC
Yes, compiling myself was of course the quick solution. I patched and built the valgrind package from Ubuntu locally and it works like a charm. Thanks again! Saves me lots of trouble.

There's no particular reason to care about other Ubuntu users. It's not really an security issue or anything like that. But I do care about myself and all the machines I would have to install my custom deb package.

I adapted your "Patch for VEX" patch for version 3.7.0. It's attached to my corresponding Ubuntu bug report:
https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1027977