Bug 76845 - vg_scheduler.c:786 (idle): Assertion `delta >= 0' failed.
Summary: vg_scheduler.c:786 (idle): Assertion `delta >= 0' failed.
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: unspecified
Platform: Compiled Sources Linux
: NOR crash
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-03-05 20:20 UTC by Peter Seiderer
Modified: 2004-11-25 09:32 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
Test for timeout wraparound (533 bytes, text/plain)
2004-03-07 22:29 UTC, Peter Seiderer
Details
Suggested patch against timeout wraparound (1.92 KB, patch)
2004-03-07 23:13 UTC, Peter Seiderer
Details
New suggested patch for coregrind/vg_scheduler.c (1.29 KB, patch)
2004-03-08 18:06 UTC, Peter Seiderer
Details
Update: Suggested patch against timeout wraparound (1.91 KB, patch)
2004-04-13 17:33 UTC, Peter Seiderer
Details
Updated patch against valgrind-2.2.0 (3.21 KB, patch)
2004-09-27 21:08 UTC, Peter Seiderer
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Peter Seiderer 2004-03-05 20:20:43 UTC
Version:           valgrind-2.1.0 CVS (using KDE Devel)
Installed from:    Compiled sources
Compiler:          gcc (GCC) 3.2 
OS:          Linux

The following short test programm failed with
valgrind-2.1.0 and CVS HEAD checkout:

#include <pthread.h>
#include <limits.h>

int main(int argc, char *argv[]) {
  struct timespec timeout;
  pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
  pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
  pthread_mutex_lock(&mut);
  /* wait forever */
  timeout.tv_sec = INT_MAX;
  timeout.tv_nsec = 0;
  pthread_cond_timedwait(&cond, &mut, &timeout);
  pthread_mutex_unlock(&mut);
  return 0;
}

gcc -Wall test1.c -pthread
valgrind --tool=corecheck ./a.out
==7390== Coregrind, a rudimentary error detector for x86-linux.
==7390== Copyright (C) 2002-2004, and GNU GPL'd, by Nicholas Nethercote.
==7390== Using valgrind-2.1.0, a program supervision framework for x86-linux.
==7390== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward.
==7390== For more details, rerun with: -v
==7390== 

valgrind: vg_scheduler.c:786 (idle): Assertion `delta >= 0' failed.
==7390==    at 0xB802F21C: vgPlain_skin_assert_fail (vg_mylibc.c:1211)
==7390==    by 0xB802F21B: assert_fail (vg_mylibc.c:1207)
==7390==    by 0xB802F289: vgPlain_core_assert_fail (vg_mylibc.c:1218)
==7390==    by 0xB800E0C6: idle (vg_scheduler.c:793)

sched status:

Thread 1: status = WaitCV, associated_mx = 0xABFFE324, associated_cv = 0xABFFE31
4
==7390==    at 0x810227FA: pthread_cond_timedwait (vg_libpthread.c:1361)
==7390==    by 0x80485B0: main (in /home/seiderer/valgrind/test/a.out)


The following patch removes the signed/unsigned
clash in case of huge pthread_cond_timedwait
timeout values:

Index: coregrind/vg_scheduler.c
===================================================================
RCS file: /home/kde/valgrind/coregrind/vg_scheduler.c,v
retrieving revision 1.142
diff -u -3 -p -r1.142 vg_scheduler.c
--- coregrind/vg_scheduler.c    28 Feb 2004 15:40:36 -0000      1.142
+++ coregrind/vg_scheduler.c    5 Mar 2004 19:17:33 -0000
@@ -759,7 +759,7 @@ static
 void idle ( void )
 {
    struct vki_pollfd pollfd[1];
-   Int delta = -1;
+   UInt delta = (UInt)-1;
    Int fd = VG_(proxy_resfd)();
 
    pollfd[0].fd = fd;
@@ -782,19 +782,19 @@ void idle ( void )
       }
 
       if (tp != NULL) {
+         vg_assert(tp->time >= now);
         delta = tp->time - now;
-        vg_assert(delta >= 0);
       }
       if (wicked)
         delta = 0;
    }
 
    /* gotta wake up for something! */
-   vg_assert(fd != -1 || delta != -1);
+   vg_assert(fd != -1 || delta != (UInt)-1);
 
    /* If we need to do signal routing, then poll for pending signals
       every VG_(clo_signal_polltime) mS */
-   if (VG_(do_signal_routing) && (delta > VG_(clo_signal_polltime) || delta == -1))
+   if (VG_(do_signal_routing) && (delta > VG_(clo_signal_polltime) || delta == (UInt)-1))
       delta = VG_(clo_signal_polltime);
    
    if (VG_(clo_trace_sched)) {
Comment 1 Jeremy Fitzhardinge 2004-03-06 01:13:29 UTC
This trouble with this is that poll() takes a signed integer timeout, and considers anything <0 to be infinite.  Is the problem that delta becomes negative, or that we should map all negative values to -1?
Comment 2 Peter Seiderer 2004-03-07 22:27:51 UTC
With 'tp->time' and 'now' defined as UInt and delta as Int the above
calculation 'delta = tp->time - now;' will produce false results if
'tp->time - now' exceeds INT_MAX. The suggested patch tries to avoid this by
changing the definition of delta from Int to UInt (and move the assertion
a little bit upwards).

If delta must be kept as Int the result of 'tp->time - now' must be checked
and/or limited not to exceed INT_MAX.

But the real (or a further) problem seems to be the handling of huge timeouts
given to pthread_cond_timedwait, which wrap around due to the limited
range of the internal time representation in 1/1000 seconds:

pthread_cond_timedwait with a timeout of now + 5 seconds calls
add_timeout with time = 5000

pthread_cond_timedwait with timeout of now + 4294968 seconds (49 days) calls
add_timeout with time = 704

See attached test2.c program:
gcc -Wall test2.c
time valgrind --tool=corecheck ./a.out 5
 real    0m5.196s
time valgrind --tool=corecheck ./a.out 4294968
 real    0m0.898s
Comment 3 Peter Seiderer 2004-03-07 22:29:39 UTC
Created attachment 5070 [details]
Test for timeout wraparound
Comment 4 Peter Seiderer 2004-03-07 23:13:38 UTC
Created attachment 5071 [details]
Suggested patch against timeout wraparound

Suggested patch to avoid timeout wraparound from pthread_cond_timedwait
(plus fix of wrong convertion from tv_usec to ms).
Comment 5 Peter Seiderer 2004-03-08 18:06:32 UTC
Created attachment 5086 [details]
New suggested patch for coregrind/vg_scheduler.c

Sorry, did not see the need for a signed delta because of the poll call while
doing the first suggested patch.

Here now a new, less invasive patch,limiting delta to INT_MAX and moving
the 'wake up' assert some lines down to give a overwrite
by clo_signal_polltime a chance.
Comment 6 Jeremy Fitzhardinge 2004-03-13 03:55:52 UTC
Looks good.  I've give it a closer look and probably apply it.
Comment 7 Peter Seiderer 2004-04-13 17:33:45 UTC
Created attachment 5620 [details]
Update: Suggested patch against timeout wraparound 

Fix bug in previous patch, the line
- ull_ms_now = 1000ULL * ((unsigned long long int)(ms_now));
should read
+ ull_ms_now = ((unsigned long long int)(ms_now));

Peter
Comment 8 Peter Seiderer 2004-09-27 21:08:56 UTC
Created attachment 7695 [details]
Updated patch against valgrind-2.2.0

Updated patch against valgrind-2.2.0 (both previous patches in one).
- Fix calculation of delta in vg_scheduler.c (the assertion).
- Fix timestamp calculation of for pthread_cond_timedwait in
  vg_libpthread.c (short 'real' timeouts for long given timeouts).
Comment 9 Tom Hughes 2004-10-17 17:19:16 UTC
This patch has now been committed, minus the piece that moved the assertion as I don't believe that to be correct or necessary. We should have something to wake up for regardless of whether we are having to route signals or not.
Comment 10 David Camp 2004-11-06 00:33:14 UTC
Hi Tom Hughes,

I was looking at your patch for the head branch for this bug.
 
Committed by Tom Hughes on 2004/10/17, 15:18:22 in branch HEAD
http://cvs-digest.org/index.php?diff&file=valgrind/coregrind/vg_scheduler.c,v&version=1.191

And there is a problem with valgrind/coregrind/vg_libpthread.c fix at "Line 1375 int __pthread_mutex_timedlock(pthread_mu". You need to set ull_ms_now and 
ull_ms_end before the if statement. Like you did at Line 1548.

So you just need to add the following.
ull_ms_now = ((unsigned long long int)(ms_now)); 
ull_ms_end = ull_ms_now + (ull_ms_end_after_1970 - ull_ms_now_after_1970); 


Thanks for the fix. It works great for me.

David
Comment 11 Tom Hughes 2004-11-25 09:32:28 UTC
Those lines were put in the next day when the overnight builds failed.