Bug 484480 - False positives when using sem_trywait
Summary: False positives when using sem_trywait
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: helgrind (show other bugs)
Version: 3.22.0
Platform: Ubuntu Linux
: NOR normal
Target Milestone: ---
Assignee: Paul Floyd
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-03-25 21:16 UTC by jmcgee01125
Modified: 2024-03-30 17:10 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description jmcgee01125 2024-03-25 21:16:29 UTC
SUMMARY
Using sem_trywait leads to false positives, even when it is guaranteed to have succeeded by the time that the access is made.

STEPS TO REPRODUCE
Compile the following code with `gcc -o test test.c -pthread`, then run with `valgrind --tool=helgrind ./test`

```
#include <stdio.h>
#include <semaphore.h>
#include <pthread.h>

static char* result;
static sem_t sem;

static void* func(void* data)
{
    result = "Finished!";
    sem_post(&sem);
    return NULL;
}

int main()
{
    sem_init(&sem, 0, 0);

    pthread_t tid;
    pthread_create(&tid, NULL, func, NULL);

    while (sem_trywait(&sem))
        ; // do nothing but keep retrying

    /* The above loop could be replaced this instead:
     * if (sem_trywait(&sem))
     *    sem_wait(&sem);
     */

    printf("%s\n", result);
}
```

OBSERVED RESULT
Helgrind reports the following:

==787273== Helgrind, a thread error detector
==787273== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.
==787273== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==787273== Command: ./test
==787273==
==787273== ---Thread-Announcement------------------------------------------
==787273==
==787273== Thread #1 is the program's root thread
==787273==
==787273== ---Thread-Announcement------------------------------------------
==787273==
==787273== Thread #2 was created
==787273==    at 0x4980F23: clone (in /usr/lib64/libc.so.6)
==787273==    by 0x4981CDE: __clone_internal (in /usr/lib64/libc.so.6)
==787273==    by 0x48FC834: create_thread (in /usr/lib64/libc.so.6)
==787273==    by 0x48FD339: pthread_create@@GLIBC_2.34 (in /usr/lib64/libc.so.6)
==787273==    by 0x4852D1C: pthread_create_WRK (hg_intercepts.c:445)
==787273==    by 0x4854239: pthread_create@* (hg_intercepts.c:478)
==787273==    by 0x4011C4: main (in /home/ugrads/majors/colinm25/test)
==787273==
==787273== ----------------------------------------------------------------
==787273==
==787273== Possible data race during read of size 8 at 0x404080 by thread #1
==787273== Locks held: none
==787273==    at 0x4011D4: main (in /home/ugrads/majors/colinm25/test)
==787273==
==787273== This conflicts with a previous write of size 8 by thread #2
==787273== Locks held: none
==787273==    at 0x401172: func (in /home/ugrads/majors/colinm25/test)
==787273==    by 0x4852F16: mythread_wrapper (hg_intercepts.c:406)
==787273==    by 0x48FCC01: start_thread (in /usr/lib64/libc.so.6)
==787273==    by 0x4980F33: clone (in /usr/lib64/libc.so.6)
==787273==  Address 0x404080 is 0 bytes inside data symbol "result"
==787273==
Finished!
==787273==
==787273== Use --history-level=approx or =none to gain increased speed, at
==787273== the cost of reduced accuracy of conflicting-access information
==787273== For lists of detected and suppressed errors, rerun with: -s
==787273== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

EXPECTED RESULT
Helgrind should not report any warnings.

SOFTWARE/OS VERSIONS
Tested on CentOS Stream 9 with Valgrind 3.22.0 and Ubuntu Server 20.04 with Valgrind 3.15.0.
Comment 1 Paul Floyd 2024-03-26 16:43:30 UTC
If I add a pthread_join then I get no error.
Comment 2 jmcgee01125 2024-03-26 17:05:49 UTC
pthread_join only silences the warning if done before reading the shared state. In that case, the semaphore becomes useless (since pthread_join becomes the synchronization point). If joined after printing, like when synchronizing between two long-running threads, then this warning appears.
Comment 3 Paul Floyd 2024-03-29 20:35:24 UTC
Comment in hg_intercepts.c

   Unhandled:
     int sem_trywait(sem_t *sem);
     int sem_timedwait(sem_t *restrict sem,
                       const struct timespec *restrict abs_timeout);

I looked at DRD, it just does the same thing as sem_wait so I tried this code on FreeBSD

__attribute__((noinline))
static int sem_trywait_WRK(sem_t* sem)
{
   OrigFn fn;
   int    ret;
   VALGRIND_GET_ORIG_FN(fn);

   if (TRACE_SEM_FNS) {
      fprintf(stderr, "<< sem_trywait(%p) ", sem);
      fflush(stderr);
   }

   DO_CREQ_v_W(_VG_USERREQ__HG_POSIX_SEM_WAIT_PRE, sem_t*,sem);

   CALL_FN_W_W(ret, fn, sem);

   DO_CREQ_v_WW(_VG_USERREQ__HG_POSIX_SEM_WAIT_POST, sem_t*,sem,
                long, (ret == 0) ? True : False);

   if (ret != 0) {
      DO_PthAPIerror( "sem_trywait", SEM_ERROR );
   }

   if (TRACE_SEM_FNS) {
      fprintf(stderr, " sem_trywait -> %d >>\n", ret);
      fflush(stderr);
   }

   return ret;
}

#if defined(VGO_freebsd)
   LIBC_FUNC(int, semZutrywait, sem_t* sem) { /* sem_trywait */
      return sem_trywait_WRK(sem);
   }
#endif

and it seems to work.

I'll clean the code up (and add sem_timedwait) sometime this weekend.
Comment 4 Paul Floyd 2024-03-30 17:10:45 UTC
commit 485cea48a38db8db608bdaff0c695ff50bbe16b5 (HEAD -> master, origin/users/paulf/try-bug484480, origin/master, origin/HEAD, bug484480)
Author: Paul Floyd <pjfloyd@wanadoo.fr>
Date:   Sat Mar 30 16:31:12 2024 +0100

    Bug 484480 - False positives when using sem_trywait