Bug 463859

Summary: valgrind hangs and is totally unresponsive running a very "simple" Rust program that uses glommio
Product: [Developer tools] valgrind Reporter: vlovich
Component: general Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: normal CC: mark, pjfloyd
Priority: NOR    
Version: 3.19.0   
Target Milestone: ---   
Platform: Arch Linux   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Repro test case

Description vlovich 2023-01-05 04:02:51 UTC
Created attachment 155045
Repro test case

SUMMARY
I have a very "simple" program that uses the ChannelMesh functionality of glommio. When run under valgrind, the test code hangs.

STEPS TO REPRODUCE
1. Install rust
2. Decompress the file (it has a nested folder of the same name)
3. cargo build
4. valgrind target/debug/valgrind-deadlock

OBSERVED RESULT
Valgrind hangs pretty quickly and is totally unresponsive - ctrl-c doesn't work and I have to `pkill -9 -f valgrind`.
```
valgrind --fair-sched=yes target/debug/valgrind-deadlock
==852776== Memcheck, a memory error detector
==852776== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==852776== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==852776== Command: target/debug/valgrind-deadlock
==852776== 
Started thread 2
Started thread 1
^C^C^ZKilled
```

EXPECTED RESULT
The program completes successfully and valgrind is responsive (i.e. ctrl-c will work).

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
KDE Plasma Version: 5.26.5
KDE Frameworks Version: 5.101.0
Qt Version: 5.15.7

ADDITIONAL INFORMATION
Kernel: 6.0.8-arch1-1 (64-bit)
Processors: 12 x Intel i7-10750H CPU @ 2.6 GHz
Memory: 31.1 GiB of RAM

I've also tried with `--fair-sched=yes` and that has no effect. It looks like a thread is never getting started and thus the main thread is stuck waiting for it to start. We've not been able to find any documentation online that might explain this behavior / what's happening.
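
To illustrate the symptom described above, here is a minimal sketch of the "main thread waits for each worker to report that it started" pattern. It is not the attached repro and does not use glommio or io_uring: it uses only std threads and channels, and the name `start_workers` and the 5-second timeout are hypothetical choices made for this example. The timeout only keeps the sketch from blocking forever if a worker never reports in.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Spawn `n` workers and wait until each one reports that it has started.
/// Returns the worker ids in the order their start-up messages arrived,
/// or an error if a worker does not report within `timeout`.
/// (Hypothetical helper for illustration; not part of glommio or the repro.)
fn start_workers(n: usize, timeout: Duration) -> Result<Vec<usize>, String> {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::with_capacity(n);

    for id in 1..=n {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // Assumption: a real worker would set up its executor here
            // before announcing that it is running.
            println!("Started thread {id}");
            tx.send(id).expect("main thread dropped the receiver");
        }));
    }
    // Drop the original sender so only the workers hold one.
    drop(tx);

    let mut started = Vec::with_capacity(n);
    for _ in 0..n {
        // If a worker never reaches its send, this is where the main thread waits.
        match rx.recv_timeout(timeout) {
            Ok(id) => started.push(id),
            Err(e) => return Err(format!("worker did not start: {e}")),
        }
    }

    for handle in handles {
        handle.join().map_err(|_| "worker panicked".to_string())?;
    }
    Ok(started)
}

fn main() {
    match start_workers(2, Duration::from_secs(5)) {
        Ok(ids) => println!("all workers started: {ids:?}"),
        Err(e) => eprintln!("{e}"),
    }
}
```

With a plain blocking `recv()` in place of `recv_timeout`, a worker that never starts would leave the main thread waiting indefinitely, which matches what the output above looks like from the outside.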
Comment 1 vlovich 2023-01-05 04:03:08 UTC
glommio bug tracking this: https://github.com/DataDog/glommio/issues/582
Comment 2 vlovich 2023-01-05 04:08:00 UTC
Also what may be relevant about this framework is that each thread has a separate io_uring poll loop and that's how all the IPC socket comms between threads are happening. So I wonder if it's something to do with valgrind / io_uring interop somehow?
Comment 3 vlovich 2023-01-05 04:15:57 UTC
Also reproduces with the valgrind-git package (valgrind-git-3.20.0.r9.g0811a612d-1), which reports itself as Valgrind-3.21.0.GIT when run.
Comment 4 Mark Wielaard 2023-11-08 10:59:24 UTC
See also Bug #439226 Multiple io_urings per thread not supported
and Bug #428364 Signals inside io_uring_enter not handled
Comment 5 Paul Floyd 2023-11-18 12:17:39 UTC
The patch from https://bugs.kde.org/show_bug.cgi?id=428364 seems to also fix this.
When I try to reproduce, a single ctrl-c is enough to interrupt the process.