Bug 338633

Summary: gdbserver_tests/nlcontrolc.vgtest hangs on arm64
Product: [Developer tools] valgrind
Reporter: Mark Wielaard <mark>
Component: general
Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED
Severity: normal
CC: ivosh, philippe.waroquiers
Priority: NOR    
Version First Reported In: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Linux   

Description Mark Wielaard 2014-08-28 18:43:39 UTC
The gdbserver_tests/nlcontrolc.vgtest test hangs on arm64 Linux kernel setups for unknown reasons. The test is supposed to check that we can set the timeout in a select syscall to zero to make all threads fall out of their select loop. This works on other arches, but not on arm64. Only one thread (if any) falls out of its select loop, causing the testcase to never finish.
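
For readers unfamiliar with the pattern, here is a minimal sketch of a "sleeper" loop whose exit depends on a timeout being set to zero. It is not the sleepers program from the test suite: the names `sleeper`, `state` and `timeout_s` are hypothetical, and the timeout is changed from another thread rather than from gdb. It assumes a Unix-like system, where select accepts empty descriptor lists.

```python
import select
import threading

def sleeper(state, max_iterations=1000):
    """Loop on select() until the shared timeout is seen to be zero.

    Hypothetical stand-in for one sleeper thread; returns the number
    of select() calls it made.
    """
    iterations = 0
    while iterations < max_iterations:
        timeout = state["timeout_s"]  # re-read the timeout on every pass
        # No descriptors are watched, so select() only sleeps.
        select.select([], [], [], timeout)
        iterations += 1
        if timeout == 0:
            break  # a zero timeout makes the thread fall out of its loop
    return iterations

def run_and_release(n_threads=3):
    """Start several sleepers, then zero the timeout to release them all."""
    state = {"timeout_s": 0.01}
    threads = [threading.Thread(target=sleeper, args=(state,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    state["timeout_s"] = 0  # plays the role of the debugger's write
    for t in threads:
        t.join(timeout=5)
    return [t.is_alive() for t in threads]
```

In this sketch every thread re-reads the timeout itself, so all of them exit. The real test instead relies on what happens to a select call that is already in progress when the debugger stops and continues the program, which is the part that misbehaves on arm64.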
Comment 1 Mark Wielaard 2014-08-28 18:50:03 UTC
valgrind svn r14376 disables this test on arm64 for now:

diff --git a/gdbserver_tests/nlcontrolc.vgtest b/gdbserver_tests/nlcontrolc.vgtest
index 8ff8355..7cb9c2f 100644
--- a/gdbserver_tests/nlcontrolc.vgtest
+++ b/gdbserver_tests/nlcontrolc.vgtest
@@ -10,7 +10,8 @@ prog: sleepers
 args: 1000000000 1000000000 1000000000 BSBSBSBS
 vgopts: --tool=none --vgdb=yes --vgdb-error=0 --vgdb-prefix=./vgdb-prefix-nlcontrolc
 stderr_filter: filter_stderr
-prereq: test -e gdb -a -f vgdb.invoker
+# Bug 338633 nlcontrol hangs on arm64 currently.
+prereq: test -e gdb -a -f vgdb.invoker && ! ../tests/arch_test arm64
 progB: gdb
 argsB: --quiet -l 60 --nx ./sleepers
 stdinB: nlcontrolc.stdinB.gdb
Comment 2 Mark Wielaard 2014-08-28 18:54:35 UTC
Note that the same issue happens "natively". Running sleepers and then attaching to it with gdb and running the commands from nlcontrolc.stdinB.gdb by hand also doesn't work. So it isn't specific to the valgrind vgdb bridge.
Comment 3 Philippe Waroquiers 2014-08-28 19:07:41 UTC
* It is also not clear why changing the select timeout arg value works
  on other archs: it looks like the glibc code takes a copy of the arg
  to pass to the kernel.
  So, it is unclear why changing the timeout value in the 'user variable'
  causes the select syscall to return immediately when the syscall is restarted
  by the kernel after gdb continues the program.
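
The point about the copy can be illustrated with a toy model. This is not glibc's code: `Timeval`, `Timespec`, `to_timespec` and `poke_after_copy` are hypothetical names, and the sketch only assumes that a wrapper converts the caller's timeout into a separate object before handing it to the kernel.

```python
from dataclasses import dataclass

@dataclass
class Timeval:
    """Toy model of the caller's timeout (seconds, microseconds)."""
    tv_sec: int
    tv_usec: int

@dataclass
class Timespec:
    """Toy model of the copy handed to the kernel (seconds, nanoseconds)."""
    tv_sec: int
    tv_nsec: int

def to_timespec(tv):
    """Build an independent copy of the timeout, as a wrapper might."""
    return Timespec(tv.tv_sec, tv.tv_usec * 1000)

def poke_after_copy():
    """Zero the user variable after the copy was taken; return both values."""
    user_tv = Timeval(1000000000, 0)
    kernel_ts = to_timespec(user_tv)  # copy taken before the syscall blocks
    user_tv.tv_sec = 0                # later write to the user variable
    return user_tv, kernel_ts
```

If the syscall is restarted with a pointer to the copy, the later write to the user variable is not seen. If it is restarted with a pointer to the user variable itself, the new value is read again. Which of the two happens on a given architecture is exactly what this comment leaves open.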
Comment 4 Julian Seward 2014-08-30 10:05:58 UTC
I guess we should leave this open, until such time as it's fixed.  Yes?
Comment 5 Mark Wielaard 2014-08-30 10:34:38 UTC
Yes, please leave this open. I filed it so we don't forget even though we disabled the testcase for it (because the test would hang the whole testsuite).

Philippe had a theory that this could be caused by the fact that arm64-linux doesn't provide the traditional select system call, just the pselect system call, which might have slightly different semantics when interrupted.

See also this presentation, which lists some of the "modernization" of the syscall interface for arm64-linux:
http://people.linaro.org/~rikuvoipio/aarch64-talk/#/18
Comment 6 Philippe Waroquiers 2021-03-08 19:23:51 UTC
Fixed in c79180a3