Bug 445607 - Unhandled amd64-freebsd syscall: 247
Summary: Unhandled amd64-freebsd syscall: 247
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: unspecified
Platform: FreeBSD Ports FreeBSD
Importance: NOR normal
Target Milestone: ---
Assignee: Paul Floyd
Reported: 2021-11-16 19:21 UTC by serpent7776
Modified: 2021-11-19 20:45 UTC

Attachments
swi-pl crash log when running under valgrind (3.28 KB, text/x-log)
2021-11-16 19:21 UTC, serpent7776
Running swi-pl under valgrind (with debug symbols) (5.22 KB, text/x-log)
2021-11-18 20:07 UTC, serpent7776
Running swipl under valgrind with track-origins enabled (7.68 KB, text/x-log)
2021-11-19 16:24 UTC, serpent7776
Running swipl under valgrind with debug tcmalloc (8.09 KB, text/x-log)
2021-11-19 20:19 UTC, serpent7776

Description serpent7776 2021-11-16 19:21:39 UTC
Created attachment 143640 [details]
swi-pl crash log when running under valgrind

SUMMARY
Running SWI-Prolog under valgrind on FreeBSD 12.2 results in an "unhandled syscall: 247" warning followed by a swi-pl crash.
Looking at /usr/src/sys/sys/syscall.h, syscall 247 appears to be SYS_clock_getcpuclockid2.
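
For illustration, here is a minimal sketch of the kind of call that can end up in this syscall. It assumes the route is the POSIX function clock_getcpuclockid(), which FreeBSD's libc is understood to build on top of clock_getcpuclockid2. Whether swipl reaches the syscall this way or through pthread_getcpuclockid() is not established in this report. The helper name read_process_cpu_clock is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: look up the CPU-time clock of the calling process
 * and read it once. Returns 0 on success or an errno value on failure. */
int read_process_cpu_clock(struct timespec *out)
{
    clockid_t cid;

    /* A pid of 0 means "the calling process". Unlike most libc calls,
     * clock_getcpuclockid() returns the error number directly. */
    int rc = clock_getcpuclockid(0, &cid);
    if (rc != 0)
        return rc;

    /* clock_gettime() follows the usual -1 plus errno convention. */
    if (clock_gettime(cid, out) != 0)
        return errno;

    return 0;
}
```

Calling this helper from a small test program run under valgrind should be enough to check whether the clock lookup is handled, without needing swipl itself.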


STEPS TO REPRODUCE
1. run `valgrind swipl` on FreeBSD

OBSERVED RESULT
Many "unhandled amd64-freebsd syscall: 247" warnings are printed, followed by a swi-pl segfault (see the attached log file).

EXPECTED RESULT
No warnings, swi-pl should work normally under valgrind.

SOFTWARE/OS VERSIONS
FreeBSD DaemONX 12.2-RELEASE-p7 FreeBSD 12.2-RELEASE-p7 GENERIC  amd64
valgrind-3.18.1
SWI-Prolog version 8.2.3 for amd64-freebsd

ADDITIONAL INFORMATION
Comment 1 Paul Floyd 2021-11-17 09:25:34 UTC
Looking at syscalls.master the missing syscall wrapper seems fairly straightforward

247	AUE_NULL	STD {
		int clock_getcpuclockid2(
		    id_t id,
		    int which,
		    _Out_ clockid_t *clock_id
		);
	}

The crash may be unrelated.
Comment 2 Paul Floyd 2021-11-17 13:13:12 UTC
I've done a super quick implementation of clock_getcpuclockid2 (it probably needs a separate i386 version because the first parameter is 64 bits wide).
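
As a rough illustration of why a 32-bit variant would differ: on a 32-bit ABI a 64-bit id cannot travel in one argument word, so it arrives as two 32-bit halves that have to be recombined. The sketch below assumes the low word comes first, as on little-endian i386. The type and function names are hypothetical and are not Valgrind's own.

```c
#include <stdint.h>

/* Hypothetical pair of 32-bit argument words carrying one 64-bit id. */
typedef struct {
    uint32_t lo;  /* low 32 bits, assumed to be passed first */
    uint32_t hi;  /* high 32 bits */
} arg_pair32;

/* Split a 64-bit id into the two words a 32-bit caller would pass. */
arg_pair32 split_id64(uint64_t id)
{
    arg_pair32 p;
    p.lo = (uint32_t)(id & 0xFFFFFFFFu);
    p.hi = (uint32_t)(id >> 32);
    return p;
}

/* Recombine the two words into the 64-bit id the kernel expects. */
uint64_t merge_id64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}
```

On amd64 the id fits in a single register, so no such merge step is needed there.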

I now get

paulf> ./vg-in-place --soname-synonyms=somalloc=libtcmalloc_minimal.so.4 swipl
==4514== Memcheck, a memory error detector
==4514== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4514== Using Valgrind-3.19.0.GIT and LibVEX; rerun with -h for copyright info
==4514== Command: swipl
==4514== 
#Welcome to SWI-Prolog (threaded, 64 bits, version 8.2.3)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

==4514== brk segment overflow in thread #2: can't grow to 0x4820000
==4514== (see section Limitations in user manual)
==4514== NOTE: further instances of this message will not be shown
==4514== Thread 2 gc:
==4514== Invalid write of size 8
==4514==    at 0x4877C05: tcmalloc::ThreadCache::Init(pthread*) (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==4514==    by 0x4878C4D: tcmalloc::ThreadCache::NewHeap(pthread*) (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==4514==    by 0x4878A74: tcmalloc::ThreadCache::CreateCacheIfNecessary() (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==4514==    by 0x486E115: TCMallocImplementation::MarkThreadBusy() (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==4514==    by 0x4AA26F4: ??? (in /usr/local/lib/swipl/lib/amd64-freebsd/libswipl.so.8.2.3)
==4514==    by 0x4AB8EC2: PL_next_solution (in /usr/local/lib/swipl/lib/amd64-freebsd/libswipl.so.8.2.3)
==4514==    by 0x4AD08DB: PL_call_predicate (in /usr/local/lib/swipl/lib/amd64-freebsd/libswipl.so.8.2.3)
==4514==    by 0x4B5C7BC: ??? (in /usr/local/lib/swipl/lib/amd64-freebsd/libswipl.so.8.2.3)
==4514==    by 0x529082A: ??? (in /lib/libthr.so.3)
==4514==  Address 0x4021000 is not stack'd, malloc'd or (recently) free'd
==4514== 
==4514== 
==4514== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==4514==  Access not within mapped region at address 0x4021000
==4514==    at 0x4877C05: tcmalloc::ThreadCache::Init(pthread*) (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==4514==    by 0x4878C4D: tcmalloc::ThreadCache::NewHeap(pthread*) (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==4514==    by 0x4878A74: tcmalloc::ThreadCache::CreateCacheIfNecessary() (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==4514==    by 0x486E115: TCMallocImplementation::MarkThreadBusy() (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==4514==    by 0x4AA26F4: ??? (in /usr/local/lib/swipl/lib/amd64-freebsd/libswipl.so.8.2.3)
==4514==    by 0x4AB8EC2: PL_next_solution (in /usr/local/lib/swipl/lib/amd64-freebsd/libswipl.so.8.2.3)
==4514==    by 0x4AD08DB: PL_call_predicate (in /usr/local/lib/swipl/lib/amd64-freebsd/libswipl.so.8.2.3)
==4514==    by 0x4B5C7BC: ??? (in /usr/local/lib/swipl/lib/amd64-freebsd/libswipl.so.8.2.3)
==4514==    by 0x529082A: ??? (in /lib/libthr.so.3)
==4514==  If you believe this happened as a result of a stack
==4514==  overflow in your program's main thread (unlikely but
==4514==  possible), you can try to increase the size of the
==4514==  main thread stack using the --main-stacksize= flag.
==4514==  The main thread stack size used in this run was 16777216.
Comment 3 Paul Floyd 2021-11-17 21:06:29 UTC
If I increase the brk segment from 8M to 1G that part of the message goes away, but the other parts remain.

The diff to do that is

diff --git a/coregrind/m_initimg/initimg-freebsd.c b/coregrind/m_initimg/initimg-freebsd.c
index d19186a42..59c2f4f85 100644
--- a/coregrind/m_initimg/initimg-freebsd.c
+++ b/coregrind/m_initimg/initimg-freebsd.c
@@ -891,7 +891,7 @@ IIFinaliseImageInfo VG_(ii_create_image)( IICreateImageInfo iicii,
    //--------------------------------------------------------------
    {
       SizeT m1 = 1024 * 1024;
-      SizeT m8 = 8 * m1;
+      SizeT m8 = 8 * m1 * 128;
       SizeT dseg_max_size = (SizeT)VG_(client_rlimit_data).rlim_cur;
       VG_(debugLog)(1, "initimg", "Setup client data (brk) segment\n");
       if (dseg_max_size < m1) dseg_max_size = m1;

Note that questions about this arise frequently. The size has to be hard-coded because it is set very early in tool startup, before it is possible to parse command line arguments or read environment variables.
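
The arithmetic behind the diff can be sketched as follows: 8 * 1 MiB * 128 is 1024 MiB, i.e. 1 GiB. The sketch assumes the size is clamped between 1 MiB and the hard-coded cap. The lower clamp is visible in the diff context, but the upper clamp is an assumption, since that line is not shown. The function name clamp_brk_size is hypothetical.

```c
#include <stddef.h>

#define MIB ((size_t)1024 * 1024)

/* Hypothetical helper: derive the client brk segment size from the data
 * rlimit, clamped to the range [1 MiB, cap]. */
size_t clamp_brk_size(size_t rlim_cur, size_t cap)
{
    size_t size = rlim_cur;

    if (size < MIB)
        size = MIB;   /* lower bound, as in the diff context */
    if (size > cap)
        size = cap;   /* upper bound: assumed, not shown in the diff */

    return size;
}
```

With an unlimited data rlimit, a cap of 8 * MIB gives the old 8 MiB segment and a cap of 8 * MIB * 128 gives the 1 GiB segment used in the diff.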
Comment 4 Paul Floyd 2021-11-17 22:33:35 UTC
I've pushed the change to resolve the missing syscall wrapper.

Can you clone and build Valgrind (with the diff in my previous message) and also build debug versions of swi-pl and jemalloc?

I'll also try to get in touch with the swi-pl maintainer. I'll leave this item open for the moment.
Comment 5 serpent7776 2021-11-18 19:27:46 UTC
OK, but how can I build a debug version of jemalloc and then use it? As I understand it, it is FreeBSD's malloc? Do I run make in /usr/src/lib/libc/stdlib/jemalloc?
Comment 6 Paul Floyd 2021-11-18 19:36:51 UTC
My mistake: /usr/local/lib/libtcmalloc_minimal.so.4.5.9 is Google perftools tcmalloc, see https://www.freshports.org/devel/google-perftools/

Try getting a debug build of swi-pl first. If there is a problem it is more likely to be there than in the memory allocator.
Comment 7 serpent7776 2021-11-18 20:07:05 UTC
Created attachment 143711 [details]
Running swi-pl under valgrind (with debug symbols)
Comment 8 serpent7776 2021-11-18 20:09:25 UTC
I did a build in a jail to not mess up my system, hope this isn't an issue.
I built valgrind at f13667b1eff8d3d06590683b9981ced611bd3c69 + brk change and debug build of swi-pl 8.2.3.
I've attached the log from running swi-pl under valgrind.
Comment 9 Paul Floyd 2021-11-18 21:18:12 UTC
Does --track-origins=yes show anything useful?

What is happening here, in pl_thread_idle2_va (src/pl-alloc.c:1899)?

==59737== Thread 2 gc:
==59737== Conditional jump or move depends on uninitialised value(s)
==59737==    at 0x4077122: TCMallocImplementation::MarkThreadBusy() (in /usr/local/lib/libtcmalloc_minimal.so.4.5.9)
==59737==    by 0x42B68D2: pl_thread_idle2_va (src/pl-alloc.c:1899)
==59737==    by 0x42C86D9: PL_next_solution (src/pl-vmi.c:3839)
==59737==    by 0x42E0AD1: PL_call_predicate (src/pl-fli.c:4145)
==59737==    by 0x439B2C7: GCmain (src/pl-thread.c:5527)
==59737==    by 0x4AAAFAB: ??? (in /basejail/lib/libthr.so.3)

From what I can see online the swi-pl code is doing this

In initTCMalloc

  fMallocExtension_MarkThreadBusy =
    PL_dlsym(NULL, "MallocExtension_MarkThreadBusy");


In PRED_IMPL("thread_idle", 2, thread_idle, PL_FA_TRANSPARENT)

  if ( fMallocExtension_MarkThreadBusy )
    fMallocExtension_MarkThreadBusy();

In tcmalloc

class TCMallocImplementation : public MallocExtension {

  virtual void MarkThreadBusy();  // Implemented below
...

void TCMallocImplementation::MarkThreadBusy() {
  // Allocate to force the creation of a thread cache, but avoid
  // invoking any hooks.
  do_free(do_malloc(0));
}

Calling a C++ virtual function through a C function pointer isn't safe. In this case the virtual function doesn't seem to use 'this'. do_free and do_malloc are inlined, so it's hard to see exactly what is going on in MarkThreadBusy.

But I wonder if tcmalloc needs some global initialization before these calls. Not sure how to debug that.
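
The look-up-then-call pattern quoted above can be sketched in plain C as follows. This is only an approximation: it uses the POSIX dlopen()/dlsym() calls rather than SWI-Prolog's PL_dlsym, and the helper name call_optional_hook is hypothetical. It checks only that the symbol exists. It does not show whether the allocator behind it has been initialised.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef void (*hook_fn)(void);

/* Hypothetical helper: look up `name` among the symbols already loaded
 * into the process and call it if present.
 * Returns 1 if the hook was called, 0 if it was not found. */
int call_optional_hook(const char *name)
{
    /* A NULL path yields a handle for the main program and its
     * already-loaded dependencies. */
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL)
        return 0;

    hook_fn fn;
    /* POSIX-suggested idiom for converting the void * from dlsym()
     * into a function pointer. */
    *(void **)(&fn) = dlsym(self, name);

    int called = 0;
    if (fn != NULL) {
        fn();
        called = 1;
    }

    dlclose(self);
    return called;
}
```

For example, call_optional_hook("MallocExtension_MarkThreadBusy") would invoke the hook only when a library exporting that symbol is loaded. Older glibc versions may need -ldl at link time.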
Comment 10 Paul Floyd 2021-11-19 07:48:21 UTC
I'm closing this as the unhandled syscall has been added.

I'll continue to investigate the swi-pl problem when running under Valgrind and will open a new item if necessary.
Comment 11 serpent7776 2021-11-19 16:24:36 UTC
Created attachment 143736 [details]
Running swipl under valgrind with track-origins enabled
Comment 12 serpent7776 2021-11-19 16:25:24 UTC
I attached a log from a run with --track-origins=yes added.
Comment 13 serpent7776 2021-11-19 20:19:11 UTC
Created attachment 143745 [details]
Running swipl under valgrind with debug tcmalloc
Comment 14 serpent7776 2021-11-19 20:20:26 UTC
I attached a log from a run with the debug tcmalloc version.
Is using sbrk normal? Isn't this a legacy thing?
Comment 15 Paul Floyd 2021-11-19 20:45:38 UTC
Please could you move the discussion here: https://github.com/paulfloyd/freebsd_valgrind/issues/174