Bug 368529 - Android arm target link error, missing atexit and pthread_atfork
Summary: Android arm target link error, missing atexit and pthread_atfork
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.11.0
Platform: Android Other
: NOR grave
Target Milestone: ---
Assignee: Ivo Raisr
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-10 01:02 UTC by chh
Modified: 2017-06-25 15:09 UTC (History)
5 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
diff to m_main.c (1.38 KB, patch)
2017-04-19 16:53 UTC, chh
Details
patch for https://bugs.kde.org/show_bug.cgi?id=368529#c8 (1.37 KB, patch)
2017-05-15 20:33 UTC, Elliott Hughes
Details

Note You need to log in before you can comment on or make changes to this bug.
Description chh 2016-09-10 01:02:05 UTC
When creating a statically linked valgrind for Android arm target
and compiled by clang/llvm compiler, the atexit and pthread_atfork
functions are used indirectly but undefined in Android libc.a.
The link failed with undefined message like:

external/jemalloc/src/jemalloc.c:1343: error: undefined reference to 'pthread_atfork'
external/jemalloc/src/jemalloc.c:1279: error: undefined reference to 'atexit'


The following diff should fixed the problem for Android arm target:

diff --git a/coregrind/m_main.c b/coregrind/m_main.c
index 140efbf..634822e 100644
--- a/coregrind/m_main.c
+++ b/coregrind/m_main.c
@@ -3998,6 +3998,15 @@ UWord voucher_mach_msg_set ( UWord arg1 )
 #endif
 
 
+/*====================================================================*/
+/*=== Dummy atexit and pthread_atfork for Android clang/llvm, arm  ===*/
+/*====================================================================*/
+
+#if defined(ANDROID) && defined(__clang__) && defined(__arm__)
+__attribute__((weak)) void atexit() {}
+__attribute__((weak)) void pthread_atfork() {}
+#endif
+
 /*--------------------------------------------------------------------*/
 /*--- end                                                          ---*/
 /*--------------------------------------------------------------------*/




Reproducible: Always

Steps to Reproduce:
Build Android's external/valgrind project for aosp_arm-eng target.


Actual Results:  
link error:
external/jemalloc/src/jemalloc.c:1343: error: undefined reference to 'pthread_atfork'
external/jemalloc/src/jemalloc.c:1279: error: undefined reference to 'atexit'


Expected Results:  
successful link
Comment 1 chh 2016-09-29 18:07:16 UTC
The undefined symbols were due to the use of builtin functions
like __aeabi_memcpy* and __aeabi_memclr* from clang/llvm compiler.
Those symbols pulled in other modules in Android libc.a and then
require atexit and pthread_atfork, which are not defined in libc.a.

A better solution is to define those __aeabi_* functions in m_main.c,
as it already defines memcpy to VG_(memcpy) and memset to VG_(memset).
The following is the suggested diff.

diff --git a/coregrind/m_main.c b/coregrind/m_main.c
index 140efbf..66f04df 100644
--- a/coregrind/m_main.c
+++ b/coregrind/m_main.c
@@ -2883,6 +2883,46 @@ void __aeabi_unwind_cpp_pr1(void){
    VG_(printf)("Something called __aeabi_unwind_cpp_pr1()\n");
    vg_assert(0);
 }
+
+#if defined(ANDROID) && defined(__clang__)
+/* Replace __aeabi_memcpy* functions with vgPlain_memcpy. */
+void* __aeabi_memcpy(void *dest, const void *src, SizeT n);
+void* __aeabi_memcpy(void *dest, const void *src, SizeT n)
+{
+    return VG_(memcpy)(dest, src, n);
+}
+
+void* __aeabi_memcpy4(void *dest, const void *src, SizeT n);
+void* __aeabi_memcpy4(void *dest, const void *src, SizeT n)
+{
+    return VG_(memcpy)(dest, src, n);
+}
+
+void* __aeabi_memcpy8(void *dest, const void *src, SizeT n);
+void* __aeabi_memcpy8(void *dest, const void *src, SizeT n)
+{
+    return VG_(memcpy)(dest, src, n);
+}
+
+/* Replace __aeabi_memclr* functions with vgPlain_memset. */
+void* __aeabi_memclr(void *dest, SizeT n);
+void* __aeabi_memclr(void *dest, SizeT n)
+{
+    return VG_(memset)(dest, 0, n);
+}
+
+void* __aeabi_memclr4(void *dest, SizeT n);
+void* __aeabi_memclr4(void *dest, SizeT n)
+{
+    return VG_(memset)(dest, 0, n);
+}
+
+void* __aeabi_memclr8(void *dest, SizeT n);
+void* __aeabi_memclr8(void *dest, SizeT n)
+{
+    return VG_(memset)(dest, 0, n);
+}
+#endif /* ANDROID __clang__ */
 #endif
 
 /* ---------------- Requirement 2 ---------------- */
Comment 2 chh 2017-04-18 18:12:18 UTC
Julian, this was fixed for Android in
https://android-review.git.corp.google.com/#/c/272640
Could you review and take that patch?
Thanks.
Comment 3 Mark Wielaard 2017-04-19 12:04:55 UTC
(In reply to chh from comment #2)
> Julian, this was fixed for Android in
> https://android-review.git.corp.google.com/#/c/272640
> Could you review and take that patch?

Please always attach proposed patches to bugzilla.
That makes it much easier to review and apply them.

Pointing to some external project fork repository/review system makes it hard to know what is intended for the upstream project and what is specific to the downstream project.

Also the above URL seems to be to some internal google project.
Comment 4 chh 2017-04-19 16:53:24 UTC
Created attachment 105101 [details]
diff to m_main.c

This is from https://android-review.googlesource.com/#/c/272640/
Comment 5 Elliott Hughes 2017-05-12 23:03:08 UTC
this patch is still necessary with 3.12 (we updated AOSP recently). it's one of only four issues building valgrind 3.12 out of the box for Android. worth taking for 3.13?
Comment 6 Ivo Raisr 2017-05-13 05:39:59 UTC
I will have a look.
Comment 7 Ivo Raisr 2017-05-13 19:53:41 UTC
I am a little worried about this condition in the patch:

#if defined(ANDROID) && defined(__clang__)

Nowhere in Valgrind sources we currently base a decision on naked "ANDROID".
It is always this combo (from vki-linux.h):

#if defined(VGPV_arm_linux_android) || defined(VGPV_x86_linux_android) \
    || defined(VGPV_mips32_linux_android) \
    || defined(VGPV_arm64_linux_android)
    ...
#endif /* defined(VGPV_*_linux_android) */

If this patch is really arm-android specific, then it should fold inside existing
#if defined(VGP_arm_linux) [at line 2424]
...
#endif [at line 2442]

with a guard such as:

#if defined(VGPV_arm_linux_android)
...
#endif

No need to "defined(__clang__)".


Let me know what is the case here.
Please eventually modify the patch and test it on arm.
Comment 8 Elliott Hughes 2017-05-15 20:32:13 UTC
(In reply to Ivo Raisr from comment #7)
> I am a little worried about this condition in the patch:
> 
> #if defined(ANDROID) && defined(__clang__)
> 
> Nowhere in Valgrind sources we currently base a decision on naked "ANDROID".
> It is always this combo (from vki-linux.h):
> 
> #if defined(VGPV_arm_linux_android) || defined(VGPV_x86_linux_android) \
>     || defined(VGPV_mips32_linux_android) \
>     || defined(VGPV_arm64_linux_android)
>     ...
> #endif /* defined(VGPV_*_linux_android) */

that idiom is kind of insane in most cases. it also leads to bugs like https://bugs.kde.org/show_bug.cgi?id=339945 or https://bugs.kde.org/show_bug.cgi?id=379764 where you have to manually maintain these lists. or other bugs still to be discovered:

#  if defined(VGPV_arm_linux_android) \
      || defined(VGPV_x86_linux_android) \
      || defined(VGPV_mips32_linux_android) \
      || defined(VGPV_arm64_linux_android)
   const HChar*  default_interp_name = "/system/bin/sh";
#  else
   const HChar*  default_interp_name = "/bin/sh";
#  endif

is missing amd64, for example. just using __ANDROID__ would have avoided this.

ah, and looking at more of these i think i'm finding the causes of other outstanding valgrind bugs on x86-64 Android...

i think you generally want to use __BIONIC__ or __ANDROID__ instead, in the same way you use __GLIBC__ rather than individually listing each possible glibc/arch combination.

that said, getting back to this bug...

> If this patch is really arm-android specific,

...yes, this patch is arm-specific, but it's already inside #if defined(VGP_arm_linux).

> then it should fold inside
> existing
> #if defined(VGP_arm_linux) [at line 2424]
> ...
> #endif [at line 2442]
> 
> with a guard such as:
> 
> #if defined(VGPV_arm_linux_android)
> ...
> #endif
> 
> No need to "defined(__clang__)".
> 
> 
> Let me know what is the case here.
> Please eventually modify the patch and test it on arm.

i'll attach the smallest patch that works.
Comment 9 Elliott Hughes 2017-05-15 20:33:11 UTC
Created attachment 105569 [details]
patch for https://bugs.kde.org/show_bug.cgi?id=368529#c8
Comment 10 Elliott Hughes 2017-05-15 21:38:44 UTC
yeah, looking through the current clusters of #if..._android_linux, they're all either wrong or the subject of an outstanding patch :-/

i've raised https://bugs.kde.org/show_bug.cgi?id=379878 for that more general issue, to avoid derailing this bug.
Comment 11 Ivo Raisr 2017-05-16 08:51:15 UTC
Committed in Valgrind SVN as r16384.
Thank you for the patch.
Comment 12 Julian Seward 2017-05-16 09:10:37 UTC
(In reply to Elliott Hughes from comment #8)
> (In reply to Ivo Raisr from comment #7)
> > I am a little worried about this condition in the patch:
> > 
> > #if defined(ANDROID) && defined(__clang__)
> > 
> > Nowhere in Valgrind sources we currently base a decision on naked "ANDROID".
> > It is always this combo (from vki-linux.h):
> > 
> > #if defined(VGPV_arm_linux_android) || defined(VGPV_x86_linux_android) \
> >     || defined(VGPV_mips32_linux_android) \
> >     || defined(VGPV_arm64_linux_android)
> >     ...
> > #endif /* defined(VGPV_*_linux_android) */
> 
> that idiom is kind of insane in most cases. it also leads to bugs like
> https://bugs.kde.org/show_bug.cgi?id=339945 or
> https://bugs.kde.org/show_bug.cgi?id=379764 where you have to manually
> maintain these lists.

No, I object.  The platform conditionalisation is like it is for a good
reason, which is that ad-hoc ifdeffery, which this __ANDROID__ define is,
eventually leads to a big mess which is hard to modify and reason about.
We have to keep the system maintainable on a bunch of targets, not just
Android.

Please use the existing system where you can.  We want to continue to
support Android well, but that can't be at the expense of getting into
a confusing maze of ifdefs.
Comment 13 Julian Seward 2017-05-16 09:31:26 UTC
Redone, using the house conditionalisation scheme, in r16386.  Please test.
Comment 14 Elliott Hughes 2017-06-17 23:14:13 UTC
> Redone, using the house conditionalisation scheme, in r16386.  Please test.

(sorry, didn't have time to look until now.)

as i explained already, that's wrong.

anywhere you have this idiom in the code and it doesn't include *all* the Android variants -- which right now is at least 30 places, almost everywhere you have this idiom -- it's wrong.

why not just say __ANDROID__ for OS-details (like /system/bin/sh versus /bin/sh) or __BIONIC__ if it's a libc-specific thing?

you already make use of __GLIBC__ and __UCLIBC__ and [a home-grown] MUSL_LIBC, but won't use __BIONIC__? that doesn't make any sense.

(see comment 8 on this bug or https://bugs.kde.org/show_bug.cgi?id=379878 if you want more examples of bugs caused by this idiom.)
Comment 15 Elliott Hughes 2017-06-24 06:22:27 UTC
fwiw, here's the current diff to be able to build valgrind 3.13.0 for Android (arm,aarch64,x86-64). (x86 is useless because valgrind doesn't support SSE2 for x86 so you can't get as far as calling main, but these are the only fixes you need to support x86-64, so you might want to update your list of supported platforms in the release notes.)

https://android-review.googlesource.com/420323/

Only in /huge-ssd/aosp-arm64/external/valgrind/: android
Only in /huge-ssd/aosp-arm64/external/valgrind/: Android.build_all.mk
Only in /huge-ssd/aosp-arm64/external/valgrind/: Android.build_host.mk
Only in /huge-ssd/aosp-arm64/external/valgrind/: Android.build_one.mk
Only in /huge-ssd/aosp-arm64/external/valgrind/: Android.clean.mk
Only in /huge-ssd/aosp-arm64/external/valgrind/: Android.mk
Only in /huge-ssd/aosp-arm64/external/valgrind/: ANDROID_PATCH_AGAINST_UPSTREAM.txt
Only in /huge-ssd/aosp-arm64/external/valgrind/: Android.test.mk
diff '--exclude=.git' -ru valgrind-3.13.0/config.h /huge-ssd/aosp-arm64/external/valgrind/config.h
--- valgrind-3.13.0/config.h	2017-06-21 14:11:07.177545261 -0700
+++ /huge-ssd/aosp-arm64/external/valgrind/config.h	2017-06-21 14:07:44.786099941 -0700
@@ -45,10 +45,14 @@
 
 /* Define to 1 if index() and strlen() have been optimized heavily (x86 glibc
    >= 2.12) */
+#ifndef __ANDROID__
 #define GLIBC_MANDATORY_INDEX_AND_STRLEN_REDIRECT 1
+#endif
 
 /* Define to 1 if strlen() has been optimized heavily (amd64 glibc >= 2.10) */
+#ifndef __ANDROID__
 #define GLIBC_MANDATORY_STRLEN_REDIRECT 1
+#endif
 
 /* Define to 1 if you have the <asm/unistd.h> header file. */
 #define HAVE_ASM_UNISTD_H 1
@@ -86,13 +90,15 @@
 #define HAVE_CLOCK_MONOTONIC 1
 
 /* Define to 1 if you have a dlinfo that can do RTLD_DI_TLS_MODID. */
+#ifndef __ANDROID__
 #define HAVE_DLINFO_RTLD_DI_TLS_MODID 1
+#endif
 
 /* Define to 1 if the system has the type `Elf32_Chdr'. */
-/* #undef HAVE_ELF32_CHDR */
+//#define HAVE_ELF32_CHDR 1
 
 /* Define to 1 if the system has the type `Elf64_Chdr'. */
-/* #undef HAVE_ELF64_CHDR */
+//#define HAVE_ELF64_CHDR 1
 
 /* Define to 1 if you have the <endian.h> header file. */
 #define HAVE_ENDIAN_H 1
@@ -170,7 +176,9 @@
 /* #undef HAVE_PTHREAD_CREATE_GLIBC_2_0 */
 
 /* Define to 1 if you have the `PTHREAD_MUTEX_ADAPTIVE_NP' constant. */
+#ifndef __ANDROID__
 #define HAVE_PTHREAD_MUTEX_ADAPTIVE_NP 1
+#endif
 
 /* Define to 1 if you have the `PTHREAD_MUTEX_ERRORCHECK_NP' constant. */
 #define HAVE_PTHREAD_MUTEX_ERRORCHECK_NP 1
@@ -182,7 +190,9 @@
 #define HAVE_PTHREAD_MUTEX_TIMEDLOCK 1
 
 /* Define to 1 if pthread_mutex_t has a member __data.__kind. */
+#ifndef __ANDROID__
 #define HAVE_PTHREAD_MUTEX_T__DATA__KIND 1
+#endif
 
 /* Define to 1 if pthread_mutex_t has a member called __m_kind. */
 /* #undef HAVE_PTHREAD_MUTEX_T__M_KIND */
@@ -219,7 +229,9 @@
 #define HAVE_SEMTIMEDOP 1
 
 /* Define to 1 if libstd++ supports annotating shared pointers */
+#ifndef __ANDROID__
 #define HAVE_SHARED_POINTER_ANNOTATION 1
+#endif
 
 /* Define to 1 if you have the `signalfd' function. */
 #define HAVE_SIGNALFD 1
@@ -456,7 +468,11 @@
 #define VERSION "3.13.0"
 
 /* Temporary files directory */
+#ifdef __ANDROID__
+#define VG_TMPDIR "/data/local/tmp"
+#else
 #define VG_TMPDIR "/tmp"
+#endif
 
 /* Define to `int' if <sys/types.h> doesn't define. */
 /* #undef gid_t */
diff '--exclude=.git' -ru valgrind-3.13.0/coregrind/m_coredump/coredump-elf.c /huge-ssd/aosp-arm64/external/valgrind/coregrind/m_coredump/coredump-elf.c
--- valgrind-3.13.0/coregrind/m_coredump/coredump-elf.c	2017-05-31 08:14:48.000000000 -0700
+++ /huge-ssd/aosp-arm64/external/valgrind/coregrind/m_coredump/coredump-elf.c	2017-06-21 14:08:45.497933443 -0700
@@ -135,6 +135,7 @@
    phdr->p_align = VKI_PAGE_SIZE;
 }
 
+#if 0 /* We've had Elf32_Nhdr since at least froyo! */
 #if defined(VGPV_arm_linux_android) || defined(VGPV_x86_linux_android) \
     || defined(VGPV_mips32_linux_android)
 /* Android's libc doesn't provide a definition for this.  Hence: */
@@ -146,6 +147,7 @@
    }
    Elf32_Nhdr;
 #endif
+#endif
 
 struct note {
    struct note *next;
diff '--exclude=.git' -ru valgrind-3.13.0/coregrind/vgdb.c /huge-ssd/aosp-arm64/external/valgrind/coregrind/vgdb.c
--- valgrind-3.13.0/coregrind/vgdb.c	2017-05-31 08:14:29.000000000 -0700
+++ /huge-ssd/aosp-arm64/external/valgrind/coregrind/vgdb.c	2017-06-21 14:17:48.668450889 -0700
@@ -682,10 +682,7 @@
       sigpipe++;
    } else if (signum == SIGALRM) {
       sigalrm++;
-#if defined(VGPV_arm_linux_android) \
-    || defined(VGPV_x86_linux_android) \
-    || defined(VGPV_mips32_linux_android) \
-    || defined(VGPV_arm64_linux_android)
+#if defined(__BIONIC__)
       /* Android has no pthread_cancel. As it also does not have
          an invoker implementation, there is no need for cleanup action.
          So, we just do nothing. */
diff '--exclude=.git' -ru valgrind-3.13.0/coregrind/vg_preloaded.c /huge-ssd/aosp-arm64/external/valgrind/coregrind/vg_preloaded.c
--- valgrind-3.13.0/coregrind/vg_preloaded.c	2017-05-31 08:14:39.000000000 -0700
+++ /huge-ssd/aosp-arm64/external/valgrind/coregrind/vg_preloaded.c	2017-06-21 14:21:53.515782606 -0700
@@ -58,10 +58,11 @@
 void VG_NOTIFY_ON_LOAD(freeres)(Vg_FreeresToRun to_run)
 {
 #  if !defined(__UCLIBC__) && !defined(MUSL_LIBC) \
+      && !defined(VGPV_amd64_linux_android) \
       && !defined(VGPV_arm_linux_android) \
       && !defined(VGPV_x86_linux_android) \
       && !defined(VGPV_mips32_linux_android) \
-      && !defined(VGPV_arm64_linux_android)
+      && !defined(VGPV_arm64_linux_android) \
 
    /* g++ mangled __gnu_cxx::__freeres yields -> _ZN9__gnu_cxx9__freeresEv */
    extern void _ZN9__gnu_cxx9__freeresEv(void) __attribute__((weak));
diff '--exclude=.git' -ru valgrind-3.13.0/include/pub_tool_libcsetjmp.h /huge-ssd/aosp-arm64/external/valgrind/include/pub_tool_libcsetjmp.h
--- valgrind-3.13.0/include/pub_tool_libcsetjmp.h	2017-05-31 08:14:14.000000000 -0700
+++ /huge-ssd/aosp-arm64/external/valgrind/include/pub_tool_libcsetjmp.h	2017-06-21 14:27:04.766932185 -0700
@@ -128,6 +128,14 @@
 __attribute__((noreturn))
 void  VG_MINIMAL_LONGJMP(VG_MINIMAL_JMP_BUF(_env));
 
+#elif defined(VGPV_arm64_linux_android)
+
+/* Android clang/llvm has no __builtin_{setjmp,longjmp} for aarch64. */
+/* Use the same setjmp/longjmp functions for both gcc and clang.     */
+#define VG_MINIMAL_JMP_BUF(_name) jmp_buf _name
+#define VG_MINIMAL_SETJMP(_env)   ((UWord)(setjmp((_env))))
+#define VG_MINIMAL_LONGJMP(_env)  longjmp((_env),1)
+
 #else
 
 /* The default implementation. */
Only in /huge-ssd/aosp-arm64/external/valgrind/: runtests-arm64.sh
Only in /huge-ssd/aosp-arm64/external/valgrind/: runtests-arm.sh
Only in /huge-ssd/aosp-arm64/external/valgrind/: runtest.sh
Comment 16 Julian Seward 2017-06-25 08:41:33 UTC
(In reply to Elliott Hughes from comment #15)
> x86 is useless because valgrind doesn't
> support SSE2 for x86 so you can't get as far as calling main,

Valgrind supports up to and including SSSE3 on x86 (32 bit).
Comment 17 Elliott Hughes 2017-06-25 15:09:41 UTC
(/me checks check-in comment from https://android-review.googlesource.com/393156...)

yeah, looks like i misremembered: it was SSE4 that was missing. from the bug that links to, it was specifically these instructions:

        for (i = 0; i < len; ++i)
                any_set |= buf[i];
   d2c70:       66 0f 38 21 51 fc       pmovsxbd -0x4(%ecx),%xmm2
   d2c76:       66 0f 38 21 19          pmovsxbd (%ecx),%xmm3

clang has a habit of choosing more exotic instructions that gcc for arm32 too. (that's what's changed recently: the compiler, not the compiler flags.)