Bug 446123 - Segfault when running a program that calls backtrace() [ppc64le]
Status: REPORTED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (other bugs)
Version First Reported In: 3.18.1
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
Duplicates: 434849
Reported: 2021-11-26 16:50 UTC by Jesus Checa
Modified: 2021-11-29 15:38 UTC (History)


Description Jesus Checa 2021-11-26 16:50:40 UTC
Packages used:
valgrind-3.18.1-1.el9.ppc64le
gcc-11.2.1-6.el9.ppc64le
glibc-2.34-8.el9.ppc64le
kernel-5.14.0-17.el9.ppc64le

Reproducer:
--------------------------------
#include <stdio.h>
#include <execinfo.h>

void call_backtrace(){
        void * callstack[128];
        backtrace(callstack, 128);
}

int main(){
    call_backtrace();
    return 0;
}
--------------------------------

Description of the problem:

I'm seeing a segfault when running a program that calls backtrace() under valgrind, but not when I run it standalone. The segfault is ppc64le-specific; it doesn't reproduce on other platforms. It seems related to the fact that valgrind removes the vDSO from the process, while glibc still checks symbols related to it.
The glibc code related to the segfault is the following (sysdeps/powerpc/powerpc64/backtrace.c):
--------------------------------
67      static inline bool
68      is_sigtramp_address (void *nip)
69      {
70      #ifdef HAVE_SIGTRAMP_RT64
71        if (nip == GLRO (dl_vdso_sigtramp_rt64) ||
72            nip == GLRO (dl_vdso_sigtramp_rt64) + 4)
73          return true;
74      #endif
75        return false;
76      }
77
78      int
79      __backtrace (void **array, int size)
80      {
81        struct layout *current;
82        int count;
83
84        /* Force gcc to spill LR.  */
85        asm volatile ("" : "=l"(current));
86
87        /* Get the address on top-of-stack.  */
88        asm volatile ("ld %0,0(1)" : "=r"(current));
89
90        for (                         count = 0;
91             current != NULL &&       count < size;
92             current = current->next, count++)
93          {
94            array[count] = current->return_address;
95
96            /* Check if the symbol is the signal trampoline and get the interrupted
97             * symbol address from the trampoline saved area.  */
98            if (is_sigtramp_address (current->return_address))
99              {
100               struct signal_frame_64 *sigframe = (struct signal_frame_64*) current;
101               if (count + 1 == size)
102                 break;
103               array[++count] = (void*) sigframe->uc.uc_mcontext.gp_regs[PT_NIP];
104               current = (void*) sigframe->uc.uc_mcontext.gp_regs[PT_R1];
105             }
106         }
----------------------------------------------------------------------

In each stack frame, the function checks whether the return address corresponds to the signal trampoline by calling is_sigtramp_address (lines 98, 71).
When the walk reaches the outermost frame, the return_address is 0x0. Since valgrind is not holding any vDSO information, GLRO (dl_vdso_sigtramp_rt64), the symbol that holds the signal trampoline address, is NULL. The comparison on line 71 therefore succeeds and is_sigtramp_address returns true (when it should return false). The block starting at line 99 is then executed, filling "current" with invalid data that points to invalid memory and causes the segfault. I ran this using gdb/vgdb:

(gdb) target remote | vgdb
Remote debugging using | vgdb
relaying data between gdb and process 33117
warning: remote target does not support file transfer, attempting to access files from local filesystem.
Reading symbols from /lib64/ld64.so.2...
0x00000000040016e0 in _start () from /lib64/ld64.so.2
(gdb) b backtrace.c:94
No source file named backtrace.c.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (backtrace.c:94) pending.
(gdb) c
Continuing.

Breakpoint 1, __GI___backtrace (array=0x1fff00e540, size=128) at ../sysdeps/powerpc/powerpc64/backtrace.c:94
94            array[count] = current->return_address;
(gdb) p current->return_address
$1 = (void *) 0x10000678 <call_backtrace+44>
(gdb) c
Continuing.

Breakpoint 1, __GI___backtrace (array=0x1fff00e540, size=128) at ../sysdeps/powerpc/powerpc64/backtrace.c:94
94            array[count] = current->return_address;
(gdb) p current->return_address
$2 = (void *) 0x100006c0 <main+32>
(gdb) c
Continuing.

Breakpoint 1, __GI___backtrace (array=0x1fff00e540, size=128) at ../sysdeps/powerpc/powerpc64/backtrace.c:94
94            array[count] = current->return_address;
(gdb) p current->return_address
$3 = (void *) 0x4127ca4 <__libc_start_call_main+148>
(gdb) c
Continuing.

Breakpoint 1, __GI___backtrace (array=0x1fff00e540, size=128) at ../sysdeps/powerpc/powerpc64/backtrace.c:94
94            array[count] = current->return_address;
(gdb) p current->return_address
$4 = (void *) 0x4127e80 <__libc_start_main_impl+336>
(gdb) c
Continuing.

Breakpoint 1, __GI___backtrace (array=0x1fff00e540, size=128) at ../sysdeps/powerpc/powerpc64/backtrace.c:94
94            array[count] = current->return_address;
(gdb) p current->return_address
$5 = (void *) 0x0
(gdb) n
98            if (is_sigtramp_address (current->return_address))
(gdb) s
0x0000000004263b08 in is_sigtramp_address (nip=<optimized out>) at ../sysdeps/powerpc/powerpc64/backtrace.c:71
71        if (nip == GLRO (dl_vdso_sigtramp_rt64) ||
(gdb) s
98            if (is_sigtramp_address (current->return_address))
(gdb) n
103               array[++count] = (void*) sigframe->uc.uc_mcontext.gp_regs[PT_NIP];
(gdb) n
92             current = current->next, count++)
(gdb) n
91             current != NULL &&       count < size;
(gdb) n

Breakpoint 1, __GI___backtrace (array=0x1fff00e540, size=128) at ../sysdeps/powerpc/powerpc64/backtrace.c:94
94            array[count] = current->return_address;
(gdb) p current->return_address
Cannot access memory at address 0x525f52454b414552
Comment 1 Mark Wielaard 2021-11-29 15:19:41 UTC
*** Bug 434849 has been marked as a duplicate of this bug. ***
Comment 2 Tulio Magno Quites Machado Filho 2021-11-29 15:38:58 UTC
Note that glibc 2.35 changed the behavior of backtrace, so this issue should no longer occur.
Still, the lack of a vDSO makes a program behave differently under valgrind than outside of it, which makes it very hard to use valgrind to detect issues in those scenarios.
So what I mean is: even with glibc 2.35, there is still value in providing the vDSO.