SUMMARY On armv7hl (32-bit ARM) memcheck did not notice that "add r15, r6, r10" is a computed goto: r15 is the pc, and writing to the pc causes a jump. The result was memcheck began to emulate itself, leading to bizarre results. STEPS TO REPRODUCE 1. valgrind --smc-check=all ./foo 2. valgrind --trace-flags=10000001 --trace-notbelow=108 ./foo 3. OBSERVED RESULT ==== SB 107 (evchecks 18103) [tid 1] 0x21ec8 UNKNOWN_FUNCTION UNKNOWN_OBJECT+0x0 ==== SB 108 (evchecks 18104) [tid 1] 0x21f0c UNKNOWN_FUNCTION UNKNOWN_OBJECT+0x0 ==== SB 109 (evchecks 18105) [tid 1] 0x4811f74 UNKNOWN_FUNCTION UNKNOWN_OBJECT+0x0 Notice the address jump from 0x21f0c, which is in the program being analyzed, to 0x4811f74. which is in memcheck-arm-linux EXPECTED RESULT "add r15, r6, r10" is recognized as a computed goto. Emulated control should remain inside the user program, or else be diagnosed as a "wild jump" error. ADDITIONAL INFORMATION The program is a programmer-built version of /bin/date which then was compressed by upx. The program is self contained (uses no shared libraries; the static library uClibc.a was used) and the upx-compressed version also is self-contained. It executes correctly (produces the current date on stdout) when run by the shell. Under memcheck, all the executed instructions (so far) are static and unmodified. The computed jump is the transfer to dynamically-written instructions. The --smc-check=all should have caught it. Even without --smc-check=all, memcheck should defend against user code jumping into memcheck. The error occurs quite soon, in block 108. The upx-compressed output (75KB) and console transcripts will be attached if the bugreporting system allows.
Created attachment 115239 [details] target program, compressed by upx
Created attachment 115240 [details] console output of plain memcheck Note parameter --smc-check=all as an attempt to check carefully for non-static code.
Created attachment 115241 [details] console output with --trace-flags valgrind --trace-flags=10000001 --trace-notbelow=108 ./foo 2>&1 | more
Executions were on Fedora 29 beta using valgrind-3.13.0-28.fc29.armv7hl.rpm. The same result was obtained using Fedora 28 (released, prior version of OS) with the same valgrind rpm.
(In reply to John Reiser from comment #0) It's clear that the thing has gone off of the rails somehow, but I'm not sure I agree with your conclusion here: > EXPECTED RESULT > "add r15, r6, r10" is recognized as a computed goto. Emulated control > should remain inside the user program, or else be diagnosed as a "wild jump" > error. If you look at this (from your log) we have (arm) 0x21F20: add r15, r6, r10 ------ IMark(0x21F20, 4, 0) ------ t17 = GET:I32(32) // t17 = value of r6, I assume t19 = GET:I32(48) // t19 = value of r10 t18 = t19 // r18 = value of r10 t20 = Add32(t17,t18) // t20 = r6 + r10 PUT(68) = t20 // VexGuestARMState offset 68 is offset of PC PUT(68) = GET:I32(68) // pointless PUT(68) = GET:I32(68); exit-Boring // "continue at address in (68)" so that looks correct to me. Did I miss something? I must say this stuff was pretty hairy to implement in the ARM front end, so I can believe I got something wrong. OTOH the arm32 front end has been around easily half a decade now without anyone reporting anything like this before.
It would be useful if you could re-run with no instrumentation (--tool=none) and re-post the same logs as before. That has two purposes: first, many fewer generated insns to wade through, and secondly, if the problem disappears then it tells us that the front end is OK and that the problem probably lies in the back end (of the compilation pipeline).
The root cause is a symlink vulnerability! coregrind fails to do the right thing when the target executes int fd_i_am = open("/proc/self/exe", O_RDONLY); upx uses mmap(,,,,fd_i_am,) to replicate portions of the address space of the target. Because coregrind does not virtualize correctly, then upx unfortunately replicates portions of the valgrind tool (memcheck-arm-linux) instead. The simplistic workaround is for coregrind to replace any syscall argument of "/proc/self/exe" with path_to_target. (The complete list of affected syscalls is in the source for 'strace' in its handling of the command-line sequence "-e trace=file".) This workaround fails to recognize "/proc/PID/exe" where PID is the decimal representation of the numeric process ID. The workaround also can fail if the target executes chdir() or chroot(). More generally (and for "defense in depth"), coregrind should protect against any occurrence of "/proc/self/exe" (or equivalent) during pathname resolution by the operating system. At initialization then coregrind should [error checking omitted!] int fd_tool = open("/proc/self/exe", O_RDONLY); /* yes, the tool itself */ struct stat stat_tool; fstat(fd_tool, &stat_tool); int fd_target = open(path_to_target, O_RDONLY); struct stat stat_target; fstat(fd_target, &stat_target); For any successful fd = open() by the target, then coregrind should struct stat stat_fd; fstat(fd, &stat_fd); and compare stat_fd{.st_dev, .st_ino} against stat_tool{.st_dev, .st_ino} to detect the need for virtualization. If so, then close(fd) and set fd = dup(fd_target); System calls other than open() probably are less important, but in theory deserve similar care. For instance, chmod would use open+fstat+check{.st_dev, .st_ino}+fchmod+close. Another approach is to rely on resolution of path_to_target into an absolute path. On linux: int fd = open(path_to_target, O_RDONLY); char abs_path[PATH_MAX]; char tmp[32]; snprintf(tmp, sizeof(tmp), "/proc/self/fd/%d", fd); int n = readlink(tmp, abs_path, -1+ sizeof(abs_path)); abs_path[n] = '\0'; /* ERROR CHECKING OMITTTED */ I consider the {.st_dev, .st_ino} technique to be better.
We do handle /proc/self/exe and /proc/PID/exe in a few places but in general I think it was felt unfeasible to try and isolate every places where they might appear and emulate them given that it's a relatively unusual thing to happen. Basically it's just readlink and readlinkat that we handle by the looks of if because that was what we most commonly saw people doing. I think there is another bug somewhere where somebody was doing open or something and it was going wrong.
Oddly it seems NEWS.old claims that support was added for both open and readlink in 2.4.0 but I'm not sure that's true, or if it is then I think open has been lost...
The readlink support was introduced in https://sourceware.org/git/gitweb.cgi?p=valgrind.git;a=commitdiff;h=423bfef15b2e275314263fea8af11a29d5509110 (later extended to readlinkat) and the NEWS entry in https://sourceware.org/git/gitweb.cgi?p=valgrind.git;a=commitdiff;h=55b62aeffbf1e864f00bbfb285661cdf84db23d8 but I can't see any evidence that open was ever supported despite what it says there.