| Summary: | WARNING: unhandled amd64-linux syscall: 451 (cachestat) | ||
|---|---|---|---|
| Product: | [Developer tools] valgrind | Reporter: | mcermak |
| Component: | general | Assignee: | mcermak |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | mark |
| Priority: | NOR | ||
| Version First Reported In: | 3.25 GIT | ||
| Target Milestone: | --- | ||
| Platform: | Other | ||
| OS: | Linux | ||
| Latest Commit: | Version Fixed/Implemented In: | ||
| Sentry Crash Report: | |||
| Attachments: |
proposed patch
proposed patch |
||
Created attachment 179728 [details]
proposed patch
Attached patch compiles without warnings and works like this:
$ vg-in-place --trace-children=no --trace-syscalls=yes --tool=none ~/cachestat03 2>&1 | fgrep sys_cachestat
SYSCALL[217211,1](451) sys_cachestat ( 4294967295, 0x483dff0, 0x483bfd8, 0 )[sync] --> Failure(0x9)
SYSCALL[217211,1](451) sys_cachestat ( 3, 0x0, 0x483bfd8, 0 )SYSCALL[217208,1](37) sys_alarm ( 30 )[sync] --> Success(0x1e) [sync] --> Failure(0xe)
SYSCALL[217211,1](451) sys_cachestat ( 3, 0x483dff0, 0x0, 0 )SYSCALL[217208,1](37) [sync] --> Failure(0xe)sys_alarm ( 30 )
SYSCALL[217211,1](451) sys_cachestat ( 3, 0x483dff0, 0x483bfd8, 4294967295 )SYSCALL[217208,1](37) sys_alarm ( 30 )[sync] --> Failure(0x16) [sync] --> Success(0x1e)
SYSCALL[217211,1](451) sys_cachestat ( 4, 0x483dff0, 0x483bfd8, 0 )SYSCALL[217208,1](37) [sync] --> Failure(0x5f)
$
For the record, here's the syscall body:
SYSCALL_DEFINE4(cachestat, unsigned int, fd,
struct cachestat_range __user *, cstat_range,
struct cachestat __user *, cstat, unsigned int, flags)
{
CLASS(fd, f)(fd);
struct address_space *mapping;
struct cachestat_range csr;
struct cachestat cs;
pgoff_t first_index, last_index;
if (fd_empty(f))
return -EBADF;
if (copy_from_user(&csr, cstat_range,
sizeof(struct cachestat_range)))
return -EFAULT;
/* hugetlbfs is not supported */
if (is_file_hugepages(fd_file(f)))
return -EOPNOTSUPP;
if (!can_do_cachestat(fd_file(f)))
return -EPERM;
if (flags != 0)
return -EINVAL;
first_index = csr.off >> PAGE_SHIFT;
last_index =
csr.len == 0 ? ULONG_MAX : (csr.off + csr.len - 1) >> PAGE_SHIFT;
memset(&cs, 0, sizeof(struct cachestat));
mapping = fd_file(f)->f_mapping;
filemap_cachestat(mapping, first_index, last_index, &cs);
if (copy_to_user(cstat, &cs, sizeof(struct cachestat)))
return -EFAULT;
return 0;
}
This looks mostly good. But there is a missing fd check in the PRE handler
and in the POST handler shouldn't (recheck) the fd, but mark the cstat as having been written.
cachestat takes an fd, cstat_range and flags, which are all checked in the PRE handler as being fully defined.
And it will write out the cstat, which is being checked and prepared for writing by the kernel.
What is missing is checking the fd is valid in the PRE handler. You want to add something like:
if (!ML_(fd_allowed)(ARG1, "cachestat", tid, False))
SET_STATUS_Failure( VKI_EBADF );
In the POST handler the fd is checked again (*) but given this is an input, not an output, this isn't necessary .
What is necessary is to mark the cstat struct as being defined now (by the kernel). You should do that with
something like: POST_MEM_WRITE(ARG3, sizeof(struct vki_cachestat));
(*) The check for fd being allowed in the POST handler is doubly wrong. ARG1 isn't a pointer to the fd, but the fd itself.
If you do want to dereference a pointer then you have to be extra careful. You want to make sure you can dereference
the pointer using something like: if (ML_(safe_to_deref)((void*)ARG1, sizeof(Int))) { ... }
Created attachment 181321 [details]
proposed patch
Hi Mark, thank you for your review, makes sense! I'm attaching an updated patch here. Here is how it tests using a LTP-generated reproducer file (using sudo so that the syscall can actually work):
41$ sudo ./vg-in-place --trace-children=no --trace-syscalls=yes --tool=none ./cachestat03 |& fgrep sys_cachestat
SYSCALL[2498243,1](451) sys_cachestat ( 4294967295, 0x4841ff0, 0x483ffd8, 0 ) --> [pre-fail] Failure(0x9)
SYSCALL[2498243,1](451) sys_cachestat ( 3, 0x0, 0x483ffd8, 0 )[sync] --> Failure(0xe)
SYSCALL[2498243,1](451) sys_cachestat ( 3, 0x4841ff0, 0x0, 0 )[sync] --> Failure(0xe)
SYSCALL[2498243,1](451) sys_cachestat ( 3, 0x4841ff0, 0x483ffd8, 4294967295 )[sync] --> Failure(0x16)
SYSCALL[2498243,1](451) sys_cachestat ( 4, 0x4841ff0, 0x483ffd8, 0 )[sync] --> Failure(0x5f)
41$
Looks perfect. Also quickly tested against the ltp cachestat tests. All PASS now. Pushed as: commit 41f0f95d9415faa5f76fcab92f45ca05957c4032 Author: Martin Cermak <mcermak@redhat.com> Date: Thu May 15 12:03:55 2025 +0200 Wrap linux specific cachestat syscall cachestat takes an fd, cstat_range and flags arguments and writes out page cache statistics via the cstat struct. Declare a sys_cachestat wrapper in priv_syswrap-linux.h and hook it for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,s390x,x86}-linux using LINXY with PRE/POST handlers in syswrap-linux.c. Define __NR_cachestat for amd64, arm, arm64, mips32, mips64, nanomips, ppc32, ppc64, riscv64, s390x, and x86. https://bugs.kde.org/show_bug.cgi?id=501741 |
On rhel10 x86_64, aarch64, ppc64le and s390x I see valgrind complaining about "WARNING: unhandled amd64-linux syscall: 451". This appears to be a new cachestat syscall. No record of it in fedora man-pages yet, but mm/filemap.c reads: /* * The cachestat(2) system call. * * cachestat() returns the page cache statistics of a file in the * bytes range specified by `off` and `len`: number of cached pages, * number of dirty pages, number of pages marked for writeback, * number of evicted pages, and number of recently evicted pages. * * An evicted page is a page that is previously in the page cache * but has been evicted since. A page is recently evicted if its last * eviction was recent enough that its reentry to the cache would * indicate that it is actively being used by the system, and that * there is memory pressure on the system. * * `off` and `len` must be non-negative integers. If `len` > 0, * the queried range is [`off`, `off` + `len`]. If `len` == 0, * we will query in the range from `off` to the end of the file. * * The `flags` argument is unused for now, but is included for future * extensibility. User should pass 0 (i.e no flag specified). * * Currently, hugetlbfs is not supported. * * Because the status of a page can change after cachestat() checks it * but before it returns to the application, the returned values may * contain stale information. * * return values: * zero - success * -EFAULT - cstat or cstat_range points to an illegal address * -EINVAL - invalid flags * -EBADF - invalid file descriptor * -EOPNOTSUPP - file descriptor is of a hugetlbfs file */ SYSCALL_DEFINE4(cachestat, unsigned int, fd, struct cachestat_range __user *, cstat_range, struct cachestat __user *, cstat, unsigned int, flags) { ... On all the mentioned arches this syscall seems to have the same number (grep kernel-devel's scripts/syscall.tbl) 451 common cachestat sys_cachestat Reproduced with valgrind-3.25.0.GIT.