Bug 502968 - Unsupported system call 457 (listmount) and 458 (statmount)
Summary: Unsupported system call 457 (listmount) and 458 (statmount)
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (other bugs)
Version First Reported In: 3.24.0
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: mcermak
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-04-18 14:25 UTC by Zygmunt Krynicki
Modified: 2025-06-27 21:21 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
proposed patch (13.96 KB, patch)
2025-06-19 14:45 UTC, mcermak
Details
proposed patch (14.72 KB, patch)
2025-06-25 13:10 UTC, mcermak
Details
proposed patch (14.53 KB, patch)
2025-06-27 20:43 UTC, mcermak
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Zygmunt Krynicki 2025-04-18 14:25:05 UTC
SUMMARY

Run valgrind against a program that uses the listmount and statmount system calls.

STEPS TO REPRODUCE
1.  Compile one of the kernel or manual page exammples that call SYS_listmount and SYS_statmount
2.  See it complain about unknown system call

OBSERVED RESULT


EXPECTED RESULT


SOFTWARE/OS VERSIONS
Linux 6.14.2-300.fc42.x86_64 (Fedora 42) 
Valgrind 3.24.0

ADDITIONAL INFORMATION
https://man7.org/linux/man-pages//man2/listmount.2.html
https://man7.org/linux/man-pages//man2/statmount.2.html
Comment 1 Mark Wielaard 2025-05-09 12:54:02 UTC
listmount and statmount aren't wrapped for any architecture in 3.25.0 (or current git)
Comment 2 mcermak 2025-06-19 14:45:09 UTC
Created attachment 182386 [details]
proposed patch
Comment 3 Zygmunt Krynicki 2025-06-19 15:59:55 UTC
Thank you for writing this. I made a simple program using both syscalls at https://gitlab.com/zygoon/mount-insight it may be used to sanity check the patch as the system calls are still relatively rare.
Comment 4 mcermak 2025-06-20 15:02:22 UTC
Thanks Zygmunt.  Seems like it's working fine with your code on Fedora-42:

---------------------------------------8<---------------------------------------------------------------
$ ./vg-in-place ../mount-insight/mount-insight  |& tail
}
==245624== 
==245624== HEAP SUMMARY:
==245624==     in use at exit: 0 bytes in 0 blocks
==245624==   total heap usage: 3 allocs, 3 frees, 10,176 bytes allocated
==245624== 
==245624== All heap blocks were freed -- no leaks are possible
==245624== 
==245624== For lists of detected and suppressed errors, rerun with: -s
==245624== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
$ echo $?
0
$ 
$ ./vg-in-place --tool=none --trace-syscalls=yes ../mount-insight/mount-insight |& grep -e ^SYSCALL.*sys_statmount -e ^SYSCALL.*sys_listmount | head
SYSCALL[245720,1](458) sys_listmount ( 0x1ffefffc20, 0x1ffeffdce0, 1000,  0x0)[sync] --> Success(0x14a) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
SYSCALL[245720,1](457) sys_statmount ( 0x1ffefffc20, 0x1ffeffcce0, 4096,  0x0)[sync] --> Success(0x0) 
$ 
$ 
$ strace -qq -elistmount,statmount --raw=all --signal='!all' -f -o /tmp/xxx ../mount-insight/mount-insight >&/dev/null; echo "-------------"; head /tmp/xxx
-------------
245973 listmount(0x7fff60ba1fc0, 0x7fff60ba0080, 0x3e8, 0) = 0x14a
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
245973 statmount(0x7fff60ba1fc0, 0x7fff60b9f080, 0x1000, 0) = 0
$
---------------------------------------8<---------------------------------------------------------------
Comment 5 Mark Wielaard 2025-06-24 12:53:15 UTC
> From 3b6d22623c3fd9f22b47957b2c3255a4517e3d69 Mon Sep 17 00:00:00 2001
> From: Martin Cermak <mcermak@redhat.com>
> Date: Thu, 19 Jun 2025 16:39:24 +0200
> Subject: [PATCH] Wrap linux specific syscalls 457 (listmount) and 458 (statmount)
>
> The listmount syscall returns a list of mount IDs under the req.mnt_id.
> This is meant to be used in conjuction with statmount(2) in order to
> provide a way to iterate and discover mounted file systems.

s/conjuction/conjunction/

> The statmount syscall returns information about a mount, storing it in
> the buffer pointed to by smbuf.  The returned buffer is a struct
> statmount which is of size bufsize with the fields filled in as
> described below.

Described below?

> Declare a sys_{lis,sta}tmount wrapper in priv_syswrap-linux.h and hook it
> for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,s390x,x86}-linux
> using LINXY with PRE and POST handler in syswrap-linux.c

Which is simple because the syscall numbers are shared between arches.

> Both syscalls need CAP_SYS_ADMIN, to successfully test.

Aha, I was wondering about that.

Please also add https://bugs.kde.org/show_bug.cgi?id=502968 to the commit message

> diff --git a/NEWS b/NEWS
> index 97e4b3b41..837f1e0a6 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -42,6 +42,7 @@ are not entered into bugzilla tend to get forgotten about or ignored.
>  504936  Add FreeBSD amd64 sysarch subcommands AMD64_SET_TLSBASE and
>          AMD64_GET_TLSBASE
>  505228  Wrap linux specific mseal syscall
>+502968  Wrap linux specific syscalls 457 (listmount) and 458 (statmount)

Ack, thanks.

> diff --git a/coregrind/m_syswrap/priv_syswrap-linux.h b/coregrind/m_syswrap/priv_syswrap-linux.h
> index ed8cb4ed5..9e6cb8981 100644
> --- a/coregrind/m_syswrap/priv_syswrap-linux.h
> +++ b/coregrind/m_syswrap/priv_syswrap-linux.h
> @@ -355,6 +355,10 @@ DECL_TEMPLATE(linux, sys_pidfd_getfd);
>  // Since Linux 6.6
>  DECL_TEMPLATE(linux, sys_fchmodat2);
>  
> +// Since Linux 6.8
> +DECL_TEMPLATE(linux, sys_listmount);
> +DECL_TEMPLATE(linux, sys_statmount);

OK.
 
> +   LINXY(__NR_statmount,         sys_statmount),         // 457
> +   LINXY(__NR_listmount,         sys_listmount),         // 458

Looks ok for all arches.
 
> diff --git a/coregrind/m_syswrap/syswrap-linux.c b/coregrind/m_syswrap/syswrap-linux.c
> index 0db871778..18be21346 100644
> --- a/coregrind/m_syswrap/syswrap-linux.c
> +++ b/coregrind/m_syswrap/syswrap-linux.c
> @@ -4305,6 +4305,50 @@ PRE(sys_mseal)
>         SET_STATUS_Failure(VKI_ENOMEM);
>  }
> 
> +PRE(sys_statmount)
> +{
> +   // int syscall(SYS_statmount, struct mnt_id_req *req,
> +   //             struct statmount *smbuf, size_t bufsize,
> +   //             unsigned long flags);

I think we may want a:
*flags |= SfMayBlock;
or maybe just FUSE_COMPATIBLE_MAY_BLOCK();
since it might require switching threads/block when the file system is in userspace/this process.

> +   PRINT("sys_statmount ( %#" FMT_REGWORD "x, %#" FMT_REGWORD "x, %lu,  %#" FMT_REGWORD "x)", ARG1, ARG2, ARG3, ARG4);
> +   PRE_REG_READ4(long, "statmount", struct vki_mnt_id_req *, req, struct vki_statmount *, smbuf, vki_size_t, bufsize, unsigned long, flags);
> +   if (ARG1 != 0) {
> +      PRE_MEM_READ("statmount(req)", ARG1, sizeof(struct vki_mnt_id_req));
> +   }

Is req allowed to me NULL? Does statmount function correctly if it is?

> +   if ((ARG2 != 0) && (ARG3 > 0)) {
> +      PRE_MEM_WRITE("statmount(smbuf)", ARG2, ARG3);
> +   }

Same question for smbuf.

If not, then producing a warning here is the right thing to do (instead of ignoring it).

> +}
> +
+POST(sys_statmount)
+{
+   if ((ARG2 != 0) && (ARG3 > 0)) {
+      POST_MEM_WRITE(ARG2, ARG3);
+   }
+}

Same question as above. Also I think this should be something like:

struct vki_statmount *smbuf = (struct vki_statmount *)(Addr)ARG2;
POST_MEM_WRITE( (Addr)smbuf, smbuf->size );

Since the kernel only seems to guarantee to fill in smbuf up to that size.

Sorry, have to return to the rest of the review later. But hopefully this already helps.
Comment 6 Mark Wielaard 2025-06-24 21:36:54 UTC
Review part 2:

> +PRE(sys_listmount)
> +{
> +   // int syscall(SYS_listmount, struct mnt_id_req *req,
> +   //             uint64_t *mnt_ids, size_t nr_mnt_ids,
> +   //             unsigned long flags);
> +   PRINT("sys_listmount ( %#" FMT_REGWORD "x, %#" FMT_REGWORD "x, %lu,  %#" FMT_REGWORD "x)", ARG1, ARG2, ARG3, ARG4);
> +   PRE_REG_READ4(long, "listmount", struct vki_mnt_id_req *, req, vki_uint64_t *, mnt_ids, vki_size_t, nr_mnt_ids, unsigned long, flags);

OK.
Like with statmount we might want to use *flags |= SfMayBlock;
or maybe FUSE_COMPATIBLE_MAY_BLOCK();

> +   if (ARG1 != 0) {
> +      PRE_MEM_READ("listmount(req)", ARG1, sizeof(struct vki_mnt_id_req));
> +   }
> +   if ((ARG2 != 0) && (ARG3 > 0)) {
> +      PRE_MEM_WRITE("listmount(mnt_ids)", ARG2, ARG3 * sizeof(vki_uint64_t));
> +   }

As with statmount, are ARG1 and ARG2 allowed to be NULL?
If not then just always do the check.

> +}
> +
> +POST(sys_listmount)
> +{
> +   if ((ARG2 != 0) && (ARG3 > 0)) {
> +      POST_MEM_WRITE(ARG2, ARG3 * sizeof(vki_uint64_t));
> +   }
> +}

Likewise.
And I think it should be POST_MEM_WRITE(ARG2, RES * sizeof(vki_uint64_t));
ARG3 is the maximum, RES is the actual number of mnt_ids put in.

> +struct vki_mnt_id_req {
> +   __vki_u32 size;
> +   __vki_u32 spare;
> +   __vki_u64 mnt_id;
> +   __vki_u64 param;
> +   __vki_u64 mnt_ns_id;
> +};

O. I see what that size field is for. There are multiple versions of
this struct. The second variant is bigger than the first (has the
mnt_ns_id field added). See
https://lore.kernel.org/all/49930bdce29a8367a213eb14c1e68e7e49284f86.1719243756.git.josef@toxicpanda.com/

So where we do
PRE_MEM_READ("statmount(req)", ARG1, sizeof(struct vki_mnt_id_req));
or
PRE_MEM_READ("listmount(req)", ARG1, sizeof(struct vki_mnt_id_req));

We really should check just the struct size from user space:

struct vki_mnt_id_req *req = (struct vki_mnt_id_req *)(Addr)ARG1;
PRE_MEM_READ("statmount(req)", ARG1, req->size);

> +struct vki_statmount {
> +     __vki_u32 size;         /* Total size, including strings */
> +     __vki_u32 mnt_opts;             /* [str] Mount options of the mount */
> +     __vki_u64 mask;         /* What results were written */
> +     __vki_u32 sb_dev_major; /* Device ID */
> +     __vki_u32 sb_dev_minor;
> +     __vki_u64 sb_magic;             /* ..._SUPER_MAGIC */
> +     __vki_u32 sb_flags;             /* SB_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */
> +     __vki_u32 fs_type;              /* [str] Filesystem type */
> +     __vki_u64 mnt_id;               /* Unique ID of mount */
> +     __vki_u64 mnt_parent_id;        /* Unique ID of parent (for root == mnt_id) */
> +     __vki_u32 mnt_id_old;   /* Reused IDs used in proc/.../mountinfo */
> +     __vki_u32 mnt_parent_id_old;
> +     __vki_u64 mnt_attr;             /* MOUNT_ATTR_... */
> +     __vki_u64 mnt_propagation;      /* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */
> +     __vki_u64 mnt_peer_group;       /* ID of shared peer group */
> +     __vki_u64 mnt_master;   /* Mount receives propagation from this ID */
> +     __vki_u64 propagate_from;       /* Propagation from in current namespace */
> +     __vki_u32 mnt_root;             /* [str] Root of mount relative to root of fs */
> +     __vki_u32 mnt_point;    /* [str] Mountpoint relative to current root */
> +     __vki_u64 mnt_ns_id;    /* ID of the mount namespace */
> +     __vki_u64 __spare2[49];
> +     char str[];             /* Variable size part containing strings */
> +};

OK.

> diff --git a/include/vki/vki-scnums-shared-linux.h b/include/vki/vki-scnums-shared-linux.h
> index 32ef8ac13..85bf9d171 100644
> --- a/include/vki/vki-scnums-shared-linux.h
> +++ b/include/vki/vki-scnums-shared-linux.h
> @@ -56,6 +56,8 @@
 
>  #define __NR_cachestat               451
>  #define __NR_fchmodat2               452
> +#define __NR_statmount          457
> +#define __NR_listmount          458
>  #define __NR_mseal           462

Looks good.
Comment 7 mcermak 2025-06-25 13:10:06 UTC
Created attachment 182650 [details]
proposed patch

(In reply to Mark Wielaard from comment #5)
> > From 3b6d22623c3fd9f22b47957b2c3255a4517e3d69 Mon Sep 17 00:00:00 2001
> > From: Martin Cermak <mcermak@redhat.com>
> > Date: Thu, 19 Jun 2025 16:39:24 +0200
> > Subject: [PATCH] Wrap linux specific syscalls 457 (listmount) and 458 (statmount)
> >
> > The listmount syscall returns a list of mount IDs under the req.mnt_id.
> > This is meant to be used in conjuction with statmount(2) in order to
> > provide a way to iterate and discover mounted file systems.
> 
> s/conjuction/conjunction/

fixed

> > The statmount syscall returns information about a mount, storing it in
> > the buffer pointed to by smbuf.  The returned buffer is a struct
> > statmount which is of size bufsize with the fields filled in as
> > described below.
> 
> Described below?

fixed

> > Declare a sys_{lis,sta}tmount wrapper in priv_syswrap-linux.h and hook it
> > for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,s390x,x86}-linux
> > using LINXY with PRE and POST handler in syswrap-linux.c
> 
> Which is simple because the syscall numbers are shared between arches.
> 
> > Both syscalls need CAP_SYS_ADMIN, to successfully test.
> 
> Aha, I was wondering about that.
> 
> Please also add https://bugs.kde.org/show_bug.cgi?id=502968 to the commit
> message
> 
> > diff --git a/NEWS b/NEWS
> > index 97e4b3b41..837f1e0a6 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -42,6 +42,7 @@ are not entered into bugzilla tend to get forgotten about or ignored.
> >  504936  Add FreeBSD amd64 sysarch subcommands AMD64_SET_TLSBASE and
> >          AMD64_GET_TLSBASE
> >  505228  Wrap linux specific mseal syscall
> >+502968  Wrap linux specific syscalls 457 (listmount) and 458 (statmount)
> 
> Ack, thanks.
> 
> > diff --git a/coregrind/m_syswrap/priv_syswrap-linux.h b/coregrind/m_syswrap/priv_syswrap-linux.h
> > index ed8cb4ed5..9e6cb8981 100644
> > --- a/coregrind/m_syswrap/priv_syswrap-linux.h
> > +++ b/coregrind/m_syswrap/priv_syswrap-linux.h
> > @@ -355,6 +355,10 @@ DECL_TEMPLATE(linux, sys_pidfd_getfd);
> >  // Since Linux 6.6
> >  DECL_TEMPLATE(linux, sys_fchmodat2);
> >  
> > +// Since Linux 6.8
> > +DECL_TEMPLATE(linux, sys_listmount);
> > +DECL_TEMPLATE(linux, sys_statmount);
> 
> OK.
>  
> > +   LINXY(__NR_statmount,         sys_statmount),         // 457
> > +   LINXY(__NR_listmount,         sys_listmount),         // 458
> 
> Looks ok for all arches.
>  
> > diff --git a/coregrind/m_syswrap/syswrap-linux.c b/coregrind/m_syswrap/syswrap-linux.c
> > index 0db871778..18be21346 100644
> > --- a/coregrind/m_syswrap/syswrap-linux.c
> > +++ b/coregrind/m_syswrap/syswrap-linux.c
> > @@ -4305,6 +4305,50 @@ PRE(sys_mseal)
> >         SET_STATUS_Failure(VKI_ENOMEM);
> >  }
> > 
> > +PRE(sys_statmount)
> > +{
> > +   // int syscall(SYS_statmount, struct mnt_id_req *req,
> > +   //             struct statmount *smbuf, size_t bufsize,
> > +   //             unsigned long flags);
> 
> I think we may want a:
> *flags |= SfMayBlock;
> or maybe just FUSE_COMPATIBLE_MAY_BLOCK();
> since it might require switching threads/block when the file system is in
> userspace/this process.

OK, understood why this is needed, added!

> > +   PRINT("sys_statmount ( %#" FMT_REGWORD "x, %#" FMT_REGWORD "x, %lu,  %#" FMT_REGWORD "x)", ARG1, ARG2, ARG3, ARG4);
> > +   PRE_REG_READ4(long, "statmount", struct vki_mnt_id_req *, req, struct vki_statmount *, smbuf, vki_size_t, bufsize, unsigned long, flags);
> > +   if (ARG1 != 0) {
> > +      PRE_MEM_READ("statmount(req)", ARG1, sizeof(struct vki_mnt_id_req));
> > +   }
> 
> Is req allowed to me NULL? Does statmount function correctly if it is?
> 
> > +   if ((ARG2 != 0) && (ARG3 > 0)) {
> > +      PRE_MEM_WRITE("statmount(smbuf)", ARG2, ARG3);
> > +   }
> 
> Same question for smbuf.
> 
> If not, then producing a warning here is the right thing to do (instead of
> ignoring it).

Understood, added warnings.

> > +}
> > +
> +POST(sys_statmount)
> +{
> +   if ((ARG2 != 0) && (ARG3 > 0)) {
> +      POST_MEM_WRITE(ARG2, ARG3);
> +   }
> +}
> 
> Same question as above. Also I think this should be something like:
> 
> struct vki_statmount *smbuf = (struct vki_statmount *)(Addr)ARG2;
> POST_MEM_WRITE( (Addr)smbuf, smbuf->size );

Understood, fixed.

> Since the kernel only seems to guarantee to fill in smbuf up to that size.
> 
> Sorry, have to return to the rest of the review later. But hopefully this
> already helps.

(In reply to Mark Wielaard from comment #6)
> Review part 2:
> 
> > +PRE(sys_listmount)
> > +{
> > +   // int syscall(SYS_listmount, struct mnt_id_req *req,
> > +   //             uint64_t *mnt_ids, size_t nr_mnt_ids,
> > +   //             unsigned long flags);
> > +   PRINT("sys_listmount ( %#" FMT_REGWORD "x, %#" FMT_REGWORD "x, %lu,  %#" FMT_REGWORD "x)", ARG1, ARG2, ARG3, ARG4);
> > +   PRE_REG_READ4(long, "listmount", struct vki_mnt_id_req *, req, vki_uint64_t *, mnt_ids, vki_size_t, nr_mnt_ids, unsigned long, flags);
> 
> OK.
> Like with statmount we might want to use *flags |= SfMayBlock;
> or maybe FUSE_COMPATIBLE_MAY_BLOCK();
> 
> > +   if (ARG1 != 0) {
> > +      PRE_MEM_READ("listmount(req)", ARG1, sizeof(struct vki_mnt_id_req));
> > +   }
> > +   if ((ARG2 != 0) && (ARG3 > 0)) {
> > +      PRE_MEM_WRITE("listmount(mnt_ids)", ARG2, ARG3 * sizeof(vki_uint64_t));
> > +   }
> 
> As with statmount, are ARG1 and ARG2 allowed to be NULL?
> If not then just always do the check.

Understood, fixed.
 
> > +}
> > +
> > +POST(sys_listmount)
> > +{
> > +   if ((ARG2 != 0) && (ARG3 > 0)) {
> > +      POST_MEM_WRITE(ARG2, ARG3 * sizeof(vki_uint64_t));
> > +   }
> > +}
> 
> Likewise.

Likewise in a sense that NULL value of ARG2 needs to be checked?  Not sure I'm getting this right.  Isn't this the case that NULL value SEGVs whole thing, and then POST doesn't apply (because POST is /by default/ only exec'd for successful syscalls) ?

> And I think it should be POST_MEM_WRITE(ARG2, RES * sizeof(vki_uint64_t));
> ARG3 is the maximum, RES is the actual number of mnt_ids put in.
> 
> > +struct vki_mnt_id_req {
> > +   __vki_u32 size;
> > +   __vki_u32 spare;
> > +   __vki_u64 mnt_id;
> > +   __vki_u64 param;
> > +   __vki_u64 mnt_ns_id;
> > +};
> 
> O. I see what that size field is for. There are multiple versions of
> this struct. The second variant is bigger than the first (has the
> mnt_ns_id field added). See
> https://lore.kernel.org/all/49930bdce29a8367a213eb14c1e68e7e49284f86.
> 1719243756.git.josef@toxicpanda.com/
> 
> So where we do
> PRE_MEM_READ("statmount(req)", ARG1, sizeof(struct vki_mnt_id_req));
> or
> PRE_MEM_READ("listmount(req)", ARG1, sizeof(struct vki_mnt_id_req));
> 
> We really should check just the struct size from user space:
> 
> struct vki_mnt_id_req *req = (struct vki_mnt_id_req *)(Addr)ARG1;
> PRE_MEM_READ("statmount(req)", ARG1, req->size);

aha, fixed!

> > +struct vki_statmount {
> > +     __vki_u32 size;         /* Total size, including strings */
> > +     __vki_u32 mnt_opts;             /* [str] Mount options of the mount */
> > +     __vki_u64 mask;         /* What results were written */
> > +     __vki_u32 sb_dev_major; /* Device ID */
> > +     __vki_u32 sb_dev_minor;
> > +     __vki_u64 sb_magic;             /* ..._SUPER_MAGIC */
> > +     __vki_u32 sb_flags;             /* SB_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */
> > +     __vki_u32 fs_type;              /* [str] Filesystem type */
> > +     __vki_u64 mnt_id;               /* Unique ID of mount */
> > +     __vki_u64 mnt_parent_id;        /* Unique ID of parent (for root == mnt_id) */
> > +     __vki_u32 mnt_id_old;   /* Reused IDs used in proc/.../mountinfo */
> > +     __vki_u32 mnt_parent_id_old;
> > +     __vki_u64 mnt_attr;             /* MOUNT_ATTR_... */
> > +     __vki_u64 mnt_propagation;      /* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */
> > +     __vki_u64 mnt_peer_group;       /* ID of shared peer group */
> > +     __vki_u64 mnt_master;   /* Mount receives propagation from this ID */
> > +     __vki_u64 propagate_from;       /* Propagation from in current namespace */
> > +     __vki_u32 mnt_root;             /* [str] Root of mount relative to root of fs */
> > +     __vki_u32 mnt_point;    /* [str] Mountpoint relative to current root */
> > +     __vki_u64 mnt_ns_id;    /* ID of the mount namespace */
> > +     __vki_u64 __spare2[49];
> > +     char str[];             /* Variable size part containing strings */
> > +};
> 
> OK.
> 
> > diff --git a/include/vki/vki-scnums-shared-linux.h b/include/vki/vki-scnums-shared-linux.h
> > index 32ef8ac13..85bf9d171 100644
> > --- a/include/vki/vki-scnums-shared-linux.h
> > +++ b/include/vki/vki-scnums-shared-linux.h
> > @@ -56,6 +56,8 @@
>  
> >  #define __NR_cachestat               451
> >  #define __NR_fchmodat2               452
> > +#define __NR_statmount          457
> > +#define __NR_listmount          458
> >  #define __NR_mseal           462
> 
> Looks good.

Thanks a ton, Mark, for your review.  Please check my updated patch.
Comment 8 Mark Wielaard 2025-06-27 16:16:49 UTC
Looks good. But ...

+PRE(sys_statmount)
+{
+   // int syscall(SYS_statmount, struct mnt_id_req *req,
+   //             struct statmount *smbuf, size_t bufsize,
+   //             unsigned long flags);
+   FUSE_COMPATIBLE_MAY_BLOCK();
+   PRINT("sys_statmount ( %#" FMT_REGWORD "x, %#" FMT_REGWORD "x, %lu,  %#" FMT_REGWORD "x)", ARG1, ARG2, ARG3, ARG4);
+   PRE_REG_READ4(long, "statmount", struct vki_mnt_id_req *, req, struct vki_statmount *, smbuf, vki_size_t, bufsize, unsigned long, flags);
+   if (ARG1 == 0) {
+      if (VG_(clo_verbosity) >= 1) {
+         VG_(message)(Vg_UserMsg, "Warning: client syscall sys_statmount: ARG1 is NULL");
+      }
+   } else {
+      struct vki_mnt_id_req *req = (struct vki_mnt_id_req *)(Addr)ARG1;
+      PRE_MEM_READ("statmount(req)", ARG1, req->size);
+   }

I didn't caught this in the previous review, sorry.
But since you are going to dereference req->size the ARG1 check really should be something like:
struct vki_mnt_id_req *req = (struct vki_mnt_id_req *)(Addr)ARG1;
if (!ML_(safe_to_deref) ((void *)&req->size, sizeof(vki_size_t))) ...

And although I said that if it was NULL (or not safe_to_deref) I would just want to see a warning I meant from PRE_MEM_READ not really a separate one. So I would write it as:

struct vki_mnt_id_req *req = (struct vki_mnt_id_req *)(Addr)ARG1;
if (!ML_(safe_to_deref) ((void *)&req->size, sizeof(vki_size_t)))
   PRE_MEM_READ("statmount(req)", ARG1, sizeof(struct vki_mnt_id_req)); // This will yell and scream because we already know req->size isn't valid.
else
   PRE_MEM_READ("statmount(req)", ARG1, req->size);

+   if (ARG2 == 0) {
+      if (VG_(clo_verbosity) >= 1) {
+         VG_(message)(Vg_UserMsg, "Warning: client syscall sys_statmount: ARG2 is NULL");
+      }
+   } else if (ARG3 > 0) {
+      PRE_MEM_WRITE("statmount(smbuf)", ARG2, ARG3);
+   }

Same here.

+PRE(sys_listmount)
+{
+   // int syscall(SYS_listmount, struct mnt_id_req *req,
+   //             uint64_t *mnt_ids, size_t nr_mnt_ids,
+   //             unsigned long flags);
+   FUSE_COMPATIBLE_MAY_BLOCK();
+   PRINT("sys_listmount ( %#" FMT_REGWORD "x, %#" FMT_REGWORD "x, %lu,  %#" FMT_REGWORD "x)", ARG1, ARG2, ARG3, ARG4);
+   PRE_REG_READ4(long, "listmount", struct vki_mnt_id_req *, req, vki_uint64_t *, mnt_ids, vki_size_t, nr_mnt_ids, unsigned long, flags);
+   if (ARG1 == 0) {
+      if (VG_(clo_verbosity) >= 1) {
+         VG_(message)(Vg_UserMsg, "Warning: client syscall sys_listmount: ARG1 is NULL");
+      }
+   } else {
+      struct vki_mnt_id_req *req = (struct vki_mnt_id_req *)(Addr)ARG1;
+      PRE_MEM_READ("listmount(req)", ARG1, req->size);
+   }

And here.

+   if (ARG2 == 0) {
+      if (VG_(clo_verbosity) >= 1) {
+         VG_(message)(Vg_UserMsg, "Warning: client syscall sys_listmount: ARG2 is NULL");
+      }
+   }
+   else if (ARG3 > 0) {
+      PRE_MEM_WRITE("listmount(mnt_ids)", ARG2, ARG3 * sizeof(vki_uint64_t));
+   }

And I think this can just be an unconditional:
   PRE_MEM_WRITE("listmount(mnt_ids)", ARG2, ARG3 * sizeof(vki_uint64_t));
Comment 9 mcermak 2025-06-27 20:43:26 UTC
Created attachment 182744 [details]
proposed patch

Hello Mark, thank you for your review.  The attached patch tries to address your comments.  Is it correct?
Comment 10 Mark Wielaard 2025-06-27 21:21:48 UTC
(In reply to mcermak from comment #9)
> Created attachment 182744 [details]
> proposed patch
> 
> Hello Mark, thank you for your review.  The attached patch tries to address
> your comments.  Is it correct?

Very nice. With the ltp listmount4 we now get:

==317967== Syscall param listmount(req) points to unaddressable byte(s)
==317967==    at 0x4961A8D: syscall (syscall.S:38)
==317967==    by 0x4012FB: run (listmount04.c:130)
==317967==    by 0x402CBC: run_tests (tst_test.c:1566)
==317967==    by 0x402CBC: testrun (tst_test.c:1632)
==317967==    by 0x402CBC: fork_testrun (tst_test.c:1771)
==317967==    by 0x404E9F: tst_run_tcases (tst_test.c:1918)
==317967==    by 0x40117D: main (tst_test.h:725)
==317967==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==317967== 
listmount04.c:130: TPASS: request points to unaccessible memory : EFAULT (14)
==317967== Syscall param listmount(mnt_ids) points to unaddressable byte(s)
==317967==    at 0x4961A8D: syscall (syscall.S:38)
==317967==    by 0x4012FB: run (listmount04.c:130)
==317967==    by 0x402CBC: run_tests (tst_test.c:1566)
==317967==    by 0x402CBC: testrun (tst_test.c:1632)
==317967==    by 0x402CBC: fork_testrun (tst_test.c:1771)
==317967==    by 0x404E9F: tst_run_tcases (tst_test.c:1918)
==317967==    by 0x40117D: main (tst_test.h:725)
==317967==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==317967== 
listmount04.c:130: TPASS: mnt_ids points to unaccessible memory : EFAULT (14)

Pushed as:

commit 57152acfc6a82baa58f0d2fa0409290278bafde7 (HEAD -> master, origin/master, origin/HEAD)
Author: Martin Cermak <mcermak@redhat.com>
Date:   Fri Jun 27 22:36:03 2025 +0200

    Wrap linux specific syscalls 457 (listmount) and 458 (statmount)
    
    The listmount syscall returns a list of mount IDs under the req.mnt_id.
    This is meant to be used in conjunction with statmount(2) in order to
    provide a way to iterate and discover mounted file systems.
    
    The statmount syscall returns information about a mount, storing it in
    the buffer pointed to by smbuf.  The returned buffer is a struct
    statmount which is of size bufsize.
    
    Declare a sys_{lis,sta}tmount wrapper in priv_syswrap-linux.h and hook it
    for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,s390x,x86}-linux
    using LINXY with PRE and POST handler in syswrap-linux.c
    
    Both syscalls need CAP_SYS_ADMIN, to successfully test.
    
    Resolves: https://bugs.kde.org/show_bug.cgi?id=502968