Bug 492422 - Please support DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD
Summary: Please support DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.23.0
Platform: Fedora RPMs Linux
: NOR normal
Target Milestone: ---
Assignee: Mark Wielaard
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-08-30 20:18 UTC by Michael Catanzaro
Modified: 2024-11-27 13:10 UTC (History)
5 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Quick and dirty record_fd_open for DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD (1.63 KB, text/plain)
2024-08-30 21:36 UTC, Mark Wielaard
Details
quick and dirty fd handling for ioctls drm_ioctl_suncobj_handle_to_fd, DRM_IOCTL_PRIME_FD_TO_HANDLE and SYNC_IOC_MERGE (2.39 KB, text/plain)
2024-08-30 22:52 UTC, Mark Wielaard
Details
Fixup of sync_file defines plus fd tracking for 3 ioctls (4.42 KB, text/plain)
2024-08-31 00:54 UTC, Mark Wielaard
Details
Fixup of sync_file defines plus fd tracking for 3 ioctls (4.42 KB, text/plain)
2024-08-31 15:16 UTC, Mark Wielaard
Details
Implement stable variant of sync_file ioctls (4.99 KB, text/plain)
2024-08-31 17:51 UTC, Mark Wielaard
Details
Combined valgrind+strace log (1.07 MB, text/plain)
2024-09-13 20:59 UTC, nilskemail+kde
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Catanzaro 2024-08-30 20:18:10 UTC
SUMMARY

It's hard to use valgrind to debug file descriptor errors in GTK applications because valgrind doesn't understand that DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD may allocate a file descriptor.

STEPS TO REPRODUCE
1. GSK_RENDERER=vulkan valgrind --tool=none --track-fds=yes gnome-calculator

OBSERVED RESULT

valgrind will print some actual fd bugs in GTK, but it will be very hard to tell what is a real bug and what is not, because valgrind will also print a bunch of false positive complaints that look like this:

==34793== File descriptor 27: /dmabuf: is already closed
==34793==    at 0x63C7F9C: close (close.c:27)
==34793==    by 0x397D1542: wsi_create_sync_for_image_syncobj (wsi_common_drm.c:410)
==34793==    by 0x397CEE67: wsi_common_acquire_next_image2 (wsi_common.c:1205)
==34793==    by 0x397CEC92: wsi_AcquireNextImageKHR (wsi_common.c:1186)
==34793==    by 0x4EEA265: gdk_vulkan_context_begin_frame (gdkvulkancontext.c:652)
==34793==    by 0x4EBAA46: gdk_draw_context_begin_frame_full (gdkdrawcontext.c:399)
==34793==    by 0x4F9D169: gsk_gpu_frame_default_begin (gskgpuframe.c:87)
==34793==    by 0x4FB8B4F: gsk_vulkan_frame_begin (gskvulkanframe.c:160)
==34793==    by 0x4F9D4C8: gsk_gpu_frame_begin (gskgpuframe.c:209)
==34793==    by 0x4FB0AC1: gsk_gpu_renderer_render (gskgpurenderer.c:446)
==34793==    by 0x4F0CC24: gsk_renderer_render (gskrenderer.c:495)
==34793==    by 0x4BCC5C4: gtk_widget_render (gtkwidget.c:12012)

It will also print the backtrace to where the fd was previously allocated and previously closed, but this is irrelevant. wsi_create_sync_for_image_syncobj() is getting the fds via the DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD ioctl before closing them, so they're coming straight from the kernel. valgrind just doesn't know about this.

EXPECTED RESULT

No bogus complaints

SOFTWARE/OS VERSIONS

Fedora 40: gtk4-4.14.4-2.fc40, valgrind-1:3.23.0-4.fc40

ADDITIONAL INFORMATION

I've figured out that we need to add VKI_DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD and struct vki_drm_syncobj_handle in vki-linux-drm.h, and then add ML_(record_fd_open_nameless) somewhere in syswrap-linux.c. But that looks a little complicated.
Comment 1 Michael Catanzaro 2024-08-30 20:30:28 UTC
Unfortunately Benjamin thinks we'll need a lot more than one new ioctl for `valgrind --track-fds=yes` to be useful for GTK applications, including at least  SYNC_IOC_MERGE and DRM_IOCTL_PRIME_HANDLE_TO_FD.

I also forgot to include a link to my GTK issue report: https://gitlab.gnome.org/GNOME/gtk/-/issues/6969
Comment 2 Mark Wielaard 2024-08-30 21:36:40 UTC
Created attachment 173142 [details]
Quick and dirty record_fd_open for DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD

Thanks for the report. I was unaware that there were ioctls that created fds. But of course, why not.
Attached is a quick and dirty patch to record the fd from DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD.

I'll take a look at those other ioctls. Is there a description of how the DRM_IOCTLs handle fds in general? Are there any that close/destroy an fd? Or are you always supposed to use the "normal" close syscall on them?
Comment 3 Mark Wielaard 2024-08-30 22:52:14 UTC
Created attachment 173148 [details]
quick and dirty fd handling for ioctls drm_ioctl_suncobj_handle_to_fd, DRM_IOCTL_PRIME_FD_TO_HANDLE and SYNC_IOC_MERGE

This should handle new fds from all three ioctls.
Not sure this is enough, but should produce a bit better --track-fds=yes results.
Could you test with this?
Comment 4 Michael Catanzaro 2024-08-30 23:07:12 UTC
(In reply to Mark Wielaard from comment #2)
> I'll take a look at those other ioctls. Is there a description of how the
> DRM_IOCTLs handle fds in general? Are there any that close/destroy an fd? Or
> are you always supposed to use the "normal" close syscall on them?

I don't know, but I'll ask graphics developers to answer this.
Comment 5 Michael Catanzaro 2024-08-30 23:27:50 UTC
(In reply to Mark Wielaard from comment #3)
> Could you test with this?

It fixes the issue I reported. Thank you!

But there is another one that looks almost identical:

==130394== File descriptor 28: file descriptor 28 is already closed
==130394==    at 0x60F9F9C: close (close.c:27)
==130394==    by 0x4339341C: wsi_create_sync_for_image_syncobj (wsi_common_drm.c:413)
==130394==    by 0x43390E67: wsi_common_acquire_next_image2 (wsi_common.c:1205)
==130394==    by 0x43390C92: wsi_AcquireNextImageKHR (wsi_common.c:1186)
==130394==    by 0x50E0265: gdk_vulkan_context_begin_frame (gdkvulkancontext.c:652)
==130394==    by 0x50B0A46: gdk_draw_context_begin_frame_full (gdkdrawcontext.c:399)
==130394==    by 0x5193169: gsk_gpu_frame_default_begin (gskgpuframe.c:87)
==130394==    by 0x51AEB4F: gsk_vulkan_frame_begin (gskvulkanframe.c:160)
==130394==    by 0x51934C8: gsk_gpu_frame_begin (gskgpuframe.c:209)
==130394==    by 0x51A6AC1: gsk_gpu_renderer_render (gskgpurenderer.c:446)
==130394==    by 0x5102C24: gsk_renderer_render (gskrenderer.c:495)
==130394==    by 0x4DC25C4: gtk_widget_render (gtkwidget.c:12012)

The difference is the line number changed from wsi_common_drm.c:410 to wsi_common_drm.c:413. This time it's a different file descriptor coming from a different ioctl: SYNC_IOC_MERGE. (The line numbers are from mesa 24.1.6-1.fc40.) I found some documentation here though: https://github.com/torvalds/linux/blob/fb24560f31f9dff2c97707cfed6029bfebebaf1c/include/uapi/linux/sync_file.h#L19 and it looks like the second fd2 is opened.
Comment 6 Mark Wielaard 2024-08-31 00:13:07 UTC
Urgh, somehow the VKI_SYNC_IOC_MERGE macro in valgrind is broken/wrong.

If you add this patch on top things should work.
Scratching my head now what we did wrong and why we need to hardcode the hex value.

diff --git a/include/vki/vki-linux.h b/include/vki/vki-linux.h
index ccdb808af7a6..f10d9a2761e5 100644
--- a/include/vki/vki-linux.h
+++ b/include/vki/vki-linux.h
@@ -3884,8 +3884,8 @@ struct vki_sync_fence_info_data {
 #define VKI_SYNC_IOC_WAIT \
    _VKI_IOW(VKI_SYNC_IOC_MAGIC, 0, __vki_s32)
 
-#define VKI_SYNC_IOC_MERGE \
-   _VKI_IOWR(VKI_SYNC_IOC_MAGIC, 1, struct vki_sync_merge_data)
+#define VKI_SYNC_IOC_MERGE 0xc0303e03
+//    _VKI_IOWR(VKI_SYNC_IOC_MAGIC, 1, struct vki_sync_merge_data)
 
 #define VKI_SYNC_IOC_FENCE_INFO \
    _VKI_IOWR(VKI_SYNC_IOC_MAGIC, 2, struct vki_sync_fence_info_data)
Comment 7 Mark Wielaard 2024-08-31 00:35:55 UTC
(In reply to Mark Wielaard from comment #6)
> Urgh, somehow the VKI_SYNC_IOC_MERGE macro in valgrind is broken/wrong.

O no. We got the constant and data structure definitions for sync_file from some ancient android linux kernel.
Turns out that when merged into the upstream linux kernel the ABI changed...

include/uapi/linux/sync_file.h says:

/*
 * Opcodes  0, 1 and 2 were burned during a API change to avoid users of the
 * old API to get weird errors when trying to handling sync_files. The API
 * change happened during the de-stage of the Sync Framework when there was
 * no upstream users available.
 */

Will update the valgrind defintions to match.
Comment 8 Mark Wielaard 2024-08-31 00:54:39 UTC
Created attachment 173151 [details]
Fixup of sync_file defines plus fd tracking for 3 ioctls

OK. Updated the sync_file definitions sothe SYNC_IOC_MERGE ioctl is now properly captured.
The patch only updates the fd tracking for the ioctls for now.

Could you test this updated patch to see if it handles file descriptor tracking for all three ioctls now?
Comment 9 Paul Floyd 2024-08-31 04:48:35 UTC
Double ugh. I doubt that any of these ioctls are portable to any other platforms. I need to do some checking.
Comment 10 Michael Catanzaro 2024-08-31 14:20:09 UTC
(In reply to Mark Wielaard from comment #8)
> Could you test this updated patch to see if it handles file descriptor
> tracking for all three ioctls now?

Yes. There is a massive improvement in the results I see.

There are only a few fd reuse errors remaining, and they all look like this:

==14900== File descriptor 17: /memfd:mutter-shared (deleted) is already closed
==14900==    at 0x60F9F9C: close (close.c:27)
==14900==    by 0x433901CD: wsi_destroy_image (wsi_common.c:789)
==14900==    by 0x4339B192: wsi_wl_swapchain_images_free (wsi_common_wayland.c:2238)
==14900==    by 0x4339B1F5: wsi_wl_swapchain_destroy (wsi_common_wayland.c:2306)
==14900==    by 0x50DF509: gdk_vulkan_context_dispose (gdkvulkancontext.c:370)
==14900==    by 0x49E9E7A: g_object_unref (gobject.c:4415)
==14900==    by 0x51A5F35: gsk_gpu_renderer_unrealize (gskgpurenderer.c:247)
==14900==    by 0x51409FC: gsk_vulkan_renderer_unrealize (gskvulkanrenderer.c:142)
==14900==    by 0x5102635: gsk_renderer_unrealize (gskrenderer.c:380)
==14900==    by 0x4DD12A7: gtk_window_unrealize (gtkwindow.c:4407)
==14900==    by 0x4B2AEB3: gtk_application_window_real_unrealize (gtkapplicationwindow.c:528)
==14900==    by 0x49DEE79: g_cclosure_marshal_VOID__VOID (gmarshal.c:117)

I suspect we're just missing one last ioctl that creates fds. I will investigate.

And then there are several fd leaks reported. Some of them are surely real, but some look like we're perhaps missing a DRM ioctl that closes fds? Probably best for me to investigate and create a separate bug report if valgrind turns out to be to blame.
Comment 11 Michael Catanzaro 2024-08-31 14:58:09 UTC
(In reply to Michael Catanzaro from comment #10)
> I suspect we're just missing one last ioctl that creates fds. I will
> investigate.

I believe it's coming from DRM_IOCTL_PRIME_HANDLE_TO_FD. (It's coming from driver-specific code, unfortunately. I only investigated the amdgpu code.)
Comment 12 Mark Wielaard 2024-08-31 15:16:49 UTC
Created attachment 173172 [details]
Fixup of sync_file defines plus fd tracking for 3 ioctls

(In reply to Michael Catanzaro from comment #11)
> (In reply to Michael Catanzaro from comment #10)
> > I suspect we're just missing one last ioctl that creates fds. I will
> > investigate.
> 
> I believe it's coming from DRM_IOCTL_PRIME_HANDLE_TO_FD. (It's coming from
> driver-specific code, unfortunately. I only investigated the amdgpu code.)

Groan. That is because I mixed up DRM_IOCTL_PRIME_HANDLE_TO_FD and DRM_IOCTL_PRIME_FD_TO_HANDLE.

This patch fixes that mixup.

But I really should write a clean patch (or two).
Comment 13 Michael Catanzaro 2024-08-31 16:35:57 UTC
I am only just now realizing that both of the ioctls that I mentioned as follow-ups, SYNC_IOC_MERGE and DRM_IOCTL_PRIME_HANDLE_TO_FD, are also the ones that I mentioned in comment #1 and which you had already attempted to support. Well, this is why testing is important. :)
Comment 14 Michael Catanzaro 2024-08-31 16:39:08 UTC
Good news: with your latest patch, there are no more fd errors other than fd leaks. If I determine that any of these look like valgrind issues, then I'll report a follow-up bug.
Comment 15 Mark Wielaard 2024-08-31 17:51:53 UTC
Created attachment 173179 [details]
Implement stable variant of sync_file ioctls

We implemented an old staging android variant of the sync_file
ioctls. But the data structures and ioctl numbers changed when these
were upstreamed in the table linux kernel.

This implements the SYNC_IOC_MERGE, SYNC_IOC_FILE_INFO and
SYNC_IOC_SET_DEADLINE ioctls. And makes sure to record the new file
descriptor created by SYNC_IOC_MERGE.
Comment 16 Michael Catanzaro 2024-09-09 14:47:18 UTC
Are we waiting for code review now?
Comment 17 Mark Wielaard 2024-09-09 20:45:09 UTC
Sorry, we are waiting for me to cleanup the DRM_IOCTL related ioctls.
I did the SYNC_IOC (sync_file) ioctls cleanly in the attachment to comment #15.
Just need to go over the DRM_IOCTL ioctrls to make sure we handle them all properly.
Comment 18 nilskemail+kde 2024-09-13 20:01:46 UTC
I noticed a few other cases of valgrind reporting a duplicate close of dmabuf fds:

==727618== File descriptor 30: /dmabuf:727694-WPEWebProcess is already closed
==727618==    at 0xA859F9C: close (close.c:27)
==727618==    by 0x54063871: create_wl_buffer (platform_wayland.c:1422)
==727618==    by 0x54063A9A: dri2_wl_create_wayland_buffer_from_image (platform_wayland.c:1634)
==727618==    by 0x54056736: dri2_create_wayland_buffer_from_image (egl_dri2.c:2413)
==727618==    by 0x5404F660: eglCreateWaylandBufferFromImageWL (eglapi.c:2395)
==727618==    by 0x4856B97: on_export_wl_egl_image (cog-platform-wl.c:1708)
==727618==    by 0xC885055: ffi_call_unix64 (unix64.S:104)
==727618==    by 0xC88169F: ffi_call_int.lto_priv.0 (ffi64.c:673)
==727618==    by 0xC8844ED: ffi_call (ffi64.c:710)
==727618==    by 0xFCF1B22: wl_closure_invoke.constprop.0 (connection.c:1228)
==727618==    by 0xFCF6831: wl_client_connection_data (wayland-server.c:444)
==727618==    by 0xFCF4C91: wl_event_loop_dispatch (event-loop.c:1105)

This is running with mesa 24.1.7 so the accompanying code where it gets closed is https://gitlab.freedesktop.org/mesa/mesa/-/blob/mesa-24.1.7/src/egl/drivers/dri2/platform_wayland.c?ref_type=tags#L1422 though I am not as experienced as Michael so I could not track down where they get allocated.
Comment 19 Mark Wielaard 2024-09-13 20:16:14 UTC
(In reply to nilskemail+kde from comment #18)
> I noticed a few other cases of valgrind reporting a duplicate close of
> dmabuf fds:
> 
> ==727618== File descriptor 30: /dmabuf:727694-WPEWebProcess is already closed
> ==727618==    at 0xA859F9C: close (close.c:27)
> ==727618==    by 0x54063871: create_wl_buffer (platform_wayland.c:1422)
> ==727618==    by 0x54063A9A: dri2_wl_create_wayland_buffer_from_image
> (platform_wayland.c:1634)
> ==727618==    by 0x54056736: dri2_create_wayland_buffer_from_image
> (egl_dri2.c:2413)
> ==727618==    by 0x5404F660: eglCreateWaylandBufferFromImageWL
> (eglapi.c:2395)
> ==727618==    by 0x4856B97: on_export_wl_egl_image (cog-platform-wl.c:1708)
> ==727618==    by 0xC885055: ffi_call_unix64 (unix64.S:104)
> ==727618==    by 0xC88169F: ffi_call_int.lto_priv.0 (ffi64.c:673)
> ==727618==    by 0xC8844ED: ffi_call (ffi64.c:710)
> ==727618==    by 0xFCF1B22: wl_closure_invoke.constprop.0 (connection.c:1228)
> ==727618==    by 0xFCF6831: wl_client_connection_data (wayland-server.c:444)
> ==727618==    by 0xFCF4C91: wl_event_loop_dispatch (event-loop.c:1105)
> 
> This is running with mesa 24.1.7 so the accompanying code where it gets
> closed is
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/mesa-24.1.7/src/egl/drivers/
> dri2/platform_wayland.c?ref_type=tags#L1422 though I am not as experienced
> as Michael so I could not track down where they get allocated.

I think this comes from DMA_BUF_IOCTL_EXPORT_SYNC_FILE you could check with strace if your program calls that ioctl.
valgrind doesn't seem to track any of the DMA_BUF_IOCTLs at the moment. Maybe we should make that a separate bug.
Comment 20 nilskemail+kde 2024-09-13 20:41:44 UTC
Some others are

==731594== File descriptor 101: /dmabuf:731594-WPEWebProcess is already closed
==731594==    at 0xAE52F9C: close (close.c:27)
==731594==    by 0xBB6358E: gst_fd_mem_free (gstfdmemory.c:71)
==731594==    by 0xA90DDC4: _gst_memory_free (gstmemory.c:98)
==731594==    by 0xE3F1797D: gst_va_memory_pool_flush_unlocked (gstmemory.h:341)
==731594==    by 0xE3F17B91: gst_va_dmabuf_allocator_flush (gstvaallocator.c:208)
==731594==    by 0xE3F17C7B: gst_va_pool_stop (gstvapool.c:354)
==731594==    by 0xA8CEDCC: do_stop (gstbufferpool.c:440)
==731594==    by 0xA8D70CF: gst_buffer_pool_release_buffer (gstbufferpool.c:1207)
==731594==    by 0xA8D7117: _gst_buffer_dispose (gstbuffer.c:789)
==731594==    by 0xA90D033: gst_mini_object_unref (gstminiobject.c:669)
==731594==    by 0xA937C09: _gst_sample_free (gstsample.c:87)
==731594==    by 0x8644829: WebCore::MediaPlayerPrivateGStreamer::~MediaPlayerPrivateGStreamer() (gstsample.h:102)

and

==731594== File descriptor 225: /home/septatrix/.cache/mesa_shader_cache/05/8d39eff88a3d58e9e8462bc10880da9d2cd734 is already closed
==731594==    at 0xAE52F9C: close (close.c:27)
==731594==    by 0xE3F19E69: gst_va_dmabuf_get_modifier_for_format (gstvaallocator.c:552)
==731594==    by 0xC7FB04C4: gst_va_create_dma_caps (gstvacaps.c:235)
==731594==    by 0xC7FB0B31: gst_va_create_raw_caps_from_config (gstvacaps.c:376)
==731594==    by 0xC7F9C48D: gst_va_base_dec_src_query (gstvadecoder.c:419)
==731594==    by 0xA916458: gst_pad_query (gstpad.c:4228)
==731594==    by 0xA960AEB: gst_pad_query_caps (gstutils.c:3118)
==731594==    by 0xA913015: gst_pad_get_allowed_caps (gstpad.c:2855)
==731594==    by 0xC7FA4B1D: gst_va_base_dec_get_preferred_format_and_caps_features (gstvabasedec.c:869)
==731594==    by 0xC7FA58B4: gst_va_base_dec_set_output_state (gstvabasedec.c:1139)
==731594==    by 0xC7FA5C03: gst_va_base_dec_negotiate (gstvabasedec.c:694)
==731594==    by 0xBD5702D: gst_video_decoder_negotiate (gstvideodecoder.c:4549)

and

==731594== File descriptor 24: file descriptor 24 is already closed
==731594==    at 0xAE52F9C: close (close.c:27)
==731594==    by 0x52FE2871: create_wl_buffer (platform_wayland.c:1422)
==731594==    by 0x52FE2E4B: dri2_wl_swap_buffers_with_damage (platform_wayland.c:1540)
==731594==    by 0x52FD7109: dri2_swap_buffers (egl_dri2.c:1886)
==731594==    by 0x52FCAAB7: eglSwapBuffers (eglapi.c:1447)
==731594==    by 0x513FC36: WTF::Detail::CallableWrapper<WebKit::ThreadedCompositor::ThreadedCompositor(WebKit::ThreadedCompositor::Client&, unsigned int, WebCore::IntSize const&, float, bool)::$_0, void>::call() (Source/WebCore/platform/graphics/egl/GLContext.cpp:466)
==731594==    by 0x69C7BE4: WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::__invoke(void*) (Source/WTF/wtf/glib/RunLoopGLib.cpp:177)
==731594==    by 0x69C6CF0: WTF::RunLoop::$_0::__invoke(_GSource*, int (*)(void*), void*) (Source/WTF/wtf/glib/RunLoopGLib.cpp:53)
==731594==    by 0xA674E8B: g_main_context_dispatch_unlocked.lto_priv.0 (gmain.c:3344)
==731594==    by 0xA6D6C97: g_main_context_iterate_unlocked.isra.0 (gmain.c:4217)
==731594==    by 0xA67AF36: g_main_loop_run (gmain.c:4419)
==731594==    by 0x69C7688: WTF::RunLoop::run() (Source/WTF/wtf/glib/RunLoopGLib.cpp:108)
Comment 21 nilskemail+kde 2024-09-13 20:58:02 UTC
(In reply to Mark Wielaard from comment #19)
> (In reply to nilskemail+kde from comment #18)
> > I noticed a few other cases of valgrind reporting a duplicate close of
> > dmabuf fds:
> >
> > ... 
> >
> I think this comes from DMA_BUF_IOCTL_EXPORT_SYNC_FILE you could check with
> strace if your program calls that ioctl.
> valgrind doesn't seem to track any of the DMA_BUF_IOCTLs at the moment.
> Maybe we should make that a separate bug.

No, strace does not report that anywhere. I will attach the log of a combined strace+valgrind run
Comment 22 nilskemail+kde 2024-09-13 20:59:01 UTC
Created attachment 173642 [details]
Combined valgrind+strace log
Comment 23 nilskemail+kde 2024-09-13 21:02:09 UTC
Quick note, not that I am just dumb: I only applied your most recent patch file "Implement stable variant of sync_file ioctls". Do I also need to apply "Fixup of sync_file defines plus fd tracking for 3 ioctls" additionally?
Comment 24 Mark Wielaard 2024-09-14 10:58:04 UTC
(In reply to nilskemail+kde from comment #23)
> Quick note, not that I am just dumb: I only applied your most recent patch
> file "Implement stable variant of sync_file ioctls". Do I also need to apply
> "Fixup of sync_file defines plus fd tracking for 3 ioctls" additionally?

Yeah, sorry. The " Implement stable variant of sync_file ioctls" patch only implements the SYNC_IOC_* ioctls.
The "Fixup of sync_file defines plus fd tracking for 3 ioctls" does both a hacky SYNC_IOC_MERGE implementation and two DRM_IOCTLs.
I still should do a proper patch for the DRM_IOCTLs.
Comment 25 Paul Floyd 2024-09-14 11:13:34 UTC
I’ll also need this ioctl for FreeBSD.
Comment 26 nilskemail+kde 2024-09-14 11:49:44 UTC
(In reply to Mark Wielaard from comment #24)
> (In reply to nilskemail+kde from comment #23)
> > Quick note, not that I am just dumb: I only applied your most recent patch
> > file "Implement stable variant of sync_file ioctls". Do I also need to apply
> > "Fixup of sync_file defines plus fd tracking for 3 ioctls" additionally?
> 
> Yeah, sorry. The " Implement stable variant of sync_file ioctls" patch only
> implements the SYNC_IOC_* ioctls.
> The "Fixup of sync_file defines plus fd tracking for 3 ioctls" does both a
> hacky SYNC_IOC_MERGE implementation and two DRM_IOCTLs.
> I still should do a proper patch for the DRM_IOCTLs.

Okay, applying both patches (with some guesswork what to chose for the merge conflicts) fixed my issues and I no longer get "already closed" errors. Sorry for the confusion.
Comment 27 Paul Floyd 2024-09-14 19:17:28 UTC
(In reply to Paul Floyd from comment #25)
> I’ll also need this ioctl for FreeBSD.

There's no DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD but there is DRM_IOCTL_PRIME_HANDLE_TO_FD.
Comment 28 Mark Wielaard 2024-10-05 12:47:00 UTC
I pushed the first part:

commit eee8dd6461adb46e560f3e3981a069a6895cc241 (HEAD -> master)
Author: Mark Wielaard <mark@klomp.org>
Date:   Sat Aug 31 19:47:27 2024 +0200

    Implement stable variant of sync_file ioctls
    
    We implemented an old staging android variant of the sync_file
    ioctls. But the data structures and ioctl numbers changed when these
    were upstreamed in the table linux kernel.
    
    This implements the SYNC_IOC_MERGE, SYNC_IOC_FILE_INFO and
    SYNC_IOC_SET_DEADLINE ioctls. And makes sure to record the new file
    descriptor created by SYNC_IOC_MERGE.
    
    https://bugs.kde.org/show_bug.cgi?id=492422
Comment 29 Mark Wielaard 2024-10-05 22:09:08 UTC
There are more than 100 DRM_IOCTLs, or more than 400 when including all the driver specific variants.

There are only a handful of (generic) DRM_IOCTLs dealing with file descriptors:
DRM_IOCTL_MODE_CREATE_LEASE *
DRM_IOCTL_PRIME_HANDLE_TO_FD *
DRM_IOCTL_PRIME_FD_TO_HANDLE
DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD *
DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE
DRM_IOCTL_SYNCOBJ_EVENTFD

* Only these 3 create a new fd

Then there are DMA_BUF_IOCTL_EXPORT_SYNC_FILE and DMA_BUF_IOCTL_IMPORT_SYNC_FILE
Comment 30 Mark Wielaard 2024-11-01 03:34:12 UTC
commit 6ec6e12678273e3b09ddf9ae07d11d4ac0c91d7a
Author: Mark Wielaard <mark@klomp.org>
Date:   Fri Nov 1 04:26:45 2024 +0100

    DRM_IOCTLs SYNCOBJ_HANDLE_TO_FD, PRIME_HANDLE_TO_FD and MODE_CREATE_LEASE
    
    These three DRM_IOCTLs create new file descriptors, so track them using
    ML_(record_fd_open_nameless).
    
    https://bugs.kde.org/show_bug.cgi?id=492422
Comment 31 Michael Catanzaro 2024-11-26 23:30:15 UTC
Is there more needed here?
Comment 32 Mark Wielaard 2024-11-26 23:39:18 UTC
(In reply to Michael Catanzaro from comment #31)
> Is there more needed here?

It is kind of hard to say. As comment #29 says there are hundreds of drm related ioctl. I think we now cover the most used ones that handle file descriptors. But we are still missing at least DMA_BUF_IOCTL_EXPORT_SYNC_FILE and DMA_BUF_IOCTL_IMPORT_SYNC_FILE.

Maybe the best way to answer "if more is needed? " is to ask the question back :)
Have you tried --track-fds=yes on your workload with valgrind 3.24.0 (which was released 4 weeks ago)?
Does it still show "untracked" file descriptors?
Comment 33 Michael Catanzaro 2024-11-27 00:25:25 UTC
The issues I was noticing are all fixed. Nice, thanks!

I guess we need to decide whether to close this bug now, or leave it open for DMA_BUF_IOCTL_EXPORT_SYNC_FILE and DMA_BUF_IOCTL_IMPORT_SYNC_FILE.
Comment 34 Mark Wielaard 2024-11-27 13:10:48 UTC
Lets close this bug as resolved.
I opened a new one for the mission ioctls.
https://bugs.kde.org/show_bug.cgi?id=496751