Bug 462346 - GNOME 43 application cause kwin_wayland to freeze for 25 seconds after interacting with header bar buttons or menus while focus tracking is turned on in the Zoom effect
Summary: GNOME 43 application cause kwin_wayland to freeze for 25 seconds after intera...
Status: RESOLVED DUPLICATE of bug 450940
Alias: None
Product: kwin
Classification: Plasma
Component: wayland-generic (other bugs)
Version First Reported In: git master
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-11-28 10:11 UTC by Oded Arbel
Modified: 2023-12-08 20:25 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Attachments
Thread dump while waing for GNOME Console header bar menu to appear (25.16 KB, text/plain)
2022-11-28 10:12 UTC, Oded Arbel
Details
Thread dump while waing for GNOME Text Editor header bar close button click to show the "unsaved files" dialog. (26.01 KB, text/plain)
2022-11-28 10:13 UTC, Oded Arbel
Details
Output of dbus-monitor while waiting for the GNOME Console header bar menu to appear (10.33 KB, text/plain)
2022-11-28 10:20 UTC, Oded Arbel
Details
My kwinrc file (8.93 KB, text/plain)
2022-11-29 07:25 UTC, Oded Arbel
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Oded Arbel 2022-11-28 10:11:40 UTC
SUMMARY

When using GNOME 43 applications under kwin_wayland, every time the user clicks on a header bar button, or a header bar menu action, kwin freezes for about a minute - there's no mouse cursor movement, nor does it respond to any key - not even vt switching keys.

After about a minute, the action completes and everything continues normally, including the action that was requested by the mouse click. This was observed with GNOME Text Editor and GNOME Console.

- Kwin is installed from neon unstable - currently at 4:5.26.3+p22.04+tunstable+git20221124.0014-0
- GNOME applications where installed from DEB files built for Ubuntu kinetic - text editor is at 43.1-1ubuntu1, console is at 43.0-2

The same use case with kwin_x11 (currently at 4:5.26.3+p22.04+tunstable+git20221124.0014-0) works fine.

STEPS TO REPRODUCE
1. Start a Plasma Wayland session
2. Launch a GNOME 43 application (such as Text Editor or Console)
3. Use the mouse cursor to press the application's close button (or any other header bar button).

OBSERVED RESULT
The entire graphical session stops responding for about a minute - nothing happens on the display and no key strokes are accepted. When the session unfreezes, key strokes sent during the freeze are processed. Whatever behavior the application expected to trigger happens normally.

EXPECTED RESULT
The application should behave normally, i.e. the close button should immediately start the application's close behavior.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
KDE Plasma Version: 5.26.80
KDE Frameworks Version: 5.101.0
Qt Version: 5.15.7

ADDITIONAL INFORMATION
I managed to get full thread back traces of kwin while it is frozen, and it seems to be stuck on waiting for a dbus message reply. I'll attach the back traces next.
Comment 1 Oded Arbel 2022-11-28 10:12:40 UTC
Created attachment 154102 [details]
Thread dump while waing for GNOME Console header bar menu to appear
Comment 2 Oded Arbel 2022-11-28 10:13:25 UTC
Created attachment 154103 [details]
Thread dump while waing for GNOME Text Editor header bar close button click to show the "unsaved files" dialog.
Comment 3 Oded Arbel 2022-11-28 10:20:37 UTC
Created attachment 154104 [details]
Output of dbus-monitor while waiting for the GNOME Console header bar menu to appear

This log is interleaved with stdout messages from the GNOME Console application (that was being launch from the console where dbus-monitor was in the background) and this can be identified by starting with "(kgx:21007)" (apparently, kgx is the executable file name for gnome-console 🤷)
Comment 4 Oded Arbel 2022-11-28 10:34:25 UTC
(In reply to Oded Arbel from comment #3)
> Created attachment 154104 [details]
> Output of dbus-monitor while waiting for the GNOME Console header bar menu
> to appear

Ok, I timed it again and the freeze is more like 25 seconds (it just feels like a minute). In the dbus monitor log I believe it happens between timestamp 1669630453 and 1669630479
Comment 5 Nicolas Fella 2022-11-29 01:36:47 UTC
25 seconds is exactly the timeout for a DBus call, which seems to be exactly what happens here

#7  0x00007f745ed9d656 in QDBusPendingCallPrivate::waitForFinished() (this=this@entry=0x5619ecfeed30) at ../../include/QtCore/../../src/corelib/kernel/qdeadlinetimer.h:68
#8  0x00007f745ed58f12 in QDBusConnectionPrivate::sendWithReply(QDBusMessage const&, int, int) (this=0x5619ebd2a330, message=<optimized out>, sendMode=1, timeout=<optimized out>) at ./src/dbus/qdbusintegrator.cpp:2088
#9  0x00007f745ed44a0b in QDBusConnection::call(QDBusMessage const&, QDBus::CallMode, int) const (this=this@entry=0x7ffcaddc5178, message=..., mode=mode@entry=QDBus::Block, timeout=timeout@entry=-1) at ./src/dbus/qdbusconnection.cpp:680
#10 0x00007f745eee2777 in QAccessibleClient::RegistryPrivate::supportedInterfaces(QAccessibleClient::AccessibleObject const&) const (object=..., this=0x5619ec7c5da0) at ./src/qaccessibilityclient/registry_p.cpp:855
#11 QAccessibleClient::RegistryPrivate::supportedInterfaces(QAccessibleClient::AccessibleObject const&) const (object=..., this=0x5619ec7c5da0) at ./src/qaccessibilityclient/registry_p.cpp:843
#12 QAccessibleClient::AccessibleObject::supportedInterfaces() const (this=this@entry=0x7ffcaddc5450) at ./src/qaccessibilityclient/accessibleobject.cpp:198
#13 0x00007f745eee52c4 in QAccessibleClient::AccessibleObject::focusPoint() const (this=0x7ffcaddc5450) at ./src/qaccessibilityclient/accessibleobject.cpp:302
#14 0x00005619eb259d64 in KWin::ZoomAccessibilityIntegration::slotFocusChanged(QAccessibleClient::AccessibleObject const&) (this=0x5619ec7c4e20, object=<optimized out>) at effects/zoom/./src/effects/zoom/accessibilityintegration.cpp:96
Comment 6 Nicolas Fella 2022-11-29 01:41:17 UTC
Can you please attach your ~/.config/kwinrc ?
Comment 7 Nicolas Fella 2022-11-29 02:00:08 UTC
Can reproduce with gnome-console when "Enable focus tracking" in the Zoom effect settings is enabled
Comment 8 Nicolas Fella 2022-11-29 02:05:02 UTC
Relevant DBus call is QDBusMessage message = QDBusMessage::createMethodCall(object.d->service, object.d->path, QLatin1String("org.a11y.atspi.Accessible"), QLatin1String("GetInterfaces"));

in https://invent.kde.org/libraries/libqaccessibilityclient/-/blob/master/src/qaccessibilityclient/registry_p.cpp#L851
Comment 9 Nicolas Fella 2022-11-29 02:14:23 UTC
The call is to a service like ":1.16" and path "/org/gnome/Console/a11y/8c7c8daa_c230_4323_b01b_a3a3afeafb2c"
Comment 10 Oded Arbel 2022-11-29 07:25:53 UTC
Created attachment 154128 [details]
My kwinrc file
Comment 11 Oded Arbel 2022-11-29 07:27:09 UTC
(In reply to Nicolas Fella from comment #7)
> Can reproduce with gnome-console when "Enable focus tracking" in the Zoom
> effect settings is enabled

Indeed, when I  turn off focus tracking in the Zoom desktop effect, the problem goes away.
Comment 12 Oded Arbel 2022-11-29 07:51:34 UTC
The service :1.16 on my system currently (I'm not sure if I haven't rebooted since the original report) is the gvfs daemon. It exposes paths like /org/gtk/vfs.

From looking that the interface that code is trying to access - org.a11y.atspi.Accessible - I can't find any service in my session that implements this. I am not a dbus expert, so I don't know how common it is for method calls to non-existing interfaces to block until timeout, but I think that kwin should not issue calls - on the main thread - that can block for 25 seconds...
Comment 13 Nicolas Fella 2022-11-30 13:52:57 UTC
:1.16 is an arbitrary id, it's going to belong to a different service for you and me. In particular it does seem to belong to the app in question. KWin makes lots of similar DBus calls, so it doesn't seem to be a problem of the interface not being available in general. If it was unavailable I think the call would simply fail instead of hanging.

I have a vague theory of what might be happening. If while KWin is waiting for the application to respond to the DBus call the app blockingly waits for KWin to do something that would explain the freeze. Since the problem is happening when clicking the menu button, perhaps the app is waiting for the menu to be shown?
Comment 14 Nicolas Fella 2022-11-30 14:00:29 UTC
backtrace of gnome-console while things are frozen:

#0  0x00007f7d61c0705f in __GI___poll (fds=fds@entry=0x7ffc8f6bfbb8, nfds=nfds@entry=1, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f7d611d14d4 in wl_display_poll (display=0x5621bab11e20, events=1) at ../src/wayland-client.c:1878
#2  wl_display_dispatch_queue (queue=<optimized out>, display=<optimized out>) at ../src/wayland-client.c:1951
#3  wl_display_dispatch_queue (display=0x5621bab11e20, queue=0x5621baf85d50) at ../src/wayland-client.c:1924
#4  0x00007f7d611d177f in wl_display_roundtrip_queue (display=0x5621bab11e20, queue=0x5621baf85d50) at ../src/wayland-client.c:1370
#5  0x00007f7d40447918 in roundtrip (dri2_dpy=0x5621bb009830) at ../src/egl/drivers/dri2/platform_wayland.c:241
#6  dri2_wl_create_window_surface (disp=<optimized out>, conf=<optimized out>, native_window=0x5621bc4a3400, attrib_list=<optimized out>) at ../src/egl/drivers/dri2/platform_wayland.c:737
#7  0x00007f7d4042f340 in _eglCreateWindowSurfaceCommon (disp=0x5621bb009200, config=<optimized out>, native_window=0x5621bc4a3400, attrib_list=0x0) at ../src/egl/main/eglapi.c:994
#8  0x00007f7d623af27f in gdk_surface_ensure_egl_surface (high_depth=0, self=0x5621bc33a110) at ../gdk/gdksurface.c:1120
#9  gdk_gl_context_real_begin_frame (draw_context=0x7f7d14004500, prefers_high_depth=0, region=0x5621bad46130) at ../gdk/gdkglcontext.c:564
#10 0x00007f7d6241863d in gsk_gl_renderer_render (renderer=0x5621bc2bd540, root=0x7f7d28002410, update_area=0x5621bc4a3790) at ../gsk/gskrendernode.c:739
#11 0x00007f7d62400c5a in gsk_renderer_render (renderer=0x5621bc2bd540, root=0x7f7d28002410, region=<optimized out>) at ../gsk/gskrenderer.c:463
#12 0x00007f7d622c6884 in gtk_widget_render (widget=<optimized out>, surface=<optimized out>, region=0x5621bab4afd0) at ../gtk/gtkwidget.c:11720
#13 0x00007f7d621d3dd9 in surface_render.lto_priv () at ../gtk/gtkpopover.c:819
#14 0x00007f7d623893b8 in _gdk_marshal_BOOLEAN__BOXEDv (closure=0x5621bab52a50, return_value=0x7ffc8f6c0060, instance=<optimized out>, args=<optimized out>, marshal_data=<optimized out>, n_params=<optimized out>, param_types=0x5621bac142e0) at gdk/gdkmarshalers.c:130
#15 0x00007f7d6292d554 in _g_closure_invoke_va (param_types=<optimized out>, n_params=<optimized out>, args=0x7ffc8f6c0110, instance=<optimized out>, return_value=<optimized out>, closure=0x5621bab52a50) at ../gobject/gclosure.c:895
#16 g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ffc8f6c0110) at ../gobject/gsignal.c:3456
#17 0x00007f7d6292d633 in g_signal_emit (instance=instance@entry=0x5621bc33a110, signal_id=<optimized out>, detail=detail@entry=0) at ../gobject/gsignal.c:3606
#18 0x00007f7d623b21e6 in gdk_surface_process_updates_internal (surface=0x5621bc33a110) at ../gdk/gdksurface.c:1334
#19 gdk_surface_paint_on_clock (clock=<optimized out>, data=0x5621bc33a110) at ../gdk/gdksurface.c:1422
#20 0x00007f7d6290ffc0 in g_closure_invoke (closure=0x5621baeabee0, return_value=0x0, n_param_values=1, param_values=0x7ffc8f6c03b0, invocation_hint=0x7ffc8f6c0330) at ../gobject/gclosure.c:832
#21 0x00007f7d6293dd86 in signal_emit_unlocked_R.isra.0 (node=node@entry=0x5621badfe120, detail=detail@entry=0, instance=instance@entry=0x5621baf671a0, emission_return=emission_return@entry=0x0, instance_and_params=instance_and_params@entry=0x7ffc8f6c03b0)
    at ../gobject/gsignal.c:3796
#22 0x00007f7d6292d41a in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ffc8f6c0550) at ../gobject/gsignal.c:3549
#23 0x00007f7d6292d633 in g_signal_emit (instance=instance@entry=0x5621baf671a0, signal_id=<optimized out>, detail=detail@entry=0) at ../gobject/gsignal.c:3606
#24 0x00007f7d623a3717 in _gdk_frame_clock_emit_paint (frame_clock=0x5621baf671a0) at ../gdk/gdkframeclock.c:708
#25 gdk_frame_clock_paint_idle (data=data@entry=0x5621baf671a0) at ../gdk/gdkframeclockidle.c:609
#26 0x00007f7d623a3a2e in gdk_frame_clock_flush_idle (data=data@entry=0x5621baf671a0) at ../gdk/gdkframeclockidle.c:378
#27 0x00007f7d628134c8 in g_timeout_dispatch (source=0x5621bc1b7f40, callback=0x7f7d623a39c0 <gdk_frame_clock_flush_idle>, user_data=0x5621baf671a0) at ../glib/gmain.c:5007
#28 0x00007f7d62812cbf in g_main_dispatch (context=0x5621baacbc60) at ../glib/gmain.c:3444
#29 g_main_context_dispatch (context=0x5621baacbc60) at ../glib/gmain.c:4162
#30 0x00007f7d62868598 in g_main_context_iterate.constprop.0 (context=0x5621baacbc60, block=1, dispatch=1, self=<optimized out>) at ../glib/gmain.c:4238
#31 0x00007f7d6280ff40 in g_main_context_iteration (context=context@entry=0x5621baacbc60, may_block=may_block@entry=1) at ../glib/gmain.c:4303
#32 0x00007f7d62a3e94d in g_application_run (application=0x5621baac0110, argc=argc@entry=1, argv=argv@entry=0x7ffc8f6c09c8) at ../gio/gapplication.c:2571
#33 0x00005621ba085867 in main (argc=1, argv=0x7ffc8f6c09c8) at ../src/main.c:44

I guess that confirms my theory?
Comment 15 Erin of Yukis 2022-12-18 17:24:30 UTC
(In reply to Oded Arbel from comment #11)
> (In reply to Nicolas Fella from comment #7)
> > Can reproduce with gnome-console when "Enable focus tracking" in the Zoom
> > effect settings is enabled
> 
> Indeed, when I  turn off focus tracking in the Zoom desktop effect, the
> problem goes away.

Hallelujah! This issue made a bunch of GTK3 and also Qt6 applications hard to borderline impossible to use. Thank you for providing a workaround for the time being!
(And no, I’m not exaggerating. The MuseScore Flatpak is completely unusable with this bug, for instance.)

(In reply to Oded Arbel from comment #12)
> The service :1.16 on my system currently (I'm not sure if I haven't rebooted
> since the original report) is the gvfs daemon. It exposes paths like
> /org/gtk/vfs.
> 
> From looking that the interface that code is trying to access -
> org.a11y.atspi.Accessible - I can't find any service in my session that
> implements this. I am not a dbus expert, so I don't know how common it is
> for method calls to non-existing interfaces to block until timeout, but I
> think that kwin should not issue calls - on the main thread - that can block
> for 25 seconds...

You are probably looking on the wrong bus there. A path like “/org/gnome/Console/a11y/8c7c8daa_c230_4323_b01b_a3a3afeafb2c” and interface like “org.a11y.atspi.Accessible“ would typically be found on the accessibility bus, not the session bus. To find the address of the accessibility bus in your session you can run:
qdbus org.a11y.Bus /org/a11y/bus org.a11y.Bus.GetAddress

This resulted in “unix:path=/run/user/1000/at-spi/bus_1,guid=3d7d44553afddba65e947310639f38b5” on my system (the “path” part being fairly constant across systems and sessions, I believe).

With the bus address in hand, you can then issue another D-Bus call on that bus to ask the bus master which PID the process owning any given name had when it connected to the bus (where “:1.16“ is the service name):
qdbus --bus "unix:path=/run/user/1000/at-spi/bus_1,guid=3d7d44553afddba65e947310639f38b5" org.freedesktop.DBus /org/freedesktop/DBus org.freedesktop.DBus.GetConnectionUnixProcessID ":1.16"

This returns a PID for which you can look up the process command-line using procfs (replacing “3754” with what was returned by the previous call):
cat "/proc/3754/cmdline" | tr '\0' ' '

… of course, one can also use the PID to attach a debugger like Nicolas did.


Of course, in theory KWin should either be completely asynchronous or at least never block on calls to anything other than system services and a select few core non-GUI session services. In practice and particularly given KWin’s history, its not that easy of course. :-)
Comment 16 Nicolas Fella 2023-12-08 20:25:09 UTC

*** This bug has been marked as a duplicate of bug 450940 ***