Summary: | Mouse clicks from openQA (automated test framework) in Fedora installer on KDE 5.23.90 stop working shortly after it starts | ||
---|---|---|---|
Product: | [Plasma] kwin | Reporter: | Adam Williamson <adamw> |
Component: | input | Assignee: | KWin default assignee <kwin-bugs-null> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | fabian, nate, rdieter |
Priority: | NOR | Keywords: | wayland |
Version: | 5.23.90 | ||
Target Milestone: | --- | ||
Platform: | Fedora RPMs | ||
OS: | Linux | ||
Latest Commit: | https://invent.kde.org/plasma/kwin/commit/bac4f4ed62017bb4aeb81078f5ec2cd915af5de4 | Version Fixed In: | |
Sentry Crash Report: | |||
Attachments: |
how the screen we're trying to interact with looks
WAYLAND_DEBUG=1 dolphin KWIN_XWAYLAND_DEBUG=1 messages from anaconda my debugging |
Description
Adam Williamson
2022-01-28 00:16:25 UTC
Created attachment 146003 [details]
how the screen we're trying to interact with looks
I tested a build of current git 5.24 head, same bug. I'll try and bisect back between 5.23.5 and 5.23.90, though it will take a bit of time to do. Bisecting is turning out to be hard because older commits just don't build against current (everything else), like kwayland-server :( Can't build anything before f2b29e35 by the looks of it due to an incompatibility with `setSupportedFormatsAndModifiers`, and f2b29e35 doesn't build because of a "static assertion failed: Signal and slot arguments are not compatible." error I haven't totally traced out yet. But even just those two mean I'm missing like several months of history in the bisect attempt, so it's not likely to work out well. Can you link to a failed openQA job? With Plasma 5.24 (and git master), something vaguely related happened in our testing as well. Until https://invent.kde.org/plasma/kwin/-/merge_requests/1946, kwin didn't use the cursor plane with virtio_gpu, so the cursor was visible to openQA as well. This uncovered that the mouse_hide call did not have any effect at all: https://openqa.opensuse.org/tests/2160675#step/start_wayland_plasma5/26 The cursor moved fine when connecting to VNC manually. Sure, here's one of the most recent failures - https://openqa.fedoraproject.org/tests/1114561 . Logs and video at https://openqa.fedoraproject.org/tests/1114561#downloads . You should be able to see from the autoinst-log.txt log that the mouse_hide calls do seem to take effect - there are several moves to 1023, 767 which are mouse_hide events (either explicit ones or the ones that happen as part of assert_and_click ). I did run a test where I disabled the mouse_hide that happens at the end of assert_and_click , and the result of that is that the test gets one step further - the Continue button works, but it fails at the next step. The next thing that happens is a child window pops up with a warning about this being a pre-release, with a button you have to click to continue. It looks like this (from the Workstation live test): https://openqa.fedoraproject.org/tests/1114249#step/_boot_to_anaconda/12 clicking on the "I want to proceed" button there doesn't work. It does seem potentially interesting that that button is in a different window. I wonder if I could get past it somehow, would subsequent clicks back in the main installer window continue to work. Next week I'll see if we can get past it with keyboard inputs (I haven't actually checked that yet). I just realized I didn't mention, so in case it's not obvious - the installer app in use here is built on GTK+. I think I have another related failure: https://openqa.opensuse.org/tests/2167096 Dolphin thinks the cursor is hovering over the folder icon there. (In reply to Adam Williamson from comment #5) > Sure, here's one of the most recent failures - > https://openqa.fedoraproject.org/tests/1114561 . Logs and video at > https://openqa.fedoraproject.org/tests/1114561#downloads . You should be > able to see from the autoinst-log.txt log that the mouse_hide calls do seem > to take effect - there are several moves to 1023, 767 which are mouse_hide > events (either explicit ones or the ones that happen as part of > assert_and_click ). In the video the scrollbar of the right locale selection box is visible when the cursor is inside, so GTK is at least aware of the cursor position there. I don't see any hover effect on the continue button, but that might be normal. (In reply to Adam Williamson from comment #6) > I just realized I didn't mention, so in case it's not obvious - the > installer app in use here is built on GTK+. Using wayland? If you can reproduce the issue with WAYLAND_DEBUG=1, that might be helpful. I'll try it with dolphin as well. Created attachment 146129 [details] WAYLAND_DEBUG=1 dolphin (In reply to Fabian Vogt from comment #7) > I think I have another related failure: > https://openqa.opensuse.org/tests/2167096 > Dolphin thinks the cursor is hovering over the folder icon there. > ... > I'll try it with dolphin as well. Journal with WAYLAND_DEBUG=1 output of dolphin in the failed openQA test https://openqa.opensuse.org/tests/2170688#step/dolphin/23 is attached. Only issue I can spot is that KWin sends wl_pointer@4.leave(1624, nil), even though the surface parameter does not have allow-null="true". is it possible to run anaconda with the WAYLAND_DEBUG=1 environment variable? Um. I think maybe not, because it seems when running as the live installer, we actually force anaconda to run with the X11 backend: https://github.com/rhinstaller/anaconda/blob/master/data/liveinst/liveinst#L134-L135 liveinst is a wrapper script which is what gets called to run anaconda on live images - it does various customizations specific to the live environment, like that one, before running anaconda itself. Can you set the KWIN_XWAYLAND_DEBUG environment variable in that case and capture kwin's log output? Yeah, I can give that a shot tomorrow. Thanks. Created attachment 146234 [details]
KWIN_XWAYLAND_DEBUG=1 messages from anaconda
This is /var/log/messages from an openQA failure with KWIN_XWAYLAND_DEBUG=1 set via /etc/profile.d. The relevant messages should be around the 13:09:21 mark in the log - the messages from Anaconda about setting the locale to en_US.UTF-8 should come from the point where the test types 'english' into the entry box and clicks on "English (United States)", right before the point where it tries to click Continue and it doesn't work.
The timestamps on the logs from the VM itself and the logs from the test runner don't always match precisely, but FWIW, here are the logs of the critical point from the test runner:
[2022-02-03T18:09:27.871Z] [debug] tests/_boot_to_anaconda.pm:167 called testapi::assert_and_click
[2022-02-03T18:09:27.871141Z] [debug] <<< testapi::assert_and_click(mustmatch="anaconda_select_install_lang_filtered", timeout=30)
[2022-02-03T18:09:27.871998Z] [debug] clicking at 686/206
[2022-02-03T18:09:27.872119Z] [debug] tests/_boot_to_anaconda.pm:167 called testapi::assert_and_click
[2022-02-03T18:09:27.872211Z] [debug] <<< testapi::mouse_set(x=686, y=206)
[2022-02-03T18:09:27.872740Z] [debug] mouse_move 686, 206
[2022-02-03T18:09:27.872876Z] [debug] send_pointer_event 0, 686, 206, 1
[2022-02-03T18:09:27.873272Z] [debug] tests/_boot_to_anaconda.pm:167 called testapi::assert_and_click
[2022-02-03T18:09:27.873381Z] [debug] <<< testapi::mouse_click(button="left", cursor_down="0.15")
[2022-02-03T18:09:27.873867Z] [debug] pointer_event 1 686, 206
[2022-02-03T18:09:27.874004Z] [debug] send_pointer_event 1, 686, 206, 1
[2022-02-03T18:09:28.024935Z] [debug] pointer_event 0 686, 206
[2022-02-03T18:09:28.025082Z] [debug] send_pointer_event 0, 686, 206, 1
[2022-02-03T18:09:29.025684Z] [debug] tests/_boot_to_anaconda.pm:167 called testapi::assert_and_click
[2022-02-03T18:09:29.025885Z] [debug] <<< testapi::mouse_set(x=1023, y=767)
[2022-02-03T18:09:29.026785Z] [debug] mouse_move 1023, 767
[2022-02-03T18:09:29.026920Z] [debug] send_pointer_event 0, 1023, 767, 1
[2022-02-03T18:09:29.027420Z] [debug] tests/_boot_to_anaconda.pm:168 called testapi::assert_screen
[2022-02-03T18:09:29.027602Z] [debug] <<< testapi::assert_screen(mustmatch="anaconda_select_install_lang_selected", timeout=10)
[2022-02-03T18:09:29.038653Z] [debug] no match: 9.9s, best candidate: install_lang_english_selected-testonlycursor-20220128 (0.93)
[2022-02-03T18:09:30.173707Z] [debug] >>> testapi::_handle_found_needle: found install_lang_english_selected-20200116, similarity 0.98 @ 605/197
[2022-02-03T18:09:30.173913Z] [debug] tests/_boot_to_anaconda.pm:175 called testapi::assert_and_click
[2022-02-03T18:09:30.174063Z] [debug] <<< testapi::assert_screen(mustmatch="anaconda_select_install_lang_continue", timeout=30)
[2022-02-03T18:09:31.196994Z] [debug] >>> testapi::_handle_found_needle: found install_lang_continue-20210714, similarity 0.99 @ 926/684
[2022-02-03T18:09:31.197187Z] [debug] tests/_boot_to_anaconda.pm:175 called testapi::assert_and_click
[2022-02-03T18:09:31.197326Z] [debug] <<< testapi::assert_and_click(mustmatch="anaconda_select_install_lang_continue", timeout=30)
[2022-02-03T18:09:31.198252Z] [debug] clicking at 970/696
[2022-02-03T18:09:31.198373Z] [debug] tests/_boot_to_anaconda.pm:175 called testapi::assert_and_click
[2022-02-03T18:09:31.198478Z] [debug] <<< testapi::mouse_set(x=970, y=696)
[2022-02-03T18:09:31.199030Z] [debug] mouse_move 970, 696
[2022-02-03T18:09:31.199168Z] [debug] send_pointer_event 0, 970, 696, 1
[2022-02-03T18:09:31.199546Z] [debug] tests/_boot_to_anaconda.pm:175 called testapi::assert_and_click
[2022-02-03T18:09:31.199650Z] [debug] <<< testapi::mouse_click(button="left", cursor_down="0.15")
[2022-02-03T18:09:31.200157Z] [debug] pointer_event 1 970, 696
[2022-02-03T18:09:31.200271Z] [debug] send_pointer_event 1, 970, 696, 1
[2022-02-03T18:09:31.351157Z] [debug] pointer_event 0 970, 696
[2022-02-03T18:09:31.351279Z] [debug] send_pointer_event 0, 970, 696, 1
[2022-02-03T18:09:32.351940Z] [debug] tests/_boot_to_anaconda.pm:175 called testapi::assert_and_click
[2022-02-03T18:09:32.352249Z] [debug] <<< testapi::mouse_set(x=1023, y=767)
[2022-02-03T18:09:32.354309Z] [debug] mouse_move 1023, 767
[2022-02-03T18:09:32.354443Z] [debug] send_pointer_event 0, 1023, 767, 1
That's clicking on "English (US)" (at 686/206), moving mouse to 1023/767, then clicking on the continue button (970/696), then moving mouse to 1023/767 again.
2022-02-03T13:09:29.767650-05:00 fedora kwin_wayland_wrapper[1554]: [ 152321.189] wl_pointer@22.enter(78, wl_surface@37, 970.00000000, 696.00000000) 2022-02-03T13:09:29.768280-05:00 fedora kwin_wayland_wrapper[1554]: [ 152322.090] -> wl_pointer@22.set_cursor(78, wl_surface@17, 4, 4) 2022-02-03T13:09:29.768728-05:00 fedora kwin_wayland_wrapper[1554]: [ 152322.589] -> wl_surface@17.attach(wl_buffer@47, 0, 0) 2022-02-03T13:09:29.769154-05:00 fedora kwin_wayland_wrapper[1554]: [ 152322.978] -> wl_surface@17.damage_buffer(0, 0, 32, 32) 2022-02-03T13:09:29.769432-05:00 fedora kwin_wayland_wrapper[1554]: [ 152323.393] -> wl_surface@17.frame(new id wl_callback@130) 2022-02-03T13:09:29.769649-05:00 fedora kwin_wayland_wrapper[1554]: [ 152323.701] -> wl_surface@17.commit() 2022-02-03T13:09:29.769839-05:00 fedora kwin_wayland_wrapper[1554]: [ 152323.892] wl_pointer@22.frame() 2022-02-03T13:09:29.770067-05:00 fedora kwin_wayland_wrapper[1554]: [ 152324.114] wl_pointer@22.frame() 2022-02-03T13:09:29.783520-05:00 fedora kwin_wayland_wrapper[1554]: [ 152337.239] wl_pointer@22.button(79, 88815, 272, 1) 2022-02-03T13:09:29.783802-05:00 fedora kwin_wayland_wrapper[1554]: [ 152337.843] wl_pointer@22.frame() 2022-02-03T13:09:29.854266-05:00 fedora kwin_wayland_wrapper[1554]: [ 152408.065] wl_pointer@22.button(80, 88901, 272, 0) 2022-02-03T13:09:29.854581-05:00 fedora kwin_wayland_wrapper[1554]: [ 152408.597] wl_pointer@22.frame() kwin seems to send relevant pointer events to xwayland > So this repositioning behaviour may be significant (it was in a similar bug we found in GNOME recently).
what was the bug?
https://gitlab.gnome.org/GNOME/mutter/-/issues/2117 , and the fix was https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/2257 . Definitely not the same bug as it's mutter and mutter isn't involved here, but it was an odd coincidence that they showed up around the same time. Could you check whether the issue can be reproduced with the latest kwin 5.24 tars? kwin's tar was respun and it included a fix for a similar bug As mentioned above I tried a git snapshot on 2022-01-28. If the fix landed after that, though, it's a possibility. I'll try in the morning. Yes, it landed a couple of days ago I just checked, and it looks like we've had 5.24.0 in Rawhide for a few days. Unfortunately the bug is still present :( Created attachment 146454 [details]
my debugging
I cleaned up the attached logs and don't see anything suspicious in the logs, kwin sends button click and all button press events have matching release events
what does happen if the test clicks the window the second time at 970x696 after some interval?
Now that Plasma 5.24.0 is exposed to more tests as part of Tumbleweed QA, this issue appears in two more places now: Clicking the connect button in hexchat (GTK2, so Xwayland) does nothing: https://openqa.opensuse.org/tests/2180598#step/hexchat/27 Clicking the play button in VLC (Qt!) does nothing: https://openqa.opensuse.org/tests/2180598#step/vlc/24 Oh well, that answers a question I was about to look into, which was "could this be to do with https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/4123 ?" - it sounds like the answer is 'no'. I'm double-checking the answer to Vlad's question to me right now, I'm pretty sure the answer is "clicking twice doesn't help at all" but just double checking that. Yeah, confirmed, clicking twice in the same place does not help at all. Neither with nor without the "mouse hide" operation in between clicks. I can reproduce the issue by sending just the simulated click to the QEMU VNC server of an openQA test with a python script. I could narrow it down to the following scenario: send_pointer_event(0, 1021, 766) time.sleep(1) send_pointer_event(0, 634, 584) Even though the cursor is still in the xev area, it got a leave event! EnterNotify event, serial 36, synthetic NO, window 0xc00001, root 0x51c, subw 0xc00002, time 8670092, (31,38), root:(634,584), mode NotifyNormal, detail NotifyVirtual, same_screen YES, focus YES, state 0 KeymapNotify event, serial 36, synthetic NO, window 0x0, keys: 28 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 MotionNotify event, serial 36, synthetic NO, window 0xc00001, root 0x51c, subw 0xc00002, time 8670092, (31,38), root:(634,584), state 0x0, is_hint 0, same_screen YES LeaveNotify event, serial 36, synthetic NO, window 0xc00001, root 0x51c, subw 0xc00002, time 8670092, (418,220), root:(1021,766), mode NotifyNormal, detail NotifyVirtual, same_screen YES, focus YES, state 0 Feb 23 11:07:53 localhost.localdomain kwin_wayland_wrapper[5363]: [2868737.319] wl_pointer@22.leave(733, wl_surface@34) Feb 23 11:07:53 localhost.localdomain kwin_wayland_wrapper[5363]: [2868737.357] wl_pointer@22.frame() Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869657.957] wl_pointer@22.enter(734, wl_surface@34, 31.00000000, 67.00000000) Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869658.037] wl_pointer@22.frame() Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869658.055] wl_pointer@22.frame() Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869662.349] -> wl_pointer@22.set_cursor(734, wl_surface@17, 1, 1) Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869662.392] -> wl_surface@17.attach(wl_buffer@27, 0, 0) Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869662.419] -> wl_surface@17.damage_buffer(0, 0, 10, 16) Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869662.451] -> wl_surface@17.frame(new id wl_callback@30) Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869662.468] -> wl_surface@17.commit() Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869667.220] wl_display@1.delete_id(30) Feb 23 11:07:54 localhost.localdomain kwin_wayland_wrapper[5363]: [2869667.274] wl_callback@30.done(8670095) Disabling DataBridge::create(this); did not help. A possibly relevant merge request was started @ https://invent.kde.org/plasma/kwin/-/merge_requests/2064 Git commit aaa07f0605d605ae2472b1a8f7f4bd07f2138cb2 by Vlad Zahorodnii. Committed on 23/02/2022 at 20:03. Pushed by vladz into branch 'master'. Revert "Remove mysterious s_cursorUpdateBlocking boolean flag in pointer_input.cpp" This reverts commit 3d0bdc56a48fe0d5cc0889ab098f6c6c922d20c7. seat->setFocusedPointerSurface() before notifyPointerMotion() is needed to prevent sending a motion event that's outside the previously focused surface. M +11 -0 src/pointer_input.cpp https://invent.kde.org/plasma/kwin/commit/aaa07f0605d605ae2472b1a8f7f4bd07f2138cb2 Git commit bac4f4ed62017bb4aeb81078f5ec2cd915af5de4 by Vlad Zahorodnii. Committed on 23/02/2022 at 20:21. Pushed by vladz into branch 'Plasma/5.24'. Revert "Remove mysterious s_cursorUpdateBlocking boolean flag in pointer_input.cpp" This reverts commit 3d0bdc56a48fe0d5cc0889ab098f6c6c922d20c7. seat->setFocusedPointerSurface() before notifyPointerMotion() is needed to prevent sending a motion event that's outside the previously focused surface. (cherry picked from commit aaa07f0605d605ae2472b1a8f7f4bd07f2138cb2) M +11 -0 src/pointer_input.cpp https://invent.kde.org/plasma/kwin/commit/bac4f4ed62017bb4aeb81078f5ec2cd915af5de4 |