Summary: | Wayland: iGPU/AMDGPU multi-monitor keeps displaying the SDDM screen if iGPU-DisplayPort is connected | ||
---|---|---|---|
Product: | [Plasma] kwin | Reporter: | Nowa Ammerlaan <nowa> |
Component: | wayland-generic | Assignee: | Zamundaaa <xaver.hugl> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | nate, nowa, xaver.hugl |
Priority: | NOR | Keywords: | wayland |
Version: | 5.21.0 | ||
Target Milestone: | --- | ||
Platform: | Gentoo Packages | ||
OS: | Linux | ||
See Also: | https://bugs.kde.org/show_bug.cgi?id=457851 | ||
Latest Commit: | https://invent.kde.org/plasma/kwin/commit/b38bb416982babdae9941d41fa5b34717e5cae97 | Version Fixed In: | 5.23 |
Sentry Crash Report: | |||
Attachments: |
wayland session log
wayland session log with env variables
wayland log with the patch
SIGSEGV backtrace kwin_wayland
wayland session log (patch applied)
wayland session log (patch applied)
wayland session log (patch applied)
wayland-session-log
kwin_wayland backtrace
wayland-session-log
wayland session log before applying patch
wayland session log after applying the patch
wayland-session-log (patch applied)
wayland-session-log-kwin-5.22.5 |
Description
Nowa Ammerlaan
2021-02-17 19:34:01 UTC
Okay, that is weird:
> Kwin exited with code 0
So you're probably not gonna be able to get a backtrace... KWin doesn't crash. What does crash is "plasma_session" though; if you have it installed with debug symbols then you should be able to use "coredumpctl debug plasma_session" and show the backtrace with "bt"
Could you try the session again with
QT_LOGGING_RULES="kwin_*.debug=true;kwin_libinput.debug=false"
as an environment variable? That could give more information on what's happening, at least what's happening with the outputs.
I re-tested with the environment variables you suggested, and also recompiled kwin, kscreen and libinput with debug symbols. This time it did not crash when trying to change the resolution. However, changing the resolution had no visible effect on the monitors connected to the iGPU. It did crash when I tried to enable a monitor connected to the iGPU that I had previously disabled. However, this time it did respawn.

Something I noticed now, that I did not notice last time, is that the monitor configuration is not what the kscreen configuration window says it is. I set it to:

            [ AMDGPU2 ]
[ iGPU1 ]   [ AMDGPU1 ]   [ iGPU2 ]
            [ iGPU3 ]

But got instead:

            [ AMDGPU2 ]
[ iGPU3 ]   [ AMDGPU1 ]   [ iGPU1 ]
            [ iGPU1 ]

Disabling a monitor connected to the iGPU has no effect on what is shown on the monitor, but it does prevent the mouse from moving onto a monitor. However, the monitor that is set as disabled is not the monitor onto which the mouse can no longer move. E.g. disabling iGPU1 prevents the mouse from moving onto iGPU2, while the mouse can still move onto iGPU1.

To me it looks like the monitors are somehow mixed up. The information that is shown about each monitor is correct (e.g. name, resolution, refresh rate). However, applying operations such as disabling or moving to a monitor in the configuration window will actually apply these operations to a different monitor.
I will attach the new log. Something that caught my eye is this:
kwin_wayland_drm: Atomic request failed to commit: Invalid argument
and this:
kwin_wayland_drm: Atomic request failed to commit: Permission denied
The log is littered with these.

Created attachment 135852 [details]
wayland session log with env variables
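
Since the log is described as littered with these messages, a quick tally per error reason can make a long session log easier to compare between runs. The following is a minimal sketch, not part of KWin or any KDE tool: the function name `count_commit_failures` and the sample list are made up for illustration, and only the message prefix is taken from the excerpts quoted in this report.

```python
import re
from collections import Counter

# Message prefix copied from the log excerpts quoted in this report.
COMMIT_FAILURE = re.compile(
    r"kwin_wayland_drm: Atomic request failed to commit: (?P<reason>.+)"
)


def count_commit_failures(lines):
    """Count atomic commit failures per error reason (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = COMMIT_FAILURE.search(line)
        if match:
            counts[match.group("reason").strip()] += 1
    return counts


# Illustrative input only; the last entry stands in for any unrelated log line.
sample = [
    "kwin_wayland_drm: Atomic request failed to commit: Invalid argument",
    "kwin_wayland_drm: Atomic request failed to commit: Permission denied",
    "kwin_wayland_drm: Atomic request failed to commit: Invalid argument",
    "unrelated line",
]

for reason, count in count_commit_failures(sample).most_common():
    print(f"{count:4d}  {reason}")
```

The function accepts any iterable of strings, so an open file handle for the saved session log can be passed in directly instead of the sample list.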
Small correction, I got:

            [ AMDGPU2 ]
[ iGPU2 ]   [ AMDGPU1 ]   [ iGPU3 ]
            [ iGPU1 ]

My previous sketch had iGPU1 twice, which is of course wrong.

3 displays is the maximum an Intel iGPU can drive (at least for ones older than Tiger Lake), so it could be that KWin picks a wrong combination of DRM objects that doesn't work; we don't handle that gracefully yet. Could you try what happens if you only plug in one monitor to the iGPU?

By (un)plugging some of the monitors I found something interesting. The problem is with HDMI-1-2 (which, contrary to what the name suggests, is a DisplayPort). If this iGPU port is connected I run into the problem I described.
- If this port is not connected everything works fine (2 remaining on the iGPU, 4 total).
- If I move the monitor that was connected to this port to a port on the AMDGPU, everything is also fine (same total number of monitors, 5).
- If I remove one of the other monitors connected to the iGPU, I still have the same problem.
- If I remove all other monitors connected to the iGPU and leave only this port connected, I still have the same problem.
- If I connect a different monitor to this port, I still have the same problem (therefore, it is not specific to that monitor, it is the port).

I don't know if this is related, but I've had other issues with this specific port as well. If this port is connected *and* the iGPU is the boot GPU (no matter what other things are connected), the system POST is not displayed at all, and the BIOS and Grub are inaccessible. In this situation the monitors only start showing something once the Linux kernel takes over the framebuffer.
And for some reason the naming of the ports on the iGPU is all messed up (and has been for as long as I have had this computer): DisplayPort is named HDMI-1-2, VGA is named DP-1-5, DVI is named HDMI-1-3. But that is probably a kernel thing and unrelated.

I should add that this port is connected through a DisplayPort to HDMI adapter. Since I do not have a DisplayPort-capable monitor, I cannot test without it. So whenever I say that this port 'is connected', it is implied that this is through this converter. That being said, this port works just fine in X, so there is still some bug here.

That does sound like it is indeed the problem of kwin_wayland not being smart enough when driving the iGPU. I've been working on some changes that should enable us to fix that, but they won't go into 5.21. The thing with the port having problems overall is interesting but most likely won't cause problems once we handle failures.

A possibly relevant merge request was started @ https://invent.kde.org/plasma/kwin/-/merge_requests/844

The linked merge request will most likely solve your problem, but it would be great if you could test it.

To apply the patch successfully I had to use the latest version from git (for kwin and some other packages). The patch does not apply to the 5.21.4 release. After upgrading and applying the patch, an X11 session still works. However, now wayland does not work at all: it just flashes the screen a bit on the monitors connected to the AMDGPU and then crashes. The monitors on the iGPU continue to display the SDDM screen, and now this happens irrespective of whether the problematic port is connected or not. I am not sure if the problem is with the patch from the Merge Request, or with the upgrade to the latest version from git. The log doesn't seem to show anything helpful, but I'll attach it anyway.

Created attachment 137498 [details]
wayland log with the patch
I just tested without the patch, and then it still crashes. So the problem does not seem to be with the patch necessarily but with some other change in the live git version. Perhaps I just got unlucky and synced it while it was in a broken state. I'm not sure how I could test this more properly, any suggestions?

You can try using
> coredumpctl debug kwin_wayland
and then use "bt" in gdb to get the stacktrace of the last crash. If you use the environment variable
> QT_LOGGING_RULES="kwin_*.debug=true;kwin_libinput.debug=false"
again then there could be something useful in the wayland-session.log, too.

Created attachment 137511 [details]
SIGSEGV backtrace kwin_wayland
Created attachment 137512 [details]
wayland session log (patch applied)
I hope this is useful
Yes, very useful. https://invent.kde.org/plasma/kwin/-/merge_requests/847 should fix that

Created attachment 137515 [details]
wayland session log (patch applied)

(In reply to Zamundaaa from comment #16)
> Yes, very useful. https://invent.kde.org/plasma/kwin/-/merge_requests/847
> should fix that
Awesome, this indeed fixed it :)

However, the original issue is not quite fixed yet (though there is definitely some progress with this patch). If I apply the patch the problem becomes sort of inverted. Now the monitors connected to the iGPU work just fine (even the problematic port!). However, the monitors connected to the AMDGPU turn off after logging in. The kscreen config window does detect and display them correctly and they are marked as enabled (though the monitor itself says "No Signal"). If I disable and enable those monitors in the kscreen config window, they do turn on but the desktop completely freezes shortly afterwards. Without the patch the original behaviour described in the first comment is restored (AMDGPU monitors work, but iGPU monitors display SDDM screen). I don't have a coredump this time because there is no crash, though I have attached the wayland session log.

Hmm, the log says that the tests reported the outputs as working but presentation fails afterwards... I added a commit that might fix that.

Good news! Your patch works! I am writing this in a fully functioning wayland session. All monitors now work as expected.

There seems to be a small issue on shutdown which may or may not be related. Sometimes the monitors connected to the iGPU retain their contents after the session has quit and the shutdown process has started. Some services fail to stop (e.g. bluetooth), which indicates to me that the session might be (partially) re-spawning when it shouldn't. But this could very well be an unrelated issue. Anyway, with your patch my setup works with wayland!
Looks like I will finally be able to join the rest of the world in the future that is wayland :D Thanks! Created attachment 137532 [details]
wayland session log (patch applied)
Cool. While the log unexpectedly still doesn't show any tests failing for the amd gpu, I don't think that's something to worry about.
> Sometimes the monitors connected to the iGPU retain their contents after the session has quit and the shutdown process has started
That is a separate issue but should definitely be fixed as well. Properly blanking the monitors before exit shouldn't be hard to do. The session not closing everything properly is probably https://bugs.kde.org/show_bug.cgi?id=433293

The patches the merge request depends on have been merged now, and I updated the merge request to the new code. Could you test it again? It should in theory work the same but it would be good to make sure.

Created attachment 140013 [details]
wayland-session-log

(In reply to Zamundaaa from comment #22)
> Patches the merge request depend on have been merged now, and I updated the
> merge request to the new code. Could you test it again? It should in theory
> work the same but it would be good to make sure.
After upgrading to the latest live version, wayland is refusing to start (both with and without the patch from your Merge Request). The screen flashes fast and frequently, and eventually returns to SDDM (it does not matter if the port that was problematic before is connected or not). Wayland session log is attached. X does still work though.

KWin crashes, apparently when starting to render. Can you provide a backtrace?

Created attachment 140018 [details]
kwin_wayland backtrace

(In reply to Zamundaaa from comment #24)
> KWin crashes, apparently when starting to render. Can you provide a
> backtrace?
Here's the backtrace, I hope it is helpful.

Git commit afcef2a6f822c46d3167f4b49bb8a3b13696c8c8 by Xaver Hugl.
Committed on 15/07/2021 at 11:42.
Pushed by zamundaaa into branch 'master'.
platforms/drm: fix crash with secondary GPUs and buffer age

M +2 -1 src/plugins/platforms/drm/abstract_egl_drm_backend.h
M +24 -12 src/plugins/platforms/drm/egl_gbm_backend.cpp
M +4 -3 src/plugins/platforms/drm/egl_gbm_backend.h
M +1 -1 src/plugins/platforms/drm/egl_stream_backend.cpp

https://invent.kde.org/plasma/kwin/commit/afcef2a6f822c46d3167f4b49bb8a3b13696c8c8

crash should be fixed (at least it works with vkms)

Created attachment 140080 [details]
wayland-session-log

(In reply to Zamundaaa from comment #27)
> crash should be fixed (at least it works with vkms)
Yes, the crash is fixed now. Thanks.

After applying the patch from your Merge Request I get a proper display on one of the monitors connected to the iGPU; the other two continue to display the SDDM screen. The monitor connected to the AMDGPU (DVI-D-1) turns black. I can move my mouse onto those monitors, and they are properly shown in Display Settings (correct configuration, correct resolution, etc.). I see some "Invalid argument" in the logs, so perhaps that is the cause.

I rebased the MR to include a related bugfix from master, could you test again? If it still fails in some way, could you also use the environment variable
QT_LOGGING_RULES="kwin_*.debug=true;kwin_libinput.debug=false"
again? On errors the output is rather verbose but debug output still helps.

We're getting close. Without the patch I get a working display on the monitor connected to the AMDGPU, and on *one* of the monitors connected to the iGPU (the one that was problematic before, but that might be a coincidence). With the patch I get a working display on *all* monitors connected to the iGPU. However, the monitor connected to the AMDGPU stays black. If I go to the Display Settings the monitor is there. If I disable and enable this monitor, all of the displays hang/freeze, but I do see the desktop pop up on that monitor before it freezes and the screen corrupts.

Created attachment 140193 [details]
wayland session log before applying patch
Created attachment 140194 [details]
wayland session log after applying the patch
What seems to have caused the issue is that the display turns off because it doesn't receive new frames while KWin is initializing - I think it takes too long to render the first frame. Maybe we can push the old frame again or something like that... There was however also a bug in KWin's handling of taking control over the display, which made it not recover from that situation. I added a fix to the MR.

Created attachment 140219 [details]
wayland-session-log (patch applied)

Awesome, now it works! The monitor connected to the AMDGPU does still go to black before eventually showing the splash screen. It stays black for about a second (the other monitors are already showing the splash screen at this stage).

Hot(un)plugging works. However, when hotunplugging a monitor all windows are moved to the monitor connected to the AMDGPU (which is marked as the "primary" monitor, i.e. monitor 1). And when hotplugging a monitor all windows are moved to the monitor that just connected. (But maybe this behaviour is intentional?)

There is also Bug 438508, which is possibly related since it only occurs when using iGPU multimonitor. If I recall correctly, it didn't happen when I tested yesterday, but it is happening now, so perhaps it is related to these changes?

Thank you for working on this!

And one other small thing I noticed is that after hotunplugging and hotplugging one of the monitors, the widgets and taskbar on that monitor are gone (i.e. it detects it as if it were a new monitor and it doesn't restore the settings).

(In reply to Andrew Ammerlaan from comment #35)
> And one other small thing I noticed is that after hotunplugging and
> hotplugging one of the monitors the widgets and taskbar on that monitor are
> gone (i.a.w. it detects it as if it were a new monitor and it doesn't
> restore the settings)
After looking at the logs, it appears that it detects the hotplugged monitor (HDMI-A-2-HKC-TV) multiple times instead.

Nice!
> Hot(un)plugging works.
> However, when hotunplugging a monitor all windows are moved to the monitor connected to the AMDGPU (which is marked as the "primary" monitor, i.e. monitor 1). And when hotplugging a monitor all windows are moved to the monitor that just connected.
Yes, that is intentional. I think there is an MR that implements moving the windows back to where they were before hot-unplugging though.
> The monitor connected to the AMDGPU does still go to black before eventually showing the splash screen. It stays black for about a second (the other monitors are already showing the splash screen at this stage).
I'll see what I can do about that - the kernel likely expects that we immediately push a frame once we take over. Maybe we can push an empty frame (so that it shows the same image again) as opposed to a black image, for example, to not cause flickering.
> And one other small thing I noticed is that after hotunplugging and hotplugging one of the monitors the widgets and taskbar on that monitor are gone (i.a.w. it detects it as if it were a new monitor and it doesn't restore the settings)
That's unfortunately a known bug in plasmashell.

Could you check if the monitor still turns off on login now? I added a commit that might resolve that.

(In reply to Zamundaaa from comment #38)
> Could you check if the monitor still turns off on login now? I added a
> commit that might resolve that
Perfect, it works flawlessly now. There's a very nice transition from sddm to black to a fade-in of the splash screen, and it is the same on all monitors! Thank you very much for your efforts.

Git commit b38bb416982babdae9941d41fa5b34717e5cae97 by Xaver Hugl.
Committed on 08/09/2021 at 00:44.
Pushed by zamundaaa into branch 'master'.

Test DrmPipelines for outputs

Not all combinations of connectors, crtcs and planes will work on all hardware, so we need to test the pipelines before using them.
Related: bug 435265

M +170 -162 src/plugins/platforms/drm/drm_gpu.cpp
M +5 -7 src/plugins/platforms/drm/drm_gpu.h
M +0 -1 src/plugins/platforms/drm/drm_object_connector.h
M +13 -0 src/plugins/platforms/drm/drm_output.cpp
M +4 -0 src/plugins/platforms/drm/drm_output.h
M +81 -78 src/plugins/platforms/drm/drm_pipeline.cpp
M +15 -4 src/plugins/platforms/drm/drm_pipeline.h

https://invent.kde.org/plasma/kwin/commit/b38bb416982babdae9941d41fa5b34717e5cae97

Created attachment 141519 [details]
wayland-session-log-kwin-5.22.5
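
The idea stated in the commit message for b38bb416 (not every combination of connectors and CRTCs works, so candidates are tested before use) can be illustrated with a small backtracking search. This is only a sketch under stated assumptions and is not KWin's actual C++ implementation: `find_working_assignment` and `fake_test_commit` are hypothetical names, planes are left out for brevity, the CRTC names are invented, and the rejection rule merely stands in for whatever the hardware would refuse in a real test commit. The connector names are the ones mentioned earlier in this thread.

```python
def find_working_assignment(connectors, crtcs, test_commit):
    """Return a connector -> CRTC mapping accepted by test_commit, or None.

    Hypothetical helper: tries assignments one by one and backtracks
    whenever the complete candidate is rejected by the test callback.
    """

    def search(index, assignment, free_crtcs):
        if index == len(connectors):
            # Every connector has a CRTC: ask the test callback to validate it.
            return dict(assignment) if test_commit(assignment) else None
        connector = connectors[index]
        for crtc in sorted(free_crtcs):
            assignment[connector] = crtc
            result = search(index + 1, assignment, free_crtcs - {crtc})
            if result is not None:
                return result
            del assignment[connector]  # backtrack and try the next CRTC
        return None

    return search(0, {}, frozenset(crtcs))


def fake_test_commit(assignment):
    # Invented hardware rule for illustration only: this port cannot use crtc0.
    return assignment.get("HDMI-1-2") != "crtc0"


result = find_working_assignment(
    ["HDMI-1-2", "DP-1-5"], ["crtc0", "crtc1"], fake_test_commit
)
print(result)
```

With the invented rule above, the first candidate (HDMI-1-2 on crtc0) is rejected and the search falls back to the swapped assignment. When there are fewer CRTCs than connectors, the function returns None, which a caller would have to handle, for example by leaving an output disabled.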
After upgrading to version 5.22.5 this is sadly still not quite working (though it was working earlier when I tested the MR). The monitor connected to the AMDGPU works and one of the monitors on the iGPU (the right one). But the others still continue to show the SDDM screen after logging in. Wayland log attached.
I'm seeing a bunch of these:
kwin_core: Provided presentation timestamp is invalid: 154666 (current: 154675)
kwin_wayland_drm: Atomic request failed to commit: Invalid argument
kwin_wayland_drm: Atomic test commit failed. Aborting present.
Furthermore, the issue where the monitor on the AMDGPU turns black for about a second or two after logging in is back :(
The commit is only in git master / upcoming 5.23

(In reply to Zamundaaa from comment #42)
> The commit is only in git master / upcoming 5.23
Alrighty, I see, I'll wait a bit more then and test again later :D

In version 5.25 this is broken again. After login the monitor connected to the AMDGPU turns itself off, and the monitor connected to the iGPU continues to display the SDDM login screen. If I switch to a tty and back I get a black screen on all monitors but the mouse is now visible. |