Bug 463353

Summary: Hang in KIconTheme update
Product: [Plasma] kwin Reporter: Wyatt Childers <kdebugs.81do7>
Component: wayland-generic Assignee: KWin default assignee <kwin-bugs-null>
Status: RESOLVED FIXED    
Severity: major CC: agurenko, boredsquirrel, bugseforuns, gertjan.rolink, kde, lidraritri, linux, nate, poperigby, ross.cannizzaro, yizel7
Priority: VHI    
Version: 5.26.4   
Target Milestone: ---   
Platform: Fedora RPMs   
OS: Linux   
Latest Commit: Version Fixed In: 5.103
Sentry Crash Report:
Attachments: Log of kwin-wayland

Description Wyatt Childers 2022-12-22 19:25:23 UTC
SUMMARY
Today while updating several flatpaks via Discover, I noticed KWin freezing (i.e., windows would not switch via keyboard shortcuts, the mouse wouldn't move, etc.).

This happened at various points in the update, and once it was completed everything went back to normal.

STEPS TO REPRODUCE
1. Update flatpaks via discover
2. Attempt to move mouse while updates process

OBSERVED RESULT
Kwin stops responding to user input.

EXPECTED RESULT
Kwin becomes unresponsive to user input.

SOFTWARE/OS VERSIONS
Operating System: Fedora Linux 37
KDE Plasma Version: 5.26.4
KDE Frameworks Version: 5.101.0
Qt Version: 5.15.7
Kernel Version: 6.0.12-300.fc37.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 32 × AMD Ryzen 9 7950X 16-Core Processor
Memory: 62.0 GiB of RAM
Graphics Processor: AMD Radeon RX 6700 XT
Manufacturer: ASUS

ADDITIONAL INFORMATION

This was during a FreeDesktop SDK update, so this was a "heavier" flatpak update than normal, and I *think* it was ultimately this update occurring (likely while having flatpaks running -- including Discord, Telegram, Todoist, Rhythmbox, and Slack) that caused the issue.
Comment 1 Ross Cannizzaro 2022-12-26 10:35:17 UTC
I am also experiencing the same issue.

SOFTWARE/OS VERSIONS
Operating System: Fedora Linux 37 (Kinoite)
KDE Plasma Version: 5.26.4
KDE Frameworks Version: 5.101.0
Qt Version: 5.15.7
Kernel Version: 6.0.14-300.fc37.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 12x AMD Ryzen 5 1600 Six-Core Processor
Memory: 15.1GiB of RAM
Graphics Processor: AMD Radeon RX 6600
Manufacturer: Micro-Star International Co., Ltd.
Comment 2 David Edmundson 2022-12-27 10:28:15 UTC
General performance issues whilst the computer is under load are not actionable bug reports.
Comment 3 Wyatt Childers 2022-12-27 12:20:01 UTC
That's completely bogus. The computer is not even remotely under load.

"Processors: 32 × AMD Ryzen 9 7950X 16-Core Processor"

This is not a "performance issue" it's a complete lockup of the window manager which doesn't happen when I run literally thousands of tests on the same computer pegging all cores at 100%.
Comment 4 Wyatt Childers 2022-12-27 12:48:17 UTC
I will say I did botch the expected result, though I think "kwin should continue to respond to user input" is inferable.

My personal theory is that when flatpak updates the runtimes, if one of the apps dependent on the runtime is running (maybe even if not?), Kwin gets into a situation where it's blocked waiting on something from the flatpak app (perhaps even Xwayland running in the sandbox) to respond. That is, an operation that is normally almost instant, and that is normally totally fine to wait on, ends up blocked during this update waiting on some part of the update.

My other theory is that perhaps the btrfs kernel IO scheduler is doing something funny. I've had some tests that normally run at subsecond take over a minute when lots of other tests running IO operations are occurring because the ioscheduler was just not queuing their operations.
Comment 5 Wyatt Childers 2022-12-27 12:51:05 UTC
I think I should further clarify on second thought. To my latter theory, kwin would be trying to write/read something from disk itself and waiting on that... It seems quite a bit less likely that's the case, as even a slow hard drive would then be able to recreate these conditions/impact user input.
Comment 6 Nate Graham 2023-01-09 19:46:52 UTC
*** Bug 463681 has been marked as a duplicate of this bug. ***
Comment 7 yizel7 2023-01-09 22:09:56 UTC
My ticket was marked as a duplicate, so I will add details below. It was suggested earlier that this is a performance issue from the system being under load; that simply can NOT be the case. I have these issues from a cold boot with nothing but KDE Plasma and associated services running. My computer has 8 cores with 8 threads and 16 GB of RAM, runs on an NVMe drive with 2,500+ MB/s read/write speed, and has a beefy Nvidia card and a gigabit fiber connection. Just before the freeze happens, System Monitor shows CPU load around 1-5%, less than 1.5 GB of RAM in use, and the disk barely doing anything.

Could it be some kind of resource locking issue? In my ticket a user also reported this happening with Arch pacman updates. Could it be a Konsole/terminal bug? Discover is just a front end for flatpak updates after all; you can even see the output in Discover if you run another command in Konsole while it does flatpak stuff. And when doing installs of flatpaks I run those via Konsole and not Discover.

kwin 5.26.5-1
konsole 22.12.1-1
discover 5.26.5-1

/////////////////////////////////////////////////////////////////////////
SUMMARY
Lately when I update or install a flatpak my entire Plasma session becomes unresponsive until it is completed. This may take a couple seconds for something small or a minute or two when it is big flatpak runtime like nvidia drivers. During this time I can only move my mouse cursor around but cannot click on anything.

I have a beefy machine so I do not think it is performance related. My guess is some kind of locking is happening.

I do not think this is a Discover bug because it happens whether I am updating in Discover or installing via Konsole like flatpak install flathub org.mozilla.firefox

STEPS TO REPRODUCE
1. Update or install a flatpak 

OBSERVED RESULT
The entire Plasma session becomes unresponsive until the install or update is complete.

EXPECTED RESULT
No impact to performance

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.26.4
KDE Frameworks Version: 5.101.0
Qt Version: 5.15.7
Kernel Version: 6.0.15-hardened1-1-hardened (64-bit)

ADDITIONAL INFORMATION
Here are some of my flatpak and other related package infos
flatpak 1:1.15.1-1
bubblewrap 0.7.0-1
kdeplasma-addons 5.26.4-1
plasma-browser-integration 5.26.4-1
plasma-desktop 5.26.4-1
plasma-disks 5.26.4-1
plasma-firewall 5.26.4-1
plasma-framework 5.101.0-1
plasma-integration 5.26.4-1
plasma-meta 5.25-1
plasma-nm 5.26.4-1
plasma-pa 5.26.4-1
plasma-sdk 5.26.4-1
plasma-systemmonitor 5.26.4-1
plasma-thunderbolt 5.26.4-1
plasma-vault 5.26.4-1
plasma-wayland-session 5.26.4.1-1
plasma-workspace 5.26.4.1-1
plasma-workspace-wallpapers 5.26.4-1
Comment 8 Wyatt Childers 2023-01-10 04:59:40 UTC
I don't use Konsole, so it's not that (Alacritty + Tmux).

I also observed this during the "verify" stage of some standard DNF updates, but only some updates. I _think_ mesa and/or the kernel was included in that update, but I can't recall specifically... I'll try to do better on the next occurrence figuring out which package is being updated.
Comment 9 Wyatt Childers 2023-01-10 05:00:27 UTC
I updated the title to better reflect the issue is not limited to just flatpaks.
Comment 10 Wyatt Childers 2023-01-10 17:52:09 UTC
@Ross and @yizel7@kulodgei.com are you both using btrfs as well, or different file systems?

I've noticed an increase in "wonky" IO scheduler behavior lately in general, use btrfs almost exclusively, and wonder if that's related.
Comment 11 Ross Cannizzaro 2023-01-11 11:38:51 UTC
(In reply to Wyatt Childers from comment #10)
> @Ross and @yizel7@kulodgei.com are you both using btrfs as well, or
> different file systems?
> 
> I've noticed an increase in "wonky" IO scheduler behavior lately in general,
> use btrfs almost exclusively, and wonder if that's related.

I am using BTRFS.
Comment 12 Vlad Zahorodnii 2023-01-11 12:11:24 UTC
Can you attach logs to the bug report?

Put

  QT_LOGGING_RULES="kwin*.debug=true"

in /etc/environment

Restart the computer, update a flatpak package, and then get kwin's logs

  journalctl --user-unit plasma-kwin_wayland --boot -1 > log.txt

and attach the log file to this bug report
Comment 13 Gertjan Rolink 2023-01-11 19:33:43 UTC
(In reply to Vlad Zahorodnii from comment #12)
> Can you attach logs to the bug report?
> 
> Put
> 
>   QT_LOGGING_RULES="kwin*.debug=true"
> 
> in /etc/environment
> 
> Restart the computer, update a flatpak package, and then get kwin's logs
> 
>   journalctl --user-unit plasma-kwin_wayland --boot -1 > log.txt
> 
> and attach the log file to this bug report

Today, Fedora sent an update for flatpak. I installed/removed obsstudio 10 times in a row with a for loop and the system only froze once, on the 6th install, for about half a second. So this update definitely fixed the issue for me.
flatpak packages:
 - flatpak-1.14.1-2.fc37.x86_64
 - flatpak-libs-1.14.1-2.fc37.x86_64
 - flatpak-selinux-1.14.1-2.fc37.noarch
 - flatpak-session-helper-1.14.1-2.fc37.x86_64
Comment 14 Gertjan Rolink 2023-01-11 19:35:58 UTC
(In reply to Gertjan Rolink from comment #13)
> (In reply to Vlad Zahorodnii from comment #12)
> > Can you attach logs to the bug report?
> > 
> > Put
> > 
> >   QT_LOGGING_RULES="kwin*.debug=true"
> > 
> > in /etc/environment
> > 
> > Restart the computer, update a flatpak package, and then get kwin's logs
> > 
> >   journalctl --user-unit plasma-kwin_wayland --boot -1 > log.txt
> > 
> > and attach the log file to this bug report
> 
> Today, Fedora sent an update for flatpak. I installed/removed obsstudio 10
> times in a row with a for loop and the system only froze once on the 6th
> install for like half a second. So this update definitely fixed the issue
> for me.
> flatpak packages:
>  - flatpak-1.14.1-2.fc37.x86_64
>  - flatpak-libs-1.14.1-2.fc37.x86_64
>  - flatpak-selinux-1.14.1-2.fc37.noarch
>  - flatpak-session-helper-1.14.1-2.fc37.x86_64

Ok, had to do it a few more times. After a while it still started freezing for longer. So although it is slightly better, the issue still persists.
Comment 15 Gertjan Rolink 2023-01-11 19:46:53 UTC
Created attachment 155219 [details]
Log of kwin-wayland

Added log of journalctl after following the steps mentioned above.

System froze after several attempts at install/remove flatpak obsstudio.
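
The install/remove loop described above could look roughly like the following. This is only a sketch, not the exact loop that was run: the application ID com.obsproject.Studio, the flathub remote name, and the FLATPAK/ROUNDS/APP_ID variables are assumptions for illustration.

```shell
#!/bin/sh
# Assumed values; adjust to match the flatpak you actually want to test with.
APP_ID=${APP_ID:-com.obsproject.Studio}   # assumed Flathub ID for OBS Studio
ROUNDS=${ROUNDS:-10}                      # number of install/uninstall rounds
FLATPAK=${FLATPAK:-flatpak}               # set to "echo" for a dry run

# Install and then uninstall the app ROUNDS times, stopping on the first error.
repro_loop() {
    i=1
    while [ "$i" -le "$ROUNDS" ]; do
        echo "round $i: install"
        $FLATPAK install -y flathub "$APP_ID" || return 1
        echo "round $i: uninstall"
        $FLATPAK uninstall -y "$APP_ID" || return 1
        i=$((i + 1))
    done
}
```

Calling repro_loop after setting FLATPAK=echo prints the commands instead of running them, which is a cheap way to check the loop before letting it touch the real installation.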
Comment 16 yizel7 2023-01-14 00:39:31 UTC
(In reply to Wyatt Childers from comment #10)
> @Ross and @yizel7@kulodgei.com are you both using btrfs as well, or
> different file systems?
> 
> I've noticed an increase in "wonky" IO scheduler behavior lately in general,
> use btrfs almost exclusively, and wonder if that's related.

btrfs. Sounds like we all have that in common. I just had a freeze for about 15 seconds just updating a small flatpak.
Comment 17 Gurenko Alex 2023-01-18 15:20:10 UTC
Interesting, I have the same problem for quite a while now, iirc it started actually with upgrade from F36 to F37, but what's even more interesting, I have this problem on my desktop (Ryzen 9 5900X + AMD GPU 6800XT and now 7900 XTX), but not on my Lenovo (intel based) laptop
Comment 18 Bernd Steinhauser 2023-01-19 15:48:05 UTC
Also affected by this running Exherbo Linux with btrfs as filesystem.
The screen freezes for about 15-20s when the shared mime database is updated during a package installation.
I'm sure it's an output freeze, since the digital clock (indicates seconds for me) stops as well.
I think it started somewhere around mid December, possibly with the change to 5.26.4, but I'm not sure.
Comment 19 David Edmundson 2023-01-20 16:32:47 UTC
I managed to reproduce and now have something actionable.

Installing a new flatpak potentially triggers a notification that there may be icons in a new directory (since https://invent.kde.org/frameworks/kded/-/merge_requests/21).

This causes every app to start re-searching all icon directories all at once.  That's our IO bottleneck and all this code is blocking the main thread.

I will make this bug about that. If there are other issues, we can make new bug reports once.
Comment 20 David Edmundson 2023-01-20 16:33:15 UTC
-I will make this bug about that. If there are other issues, we can make new bug reports once.
+I will make this bug about that. If there are other issues, we can make new bug reports once we have something else more concrete.
Comment 21 Patrick Silva 2023-01-20 22:44:17 UTC
On my Arch Linux sometimes Plasma hangs while pacman installs updates, not kwin.
I use ext4 file system. bug 463681 marked as duplicated is about Plasma.
Comment 22 David Edmundson 2023-01-21 10:23:20 UTC
*** Bug 464568 has been marked as a duplicate of this bug. ***
Comment 23 Bernd Steinhauser 2023-01-21 12:02:15 UTC
(In reply to Patrick Silva from comment #21)
> On my Arch Linux sometimes Plasma hangs while pacman installs updates, not
> kwin.
> I use ext4 file system. bug 463681 marked as duplicated is about Plasma.

How can you distinguish between plasma hanging and kwin hanging?
Comment 24 Patrick Silva 2023-01-21 12:37:59 UTC
(In reply to Bernd Steinhauser from comment #23)
> (In reply to Patrick Silva from comment #21)
> > On my Arch Linux sometimes Plasma hangs while pacman installs updates, not
> > kwin.
> > I use ext4 file system. bug 463681 marked as duplicated is about Plasma.
> 
> How can you distinguish between plasma hanging and kwin hanging?

When Plasma hangs but kwin does not, clicks on Plasma panel and on desktop have no effect, but I can move the mouse pointer and switch between windows by pressing alt+tab.
Comment 25 Bernd Steinhauser 2023-01-21 13:54:04 UTC
Ah ok. That's not the case for me. I can't switch windows; the whole thing hangs.
Comment 26 Patrick Silva 2023-01-23 17:32:32 UTC
Yesterday kwin_wayland and Plasma hung separately while pacman was installing 24 updates on my Arch Linux installed in an EXT4 partition. Definitely this bug is not BTRFS specific.
Comment 27 Nate Graham 2023-01-23 17:44:04 UTC
I've been experiencing Plasma hangs too, but unrelated to this icon issue. I've seen two recently:

#10 HistoryModel::insert(QSharedPointer<HistoryItem>) (this=0x2d40fa0, item=...)
    at /home/nate/kde/src/plasma-workspace/klipper/historymodel.cpp:135
#11 0x00007f049c8bb542 in History::insert(QSharedPointer<HistoryItem>)
    (this=<optimized out>, item=...) at /home/nate/kde/src/plasma-workspace/klipper/history.cpp:95
#12 0x00007f049c8a71fc in Klipper::applyClipChanges(QMimeData const*)
     (this=this@entry=0x286f660, clipData=clipData@entry=0x6fb1b40)
    at /home/nate/kde/src/plasma-workspace/klipper/klipper.cpp:687

and also

#11 0x00007f9310242500 in VolumeFeedback::play(unsigned int) (this=this@entry=0x26ca090, sinkIndex=<optimized out>) at /home/nate/kde/src/plasma-pa/src/qml/volumefeedback.cpp:54
#12 0x00007f9310210aef in VolumeFeedback::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) (_c=QMetaObject::InvokeMetaMethod, _id=0, _a=0x7ffd4ed7cfe0, _o=0x26ca090) at /home/nate/kde/build/plasma-pa/src/plasma-volume-declarative_autogen/CCBC4FUR7J/moc_volumefeedback.cpp:77
#13 VolumeFeedback::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) (_a=0x7ffd4ed7cfe0, _id=0, _c=QMetaObject::InvokeMetaMethod, _o=0x26ca090) at /home/nate/kde/build/plasma-pa/src/plasma-volume-declarative_autogen/CCBC4FUR7J/moc_volumefeedback.cpp:71


So let's not assume all hangs are the same. If Plasma is hanging, you can use Konsole to get a backtrace like this:

$ gdb -p $(pidof plasmashell)
(gdb) bt

Then please file it as a new bug. Thanks!
Comment 28 Wyatt Childers 2023-01-23 17:48:57 UTC
So, as an update to this, I've filed an upstream kernel bug against btrfs (https://bugzilla.kernel.org/show_bug.cgi?id=216961).

As the original reporter (and someone who has observed several other hangs related to btrfs), I believe at least a major factor is a regression in Kernel 6.0 with btrfs's scheduling. I think ideally, kwin shouldn't directly do much of anything that blocks on IO (as slower drives and bugs of this nature can severely impact user experience), so this should still be fixed. However, it definitely seems to be exacerbated by the severity of extremely long periods of disk sleep for "trivial" IO operations.
Comment 29 David Redondo 2023-01-24 11:10:05 UTC
Git commit b6a3e25e81014110f1e0f470832006cc60cbc86d by David Redondo.
Committed on 24/01/2023 at 09:52.
Pushed by davidre into branch 'master'.

Only recreate icons if an icon dir changed

Other paths that we are watching can end in "icons"
as we are watching subdirs. Make sure to not take the wrong code
path and only do an icon change if one of our watched icon dirs
changes.
FIXED-IN:5.103

M  +1    -1    src/kded.cpp

https://invent.kde.org/frameworks/kded/commit/b6a3e25e81014110f1e0f470832006cc60cbc86d
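
To illustrate the idea in the commit message, here is a rough shell sketch contrasting a suffix check with an exact match against the watched icon directories. It is not the kded code (that change is one line of C++ in src/kded.cpp); the function names and the example paths are hypothetical, and it assumes the earlier check looked only at the path suffix, which the commit message implies but does not state.

```shell
#!/bin/sh
# Looser check: any changed path that ends in "icons" counts as an icon change.
ends_in_icons() {
    case ${1%/} in
        *icons) return 0 ;;
        *) return 1 ;;
    esac
}

# Stricter check: succeed only if the changed path ($1) is exactly one of
# the watched icon directories (remaining arguments). Trailing slashes are ignored.
is_watched_icon_dir() {
    changed=${1%/}
    shift
    for dir in "$@"; do
        [ "$changed" = "${dir%/}" ] && return 0
    done
    return 1
}

# Hypothetical example: a watched subdirectory that merely ends in "icons".
changed_path=/usr/share/someapp/icons
if ends_in_icons "$changed_path"; then
    echo "suffix check: would treat $changed_path as an icon change"
fi
if ! is_watched_icon_dir "$changed_path" /usr/share/icons "$HOME/.local/share/icons"; then
    echo "exact check: $changed_path is not a watched icon dir, so no icon change"
fi
```

With the stricter check, a change under an unrelated subdirectory no longer makes every application re-scan its icon directories.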
Comment 30 David Redondo 2023-01-24 11:12:09 UTC
Git commit 23cb03267ae1b1cdb8a75be1992d2fbf122aaa6e by David Redondo.
Committed on 24/01/2023 at 11:12.
Pushed by davidre into branch 'cherry-pick-b6a3e25e'.

Only recreate icons if an icon dir changed

Other paths that we are watching can end in "icons"
as we are watching subdirs. Make sure to not take the wrong code
path and only do an icon change if one of our watched icon dirs
changes.
FIXED-IN:5.103


(cherry picked from commit b6a3e25e81014110f1e0f470832006cc60cbc86d)

M  +1    -1    src/kded.cpp

https://invent.kde.org/frameworks/kded/commit/23cb03267ae1b1cdb8a75be1992d2fbf122aaa6e
Comment 31 Nate Graham 2023-01-27 19:54:55 UTC
*** Bug 464867 has been marked as a duplicate of this bug. ***
Comment 32 Nicolas Fella 2023-02-10 00:05:40 UTC
*** Bug 465525 has been marked as a duplicate of this bug. ***
Comment 33 Wyatt Childers 2023-02-22 16:12:59 UTC
Cross posting here that if you're seeing this problem and your IO scheduler is "bfq", as a workaround you can switch to "kyber", "none", or "mq-deadline".

This should resolve most of the issue until you have 5.103 in your distro (and possibly some other issues you may be experiencing). Pop!_OS uses kyber, and I've seen much better results with it than "mq-deadline" personally; none was also very promising in my testing but I've opted for following what Pop!_OS did.
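
For anyone unsure which scheduler they are on, a small sketch like the one below can report it. It relies on the kernel's sysfs files at /sys/block/<device>/queue/scheduler, where the active scheduler is shown in square brackets; the function names and the nvme0n1 device in the comment are just examples.

```shell
#!/bin/sh
# Print the active scheduler from a sysfs scheduler line,
# e.g. "mq-deadline kyber [bfq] none" prints "bfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active scheduler for every block device that exposes one.
report_schedulers() {
    for f in /sys/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#/sys/block/}
        dev=${dev%%/*}
        printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
    done
}

report_schedulers

# To switch one device (nvme0n1 is only an example name), run as root:
#   echo kyber > /sys/block/nvme0n1/queue/scheduler
```

A change made this way only lasts until the next reboot, so it is easy to try the different schedulers and revert if things get worse.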
Comment 34 Patrick Silva 2023-03-02 20:25:14 UTC
I have just installed 80 packages with pacman on Arch Linux and this bug happened again, even the mouse pointer froze.
Can we reopen?