Bug 441077 - Plasmashell panels may freeze, usually with automounted NFS/SMB mounts
Status: CONFIRMED
Alias: None
Product: plasmashell
Classification: Plasma
Component: generic-performance
Version: 5.21.5
Platform: Other Linux
Importance: NOR normal
Target Milestone: 1.0
Assignee: Plasma Bugs List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-17 10:15 UTC by Kai Krakow
Modified: 2023-09-25 19:34 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
thread apply all bt (74.01 KB, text/plain)
2021-08-17 10:15 UTC, Kai Krakow
thread apply all bt full (127.07 KB, text/plain)
2021-08-17 10:16 UTC, Kai Krakow

Description Kai Krakow 2021-08-17 10:15:49 UTC
Created attachment 140790
thread apply all bt

SUMMARY
Plasmashell tends to hang or freeze completely when a mounted network file system temporarily stops responding. Most of the time, the panels recover after a few minutes, but sometimes they don't.

I've now seen a freeze that didn't recover: I had to SIGKILL plasmashell to restart it, and even `--replace` did not work. I'm not sure if a mounted file system was involved this time (see Additional Information below).

Note that the network mounts are completely unrelated to the panels: nothing on the panel even loads data from them.

In earlier versions this was also caused by Plasma widgets loading data from the network while the internet connection temporarily stalled. I've since removed all widgets that load data from the network (like the weather widget). But I believe this incident may be caused by Plasma's disk usage warning plugin when it tries to read disk usage data from a non-responding mount.

Conclusion: Plasma should not block, and it should probably wrap a timeout handler around queries that may stall for external reasons. However, this incident may be different.
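To illustrate the kind of timeout handler meant here, the following is a minimal Python sketch, not Plasma's actual code. It runs a potentially blocking call such as `os.statvfs` in a daemon thread and stops waiting after a deadline. The helper names `call_with_timeout` and `free_bytes` and the 2-second default are assumptions made for this example only.

```python
import os
import queue
import threading


def call_with_timeout(func, args=(), timeout=2.0):
    """Run func(*args) in a daemon thread and return (ok, value).

    ok is False if the call raised or did not finish within `timeout`
    seconds; value is then the exception describing the failure.
    """
    results = queue.Queue(maxsize=1)

    def worker():
        try:
            results.put((True, func(*args)))
        except Exception as exc:  # e.g. OSError from a stale mount
            results.put((False, exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        return results.get(timeout=timeout)
    except queue.Empty:
        # The worker may still be stuck in the kernel; the caller
        # simply stops waiting for it instead of blocking as well.
        return (False, TimeoutError(f"no answer within {timeout} s"))


def free_bytes(path, timeout=2.0):
    """Return the free bytes on the file system holding `path`, or None."""
    ok, value = call_with_timeout(os.statvfs, (path,), timeout)
    return value.f_bavail * value.f_frsize if ok else None
```

A caller would treat `None` as "unknown" and retry later. Note that this only keeps the caller responsive: a thread blocked on an unreachable mount cannot be cancelled and stays around until the mount answers or the process exits.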

A plasma dev may have better insights. I'll attach a short and full gdb backtrace I was able to fetch from the hanging pid.

Possible bug reports covering the mount-related issue but probably not a duplicate of this:
#413110, #416972, #272361

STEPS TO REPRODUCE
0. The backtrace may not be about this particular hang:
1. Use network mounts somewhere below your home directory
2. Cut the network connection to stall access to these mounts
3. Observe plasmashell freezing and recovering minutes later

OBSERVED RESULT
Plasma panels stop responding to clicks and freeze with whatever content they were rendering at the time.

EXPECTED RESULT
Plasma should detect timeouts and recover one way or another. As a last resort it could kill and restart itself, which helped in this particular case: I SIGKILLed it and restarted it.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Gentoo Linux
(available in About System)
KDE Plasma Version: 5.21.5
KDE Frameworks Version: 5.82.0
Qt Version: 5.15.2

ADDITIONAL INFORMATION
This particular freeze may be related to this dmesg entry:

[686470.331086] BUG: unable to handle page fault for address: 000000020ee80000
[686470.331091] #PF: supervisor write access in kernel mode
[686470.331092] #PF: error_code(0x0002) - not-present page
[686470.331093] PGD 0 P4D 0
[686470.331096] Oops: 0002 [#1] PREEMPT SMP
[686470.331098] CPU: 1 PID: 1919 Comm: QSGRenderThread Not tainted 5.10.57-gentoo #1
Comment 1 Kai Krakow 2021-08-17 10:16:20 UTC
Created attachment 140791
thread apply all bt full
Comment 2 Nate Graham 2021-08-17 16:12:28 UTC
In general, this is why KIO exists: to provide asynchronous, non-blocking access to flaky network resources. When you bypass KIO and mount them directly, you're undoing that protection.

What kinds of things have you manually mounted and why?
Comment 3 Bug Janitor Service 2021-10-16 08:42:36 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/plasma-desktop/-/merge_requests/603
Comment 4 Kai Krakow 2021-10-16 10:39:00 UTC
(In reply to Nate Graham from comment #2)
> In general, this is why KIO exists: to provide asynchronous, non-blocking
> access to flaky network resources. When you bypass KIO and mount them
> directly, you're undoing that protection.
> 
> What kinds of things have you manually mounted and why?

The mount in question is a Samba share mounted over VPN when I work remotely from home.

The problem with KIO is that it is not transparently available to all software. For example, when opening a document with LibreOffice, the file is copied to a temporary directory and then copied back on save/quit. This removes LibreOffice's ability to lock the file and detect concurrent changes to the source file. It also breaks the relation between the original file path and the path opened in the application. This is not what I would call transparency.

Also, non-KDE programs cannot even parse the paths: sometimes KIO paths are passed directly to the program instead of the file being copied to a local path first.

CLI utilities cannot even be used to navigate KIO mounts.

KIO should probably become a fuse module - which would then probably just introduce this exact reported behavior which KIO tries to avoid.

KIO is nice if you use KDE software exclusively - but that's quite unrealistic in practice. Also, it falls apart once you drop to the CLI level, which is what a lot of my workflow consists of.

I'm actually using KIO for things like managing files over sftp remotely via Dolphin - and it works great then. But its field of use is very limited to me.

Maybe comment #3 solves it for me.
Comment 5 Nate Graham 2021-10-18 14:24:40 UTC
That's what kio-fuse was meant to solve. Do you have it installed?
Comment 6 Méven Car 2021-10-18 16:36:33 UTC
Git commit a4c711a411f47e11c5327efa0dd40c12b26875e5 by Méven Car, on behalf of Fushan Wen.
Committed on 18/10/2021 at 16:34.
Pushed by meven into branch 'master'.

taskmanager: Use SkipMimeTypeFromContent flag when creating KFileItem

This prevents plasmashell from freezing at opening the context menu
when there is no network and there are files on a network mount in
"Recent Files" section.
Related: bug 443465, bug 406110

M  +2    -6    applets/taskmanager/plugin/backend.cpp

https://invent.kde.org/plasma/plasma-desktop/commit/a4c711a411f47e11c5327efa0dd40c12b26875e5
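The idea behind the fix can be shown with a short analogy in Python. This is not the KFileItem API or the code from the commit. It only demonstrates that a MIME type guessed from the file name needs no I/O on the file itself, so it cannot block on an unreachable mount, whereas sniffing the content requires opening the file. The function name `mime_from_name` and the fallback type are assumptions for this sketch.

```python
import mimetypes


def mime_from_name(path):
    """Guess a MIME type from the file name alone, without opening the file."""
    mime, _encoding = mimetypes.guess_type(path)
    # Fall back to a generic type when the extension is unknown.
    return mime or "application/octet-stream"
```

Calling `mime_from_name("/mnt/share/report.pdf")` inspects only the string, so it returns immediately even if `/mnt/share` is a stalled network mount. The trade-off is accuracy: files without a meaningful extension get the generic fallback.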
Comment 7 Kai Krakow 2021-10-18 21:20:35 UTC
(In reply to Nate Graham from comment #5)
> That's what kio-fuse was meant to solve. Do you have it installed?

Ah ok, I didn't know how it worked - and it had apparently been removed from the system earlier because of some conflict. I have now reinstalled it and reloaded my systemd daemon.

Observation: Opening files for non-KIO aware programs spawns a fuse mount. Looks like it doesn't do that for KIO-aware programs (at least I see no additional folders created in the mount directory).

Let's see how I can use that for me.

But I will still depend on NFS mounts from the local network, e.g. my backup repository is mounted via an automounter (daily full system backup with borg). Usually, the NAS is available 24x7, but sometimes it restarts for updates - and that's when plasmashell occasionally blocks. Why does it look at network mounts that have nothing to do with my user account, or do they? Is there any way to debug this when it happens?
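One possible starting point for debugging, sketched in Python under stated assumptions: list the network-like mounts the kernel currently knows about, so each one can be checked when a freeze occurs. The function name `network_mounts` and the set of file system types are assumptions for this example and are not taken from Plasma.

```python
# File system types treated as "network-like" here; adjust as needed.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3", "autofs", "fuse.sshfs"}


def network_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs for network-like file systems.

    `mounts_text` is expected in the format of /proc/self/mounts:
    device, mount point, type, options, dump, pass (whitespace separated).
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NETWORK_FS_TYPES:
            # The kernel escapes spaces in paths as the octal sequence \040.
            found.append((fields[1].replace("\\040", " "), fields[2]))
    return found
```

On Linux, passing it the contents of `/proc/self/mounts` (for example `network_mounts(open("/proc/self/mounts").read())`) yields the candidate mount points. Reading that file does not touch the mounts themselves, so it should not stall even when a server is unreachable.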
Comment 8 Nate Graham 2021-10-19 18:36:09 UTC
I don't know, sorry.

If you configure Borg with a nfs:// URL, hopefully kio-fuse will notice and automatically mount a fuse mount underneath it and everything will be nice and transparent.

That should work. If it doesn't please file a bug report to kio-fuse. :)
Comment 9 Kai Krakow 2021-10-19 18:42:47 UTC
I'm not sure how that could even work.

(a) Borg runs in a different session context: it runs as a systemd system service, while kio-fuse is a user-session D-Bus service

(b) this would imply that I could use `ls nfs://serverip/` which I obviously cannot (I tried, spoiler: doesn't work)

Maybe that could be solved if I knew how KDE/KIO decides whether to use KIO and when to fall back to KIO fuse. How does this heuristic work?
Comment 10 Kai Krakow 2021-10-19 18:46:38 UTC
OTOH, I could try migrating to a borg client/server model instead of opening the repository via filesystem directly. This also decouples some security concerns as the borg service limits access patterns to those needed for repository access.
Comment 11 Riccardo Robecchi 2021-12-13 11:07:48 UTC
I can confirm this bug, although in my case Plasmashell hangs and never recovers. Even restarting it has no effect other than making Plasma freeze as soon as it starts loading, leaving me with a non-functional desktop. I can reproduce it when a mounted NFS share becomes unavailable.
Comment 12 Kai Krakow 2023-05-02 18:18:47 UTC
I found that automounts are at least part of the trigger. With automounts enabled, right-clicking on Dolphin window icons or starting konsole (probably any app that is somehow involved in checking mount point stats) can stall either Plasma (when a context menu is displayed) or konsole startup for several seconds, up to a minute.

Interestingly, just using "cd" to switch to a directory containing an automount point and running "ls" returns almost instantly and shows the directory.

This affects both NFS and local automounts for me but also remote filesystems mounted via kio (and the latter seem to contribute a lot more to the stalls).

Once the NFS automounts are mounted, konsole no longer stalls (KIO data is probably cached by then), but the Dolphin window icon context menu is still affected and blocks Plasma while it is stalled.

On a second system which is configured almost identically, the same behavior cannot be observed or it is much less pronounced.