Bug 477805 - Service start timeout is ridiculously short
Summary: Service start timeout is ridiculously short
Status: REOPENED
Alias: None
Product: policykit-kde-agent-1
Classification: Plasma
Component: general
Version: 5.27.9
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: David Edmundson
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-30 22:36 UTC by Stefan Brüns
Modified: 2024-01-08 15:49 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Description Stefan Brüns 2023-11-30 22:36:20 UTC
SUMMARY

The start timeout is 5 seconds of wall-clock time, which is far too short in an emulated virtual machine, especially during session start, when many services are causing CPU and disk load.

STEPS TO REPRODUCE
1. Create an emulated virtual machine, e.g. aarch64 on x86_64
2. (install/use a live distribution)
3. Start a Plasma session

OBSERVED RESULT

polkit-kde-agent-1 is started, killed after 5 seconds, started again, killed again, and so on.

This can happen 100 times in a row. Sometimes other Plasma services run into the same problem, exacerbating it.

EXPECTED RESULT

Each service is started just once, however fast or slow the system permits.

SOFTWARE/OS VERSIONS
KDE Plasma Version:  5.27.9
KDE Frameworks Version: 5.112.0
Qt Version: 5.15

ADDITIONAL INFORMATION

https://invent.kde.org/plasma/plasma-workspace/-/wikis/Plasma-and-the-systemd-boot

> The default timeout between quit signal and killing is ridiculously long

The "quit" timeout should be set with "TimeoutStopSec", not with "TimeoutSec". Gnome services set this correctly; all Plasma services set it wrong.
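
To illustrate the distinction, here is a minimal sketch. The unit files themselves are not quoted in this report, so the fragment below is hypothetical and the 5-second value is taken only from the description above. In systemd, TimeoutSec= is shorthand that sets both TimeoutStartSec= and TimeoutStopSec=, so a unit that only means to limit shutdown time ends up limiting startup time as well.

```ini
# Hypothetical [Service] fragment, not copied from any Plasma unit file.

[Service]
# This form would set BOTH the start and the stop timeout to 5 seconds:
# TimeoutSec=5sec

# This form limits only the time allowed for stopping; the start timeout
# then falls back to the manager default (DefaultTimeoutStartSec=).
TimeoutStopSec=5sec
```

With the second form, a slow start is no longer cut off after 5 seconds, while a hung process is still killed promptly when the session shuts down.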
Comment 1 Stefan Brüns 2023-11-30 23:26:12 UTC
Also see https://bbs.archlinux.org/viewtopic.php?id=289979
Comment 2 Bug Janitor Service 2023-12-01 15:50:06 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/polkit-kde-agent-1/-/merge_requests/35
Comment 3 Stefan Brüns 2023-12-04 17:09:32 UTC
Git commit 094b64cf3d50d2b0cc74da29c59e5829cf5e9747 by Stefan Brüns.
Committed on 04/12/2023 at 18:09.
Pushed by bruns into branch 'master'.

Extend service start timeout

On a slow machine the service may take longer than 5 seconds (walltime) to start,
especially during session start when multiple processes are competing for
CPU time and I/O.

The original reasoning for the timeout is to limit the time to quit the process
when the session is shut down, but that can be set with TimeoutStopSec.

Remove the start timeout, as it is questionable whether it is useful at all. It only
helps if there is some deadlock during startup which would be cured by a
fresh start, otherwise it is just consuming CPU time again and again.

For comparison, Gnome does not use Timeout[Start]Sec at all, but
only TimeoutStopSec.

M  +1    -1    plasma-polkit-agent.service.in

https://invent.kde.org/plasma/polkit-kde-agent-1/-/commit/094b64cf3d50d2b0cc74da29c59e5829cf5e9747
Comment 4 Bug Janitor Service 2023-12-04 17:15:30 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/powerdevil/-/merge_requests/284
Comment 5 Stefan Brüns 2023-12-10 04:48:10 UTC
Git commit 2ae70fd9c55fc85275c81ea96f5043a793628f79 by Stefan Brüns.
Committed on 10/12/2023 at 05:46.
Pushed by bruns into branch 'master'.

Remove service start timeout

On a slow machine the service may take longer than 5 seconds (walltime) to start,
especially during session start when multiple processes are competing for
CPU time and I/O.

The original reasoning for the timeout is to limit the time to quit the process
when the session is shut down, but that can be set with TimeoutStopSec.

Whether the start timeout is useful at all is questionable. It only helps if
there is some deadlock during startup which would be cured by a fresh
start, otherwise it is just consuming CPU time again and again.

For comparison, Gnome does not use Timeout[Start]Sec at all, but
only TimeoutStopSec.

M  +1    -1    daemon/plasma-powerdevil.service.in

https://invent.kde.org/plasma/powerdevil/-/commit/2ae70fd9c55fc85275c81ea96f5043a793628f79
Comment 6 Kevin Kofler 2024-01-08 15:48:04 UTC
Unfortunately, at least the PowerDevil fix (see comment #5) was not backported to 5.27, and I have just hit this issue on my PinePhone with PowerDevil. The symptoms:
* I rebooted the PinePhone several times; each time, the screen brightness slider did nothing (it had worked just hours before, and I had not upgraded any package in the meantime, though the trigger might have been last week's package upgrades).
* So I tried the org.kde.Solid.PowerManagement.Actions.BrightnessControl D-Bus interface through the qdbus CLI. That worked, magically. Huh?
* So I looked into journalctl -b and noticed that PowerDevil was failing to start due to the 5 second timeout, getting restarted several times, and eventually starting successfully. So when I checked systemctl status, it was actually running, but the UI (the Plasma Mobile quicksettings drawer) was not aware of that: it had failed to connect to PowerDevil on startup and did not retry (which is arguably a bug in the UI and should probably be reported separately).

So I copied the unit file to /etc/systemd/user/plasma-powerdevil.service and increased the timeout to 10 seconds, and everything started working again. Then I wanted to file a bug, and the duplicate search found this one. I have not yet tested the upstream fix (the one that went into master), but I expect that to work, too.

So can we please backport the PowerDevil change to 5.27?
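
For anyone else on 5.27, the workaround described in this comment could look roughly like the sketch below. This is an illustration under assumptions: the comment only says that the unit file was copied to /etc/systemd/user/plasma-powerdevil.service and the timeout raised to 10 seconds. The drop-in path, the file name override.conf, and the choice of TimeoutStartSec= are assumptions made here so that only one setting is overridden and the packaged unit is not duplicated.

```ini
# Hypothetical drop-in file (path and name are illustrative):
#   /etc/systemd/user/plasma-powerdevil.service.d/override.conf

[Service]
# Raise only the start timeout. Drop-ins are read after the packaged
# unit, so this assignment takes precedence over the packaged value.
TimeoutStartSec=10sec
```

After adding such a file, running systemctl --user daemon-reload makes the user manager re-read its unit files. The stop timeout from the packaged unit is left untouched in this sketch.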
Comment 7 Kevin Kofler 2024-01-08 15:49:56 UTC
Now that I think of it, I had already seen that issue at times in the past, but back then, a reboot was enough to fix it. This time, several reboots did not help, only increasing the timeout did.