Bug 392768

Summary: KIO::AccessManager keeps ioslave spinning indefinitely
Product: [Frameworks and Libraries] frameworks-kio Reporter: Kai Uwe Broulik <kde>
Component: generalAssignee: David Faure <faure>
Status: RESOLVED FIXED    
Severity: normal CC: andysem, jgoodwin, jkillius, kdelibs-bugs, kfunk, mdione, nate
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Linux   
URL: https://phabricator.kde.org/D26364
Latest Commit: Version Fixed In: 5.67
Attachments: Flamegraph of kio http.so in affected state
Patch to make KIO process more events.
Patch to always set nextTimeoutMsecs to at least 1s.
Invert wait duration subtraction

Description Kai Uwe Broulik 2018-04-05 18:41:34 UTC
When using KIO AccessManager in QML, an Image loading from http results in http.so lingering around consuming 4-8% CPU on one core indefinitely, significantly reducing battery life.

    Image {
        source: "https://www.kde.org/content/home/plasma-5-12-lts-logo.jpg"
    }

is enough to trigger this
Comment 1 Julien Goodwin 2018-07-17 11:35:18 UTC
Created attachment 113986 [details]
Flamegraph of kio http.so in affected state
Comment 2 Julien Goodwin 2018-07-17 11:43:07 UTC
This has been affecting me for a few months (I run choqok & akgregator, both of which make a heap of background HTTP requests), not sure if it's particularly helpful, but after figuring out the magic to get a sane profile out of linux perf it shows the leaked time is spent in the QCoreApplication::sendPostedEvents call at the end of the main loop in SlaveBase::dispatchLoop.

Some quick reading of the code (as a non-QT dev) makes me wonder if the postEventList is getting filled with messages that, for whatever reason, *can't* be delivered.
Comment 3 Julien Goodwin 2019-05-09 03:00:02 UTC
Created attachment 119923 [details]
Patch to make KIO process more events.

I ran a copy of KIO with the attached patch for a while, and while it certainly helped reduce the leak, it didn't fully resolve the issue.
Comment 4 Nate Graham 2019-05-09 03:12:44 UTC
It might be worth posting your patch for comment on https://phabricator.kde.org, even if you know it doesn't fully resolve the issue. Over there, it's more likely to be seen by developers, who can offer you suggestions for improvement.

See https://community.kde.org/Infrastructure/Phabricator#Posting_a_Patch_using_the_website
Comment 5 Kevin Funk 2019-12-31 11:25:18 UTC
*** Bug 411047 has been marked as a duplicate of this bug. ***
Comment 6 Kevin Funk 2019-12-31 11:25:56 UTC
Also see my observations in https://bugs.kde.org/show_bug.cgi?id=411047#c6
Comment 7 Julien Goodwin 2020-01-01 10:07:23 UTC
TL;DR of Kevin's notes in 411047#c6:

"We're apparently excessively calling QAbstractSocket::waitForReadyRead(msecs=1) in a loop which causes high CPU usage." ... This is called by ConnectionBackend::waitForIncomingTask which just passes down the timeout value from its caller.

That caller is SlaveBase::dispatchLoop, which gets its value from:
ms = qMax<int>(d->nextTimeout.elapsed() - d->nextTimeoutMsecs, 1);

Which should be sane enough assuming nextTimeoutMsecs is non-zero (although personally I think the way the timeouts are done here is silly, a read if available with no wait would be far more sensible with any sleeping being higher up)

nextTimeoutMsecs is only ever set if SlaveBase::setTimeoutSpecialCommand is called, otherwise it's zero; notably if setTimeoutSpecialCommand is called with *exactly* zero it gets set to 1000, positive inputs do the obvious, but negative (to cancel) do not do similarly.

setTimeoutSpecialCommand is called from HTTPProtocol::httpClose & HTTPProtocol::httpCloseConnection

So if somehow they're not called at the end of a request that would leak connections?
Comment 8 Julien Goodwin 2020-01-02 07:59:30 UTC
Created attachment 124841 [details]
Patch to always set nextTimeoutMsecs to at least 1s.

I did a quick build (of the current Debian package), with just the trivial attached patch that should ensure nextTimeoutMsecs is always at least 1 second.

This resulted in no notable changes with my usage, but did point me to another probable error, doing a fresh build at the moment.
Comment 9 Julien Goodwin 2020-01-02 08:17:07 UTC
Created attachment 124842 [details]
Invert wait duration subtraction

Will need to let it soak to be sure, but it looks like this is at least one major part of the fix, I even called out it in comment 7, but missed the error:

ms = qMax<int>(d->nextTimeout.elapsed() - d->nextTimeoutMsecs, 1);

Will mean the sleep is for as long as the timer has run *minus* the intended duration, so if nextTimeoutMsecs is ever set, this becomes very negative, and 1ms is the result.

Inverting the subtraction:
ms = qMax<int>(d->nextTimeoutMsecs - d->nextTimeout.elapsed(), 1);

Means sleeping for the remaining time, and so far my CPU seems much happier, with my KIO-HTTP using apps looking fine.
Comment 10 Kevin Funk 2020-01-02 12:56:05 UTC
I think the last patch makes sense. Looks like the subtraction was the wrong way around.

Would you mind putting that patch up to Phabricator so it can be reviewed? Or shall I do it?

See: https://community.kde.org/Infrastructure/Phabricator
Comment 11 Julien Goodwin 2020-01-02 13:06:03 UTC
Kevin - 
I'm not set up for Phabricator, if you're happy to pass the patch on it'd save me setting up an account just for that.
Comment 12 Kevin Funk 2020-01-02 13:25:41 UTC
Patch @ https://phabricator.kde.org/D26364
Comment 13 Kevin Funk 2020-01-02 13:31:02 UTC
*** Bug 400155 has been marked as a duplicate of this bug. ***
Comment 14 Jan Killius 2020-01-02 17:37:20 UTC
The patch fixed the problem with cover art in the media player from spotify.
Comment 15 Kevin Funk 2020-01-09 14:18:09 UTC
Git commit 27271405f4085bd8a398d41a75ac52cb4f857efd by Kevin Funk, on behalf of Julien Goodwin.
Committed on 09/01/2020 at 14:17.
Pushed by kfunk into branch 'master'.

SlaveBase::dispatchLoop: Fix timeout calculation

Summary:
Old version of the code:
  ms = qMax<int>(d->nextTimeout.elapsed() - d->nextTimeoutMsecs, 1);

... will mean the sleep is for as long as the timer has run *minus* the
intended duration, so if nextTimeoutMsecs is ever set and the timer just
started this becomes very negative, and 1ms is the result.

Inverting the subtraction:
  ms = qMax<int>(d->nextTimeoutMsecs - d->nextTimeout.elapsed(), 1);

Means sleeping for the remaining time, and so far my CPU seems much
happier, with my KIO-HTTP using apps looking fine.

Reviewers: chinmoyr, davidedmundson, dfaure, broulik

Reviewed By: dfaure

Subscribers: ngraham, kde-frameworks-devel

Tags: #frameworks

Differential Revision: https://phabricator.kde.org/D26364

M  +1    -1    src/core/slavebase.cpp

https://commits.kde.org/kio/27271405f4085bd8a398d41a75ac52cb4f857efd
Comment 16 Kai Uwe Broulik 2020-02-03 08:59:42 UTC
*** Bug 417020 has been marked as a duplicate of this bug. ***