Bug 247204

Summary: KIO::listRecursive() spawns excessive number of kio_file slaves
Product: [Unmaintained] kio
Reporter: Maciej Mrozowski <reavertm>
Component: file
Assignee: David Faure <faure>
Status: RESOLVED FIXED    
Severity: normal
CC: adawit
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Gentoo Packages   
OS: Linux   
Latest Commit:
Version Fixed In: 4.7.1
Sentry Crash Report:

Description Maciej Mrozowski 2010-08-10 04:55:11 UTC
Version:           unspecified (using KDE 4.5.0) 
OS:                Linux

KIO::listRecursive() doesn't seem to put any constraint on the number of spawned slaves. This results in excessive forking when processing a large directory with multiple levels and many files.

Reproducible: Always

Steps to Reproduce:
1. Open kfind, search for some phrase in files (in some big directory like kdelibs source code)
2. Observe number of kio_file slaves created at the same time.

Actual Results:  
Multiple slaves are spawned (possibly an unbounded number, depending only on the processed directory), causing heavy I/O.

Expected Results:  
The number of slaves should be deterministic and, most importantly, should not cause heavy I/O.

Qt-4.6.3, KDE SC 4.5 (svn branch), Also reported by some KDE SC 4.4.5 user.
Comment 1 Dawit Alemayehu 2011-08-10 20:16:52 UTC
You can change this behavior yourself. Simply lower the maximum number of file ioslaves that can be spawned from the current maximum of 50 to whatever value is acceptable to you in file.protocol. The property you want to change is "maxInstances=".

We can cut this down to a lower value by default, but then people would complain that it takes too long to list directories in multiple tabs. That is precisely why the default value is set to something large.
Comment 2 Maciej Mrozowski 2011-08-12 17:44:55 UTC
What kind of maximum is this 50? Is it maximum number of ioslaves that can be spawned for currently processed directory (unlikely, ListJobPrivate::newJob or ListJobPrivate::slotListEntries doesn't indicate imposing any limits) or total (user session global) maximum spawnable ioslaves?

If the former, then there's no maxInstances value that could provide deterministic I/O utilization. If the latter, then.... does it mean it's not possible to traverse >maxInstances directory levels deep then?

> We can cut this value down to a lower value by default, but then people would
> complain it is taking too long to list directories in multiple tabs. That is
> precisely why the default value is set to something large.

Uhmm, listRecursive doesn't work in an environment that is CPU bound. It certainly does in one that's I/O bound. That being said, happily spawning an excessive number of threads/processes is not going to help.
And what it really does is render the system barely responsive.

Dolphin for that matter doesn't seem to be using listRecursive for "everyday's work" (only in ApplyViewPropsJob).

The root cause is not maxInstances, but uncontrolled ioslave spam by listRecursive() (sic! because it depends exclusively on input data = MAX(maxInstances, number_of_subdirectories_in_input_path)) and the clean, simple but painfully generic implementation of ListJobPrivate.

Now, if only there was a way not to spawn ioslaves only to process subdirectories (and just process them within current job/thread or to have some private fixed job pool)..
Comment 3 Dawit Alemayehu 2011-08-12 20:25:12 UTC
(In reply to comment #2)
> What kind of maximum is this 50? Is it maximum number of ioslaves that can be
> spawned for currently processed directory (unlikely, ListJobPrivate::newJob or
> ListJobPrivate::slotListEntries doesn't indicate imposing any limits) or total
> (user session global) maximum spawnable ioslaves?
> 
> If the former, then there's no maxInstances value that could provide
> deterministic I/O utilization. If the latter, then.... does it mean it's not
> possible to traverse >maxInstances directory levels deep then?

Neither. The maximum instance property applies to KIO::Scheduler. Every app that uses KIO gets its own single scheduler and as such the limit is per application.

> > We can cut this value down to a lower value by default, but then people would
> > complain it is taking too long to list directories in multiple tabs. That is
> > precisely why the default value is set to something large.
> 
> Uhmm, listRecursive doesn't work in an environment that is CPU bound. It certainly
> does in one that's I/O bound. That being said, happily spawning an excessive
> number of threads/processes is not going to help.
> And what it really does is render the system barely responsive.
> 
> Dolphin for that matter doesn't seem to be using listRecursive for "everyday's
> work" (only in ApplyViewPropsJob).
> 
> The root cause is not maxInstances, but uncontrolled (sic! because depending
> exclusively on input data = MAX(maxInstances, number_of_subdirectories_in_input
> path) ioslave spam by listRecursive() and clean, simple but painfully generic
> implementation of ListJobPrivate.

Unless listRecursive does job management by itself, which it does not, the "data = MAX(maxInstances, number_of_subdirectories_in_input_path)" case will NEVER happen. The scheduler is the only thing that can create new ioslaves.

But do not take my word for it. Simply change the value and test this for yourself. I did before I responded here. Exactly 50 ioslaves, the maximum allowed, are spawned when I perform the type of search you mentioned above. Love it or hate it, by design KIO is mostly I/O or memory bound unless something is horribly wrong with one of the ioslaves. So you have the flexibility to configure that. Simply change the value and see what happens.
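
The cap described above can be pictured with a toy model. The sketch is not KIO code: Tree, Stats and simulateRecursiveList are hypothetical names, and the round-based scheduling is a simplifying assumption. It only shows that when a single scheduler starts at most maxInstances jobs at a time, the peak number of concurrent workers is bounded by the smaller of maxInstances and the number of queued directories.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory directory tree: path -> immediate subdirectories.
using Tree = std::map<std::string, std::vector<std::string>>;

struct Stats {
    std::size_t dirsListed = 0;   // directories listed in total
    std::size_t peakWorkers = 0;  // most list jobs running in any one round
};

// Simulates a recursive listing in rounds. Every listed directory queues one
// job per subdirectory, but the scheduler starts at most `maxInstances` jobs
// per round. A cap of 0 is treated as 1 so the simulation always terminates.
Stats simulateRecursiveList(const Tree& tree, const std::string& root,
                            std::size_t maxInstances) {
    const std::size_t cap = std::max<std::size_t>(1, maxInstances);
    Stats stats;
    std::deque<std::string> pending{root};

    while (!pending.empty()) {
        const std::size_t running = std::min(cap, pending.size());
        stats.peakWorkers = std::max(stats.peakWorkers, running);

        // Take the jobs that are allowed to run in this round.
        const auto last = pending.begin() + static_cast<std::ptrdiff_t>(running);
        const std::vector<std::string> batch(pending.begin(), last);
        pending.erase(pending.begin(), last);

        for (const std::string& dir : batch) {
            ++stats.dirsListed;
            const auto it = tree.find(dir);
            if (it != tree.end())
                pending.insert(pending.end(), it->second.begin(), it->second.end());
        }
    }
    return stats;
}
```

In this model, for a root with three subdirectories (one of which holds two more), a cap of 50 gives a peak of 3 workers and a cap of 2 gives a peak of 2, while the number of directories listed is 6 either way.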

> Now, if only there was a way not to spawn ioslaves only to process
> subdirectories (and just process them within current job/thread or to have some
> private fixed job pool)..

There is, as I have already mentioned. I have absolutely no clue why you think that is not the case. You have the power to control the number of ioslaves that can be spawned by a given application. Granted, how to do that is neither well documented nor easy to find. Still, it is a matter of reducing a single value in a text file.
Comment 4 Maciej Mrozowski 2011-08-12 23:07:10 UTC
> But, do not take my word for it. Simply change the value and test this for
> yourself. I did before I responded here. Exactly 50 ioslaves, the maximum
> allowed, are spawned when I perform the type of search you mentioned above.
> Love it or hate it, by design KIO is mostly I/O or memory bound unless
> something is horribly wrong with one of the ioslaves. So you have the
> flexibility to configure that. Simply change the value and see what happens.

Indeed, reducing maxInstances to 2 (or even 1) does wonders in kfind case.

I'd like to go back to this for a while:

> We can cut this value down to a lower value by default, but then people would
> complain it is taking too long to list directories in multiple tabs. That is
> precisely why the default value is set to something large.

What directory listing in multiple tabs do you have in mind here? If dolphin, then I've already verified it does NOT use listRecursive so nobody would notice.

Especially when maxInstances is per application, I think 50 is waaay too large to be left as the default. I mean, come on, there is a bunch of services/components hogging I/O already (nepomukservices, akonadi, the whole KConfig thing).

Try maxInstances=2 for yourself.
Comment 5 Dawit Alemayehu 2011-08-13 06:07:19 UTC
Git commit d83dd99ae1d147eda1949f1aa6c1810f4c759caa by Dawit Alemayehu.
Committed on 13/08/2011 at 07:18.
Pushed by adawit into branch 'KDE/4.7'.

Reduce the number of file ioslaves that can be spawned per app to a reasonable
value. This was changed in rev 1eef3d0 and was forgotten to be reverted back
once the new scheduler landed in KDE 4.5.

BUG: 247204

M  +1    -1    kioslave/file/file.protocol

http://commits.kde.org/kdelibs/d83dd99ae1d147eda1949f1aa6c1810f4c759caa
Comment 6 Dawit Alemayehu 2011-08-16 14:20:27 UTC
Git commit 7b060a830c1a7db48dfccc6cecb63b62e3892e6c by Dawit Alemayehu.
Committed on 13/08/2011 at 07:18.
Pushed by adawit into branch 'frameworks'.

Reduce the number of file ioslaves that can be spawned per app to a reasonable
value. This was changed in rev 1eef3d0 and was forgotten to be reverted back
once the new scheduler landed in KDE 4.5.

BUG: 247204

M  +1    -1    kioslave/file/file.protocol

http://commits.kde.org/kdelibs/7b060a830c1a7db48dfccc6cecb63b62e3892e6c