Bug 157266

Summary: It takes very long to load directories with many files in it recursively
Product: [Applications] dolphin Reporter: Frederik Schwarzer <schwarzer>
Component: general Assignee: Peter Penz <peter.penz19>
Status: RESOLVED FIXED    
Severity: normal CC: faure, finex
Priority: NOR    
Version: 16.12.2   
Target Milestone: ---   
Platform: Debian testing   
OS: Linux   
Attachments: kdirmodel.cpp.diff

Description Frederik Schwarzer 2008-02-06 16:57:15 UTC
Version:           1.0 using KDE 4.0.1 (using KDE 3.5.8)
Installed from:    Debian testing/unstable Packages

I have the following directory structure (simplified obviously ;)):

foo/a/x/1
foo/a/x/2
foo/a/x/3
foo/a/x/4
foo/a/y/1
foo/a/y/2
foo/a/y/3
foo/a/y/4
foo/a/z/1
foo/a/z/2
foo/a/z/3
foo/a/z/4
...

So, "a" contains about 20 directories, which themselves contain about 20 directories each.
Every one of those dirs (the numbered ones) has about 20000 to 50000 files in it.

It takes long to load those dirs containing the files. But that's not surprising.
The strange thing is that it takes even longer to load the dir "a".
I mean, loading the dirs on the "x", "y", "z" level counts the entries in the numbered dirs, but why does loading "a" take so long?
Loading "foo", which besides "a" contains a bunch of other "normal" directories and files, is all right.
Comment 1 Peter Penz 2008-02-06 17:28:10 UTC
Thanks for the detailed description, I was not aware of this performance issue. I've added David Faure to CC, as I cannot explain why the number of sub directories influences the loading time of the directory "a", which contains only 20 directories. Maybe the part to blame is that for each directory the "Size" column is filled with the number of sub items inside this directory. My guess would be that this information should already be part of the directory node, but maybe some more expensive things are done. I'll check this when I find some time to do some Dolphin work this week, but maybe David already has a hint.
Comment 2 Peter Penz 2008-02-07 09:18:31 UTC
@David: I did some tests and the performance issue only occurs if the "Size" column is shown (e.g. in the details view). I had a look at KDirModel::data() and I guess the bottleneck is in the ChildCountRole case:

case ChildCountRole:
if (!item.isDir())
    return ChildCountUnknown;
else {
    KDirModelDirNode* dirNode = static_cast<KDirModelDirNode *>(node);
    int count = dirNode->childCount();
    if (count == ChildCountUnknown && item.isReadable()) {
        const QString path = item.localPath();
        if (!path.isEmpty()) {
            QDir dir(path);
            count = dir.entryList(QDir::AllEntries|QDir::NoDotAndDotDot|QDir::System).count();
            //kDebug(7008) << "child count for " << path << ":" << count;
            dirNode->setChildCount(count);
        }
    }
    return count;
}

QDir::entryList() returns a string list of all (!) directory entries, and we just need the number of entries... Isn't there a faster way of getting that number (ls -l, for example, surely does not need to do this)? My guess is that QDir::count() should be fixed, but maybe there is a straightforward way to bypass this performance issue, at least on Unix systems, by asking the filesystem directly instead of using QDir?
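
To illustrate the "ask the filesystem directly" idea, here is a minimal sketch using the POSIX opendir()/readdir() calls. It only shows the technique and is not necessarily what any patch attached to this report does. The helper name countDirEntries is hypothetical and not part of KDirModel. Note one assumed difference in behaviour: this sketch also counts hidden entries, which QDir::AllEntries leaves out unless QDir::Hidden is set.

```cpp
#include <cassert>
#include <cstring>
#include <dirent.h>

// Hypothetical helper (not part of KDirModel): counts the entries of a
// directory by iterating over them, without building a list of names.
// Returns -1 if the directory cannot be opened.
int countDirEntries(const char* path)
{
    assert(path != nullptr);

    DIR* dir = ::opendir(path);
    if (!dir) {
        return -1;
    }

    int count = 0;
    while (const dirent* entry = ::readdir(dir)) {
        // Skip "." and "..", similar to QDir::NoDotAndDotDot.
        if (std::strcmp(entry->d_name, ".") == 0 ||
            std::strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        ++count;
    }

    ::closedir(dir);
    return count;
}
```

In the ChildCountRole case, such a helper could be called with the encoded local path (for example QFile::encodeName(path).constData()) in place of the QDir::entryList() call. A -1 result would have to be mapped to ChildCountUnknown.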


Comment 3 David Faure 2008-02-07 14:26:02 UTC
Does this work faster?
(I tested that it works, but I'll let you check if it helps in terms of performance)


Created an attachment (id=23462)
kdirmodel.cpp.diff
Comment 4 Peter Penz 2008-02-07 14:37:26 UTC
Thanks David for the patch. I don't have time until the weekend for testing it but I'll respond as soon as I've some numbers...
Comment 5 Peter Penz 2008-02-08 17:03:03 UTC
David, I've tested your patch and it makes it a lot faster. Previously it took around 2 seconds until my test folders appeared, now the folder is shown immediately (at least I can see no performance difference in comparison with a folder having no sub directories).
Comment 6 David Faure 2008-02-12 14:13:15 UTC
SVN commit 774075 by dfaure:

Much faster implementation of ChildCountRole. Peter tested it and said:
"Previously it took around 2 seconds until my test folders appeared, now the folder is shown immediately (at least I can see no performance difference in comparison with a folder having no sub directories)."
BUG: 157266


 M  +19 -0     kdirmodel.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=774075