Summary: | folders indexed on every start that did not change | ||
---|---|---|---|
Product: | nepomuk | Reporter: | S. Burmeister <sven.burmeister> |
Component: | general | Assignee: | Sebastian Trueg <sebastian> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | l.lunak, mitchell, mutlu_inek, tassilo, trueg, wstephenson |
Priority: | NOR | ||
Version: | 4.1 | ||
Target Milestone: | --- | ||
Platform: | openSUSE | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: |
Description
S. Burmeister
2009-01-27 08:08:37 UTC
I also experience high I/O load after KDE startup. Sometimes Nepomuk (with Strigi enabled) browses folders which had recent changes. Very often, however, it crawls folders which I have not touched in years. Their data should have been indexed long ago. On my machine, this is not related to external storage as all data is on the local hard drive. Since Amarok seems to get this right, maybe they can give some hints on how they track every change without having to re-scan all folders. The original statement by Sebastian was that there is no other way to track every kind of change but to re-scan all folders.

Well, I have come to believe that proper use of inotify actually solves this issue. I have written about my findings in another bug report for Nepomuk. See my post here: https://bugs.kde.org/show_bug.cgi?id=196402#c11
With regard to Amarok, it seems to me that they simply check the mtime of directories to see whether they contain files that need to be rescanned. See this blog post: http://blog.jefferai.org/2009/10/14/speed-never-gets-old-at-least-in-software-1129

Yes -- we check directory mtimes, which generally works pretty well except for filesystems that don't have/properly update mtimes :-| We could hook into inotify, and we've explored that in the past, but it'd be a Linux-specific thing (and it brings some other complexities into the works). I can provide more details if anyone wishes -- the way that the changes in that blog post were made is that we now give the collection scanner the mtimes of the directories instead of just the directory list itself, which allows us to skip subfolders that haven't changed.

Checking the dir's mtime is not enough since that does not change if a file in the dir changes. I see no other way than scanning all folders for changes since a lot could have changed while Nepomuk was not running.

Sorry, I thought this complaint was about Amarok. For Nepomuk, I agree, checking the dir's mtime isn't enough.
It's a currently-acceptable (although less than ideal) situation for Amarok.

Can I close this bug? After all, there is no other way to make sure we get all new files than running through all folders and checking all files.

Sure, and a related suggestion: set the default set of folders to index to ~/Documents and below instead of ~.

Well, the problem does exist. And just because there isn't a good solution right now doesn't mean there can't be one. There is one SUSE kernel developer who has a kernel patch that would help with this problem, I just need to make him finally finish and submit it.

I don't see how a kernel patch can help here. While Nepomuk is not running a lot could happen and we need to find these changes, too. The best inotify replacement won't help for that scenario. As for ~/Documents: can't you do that via a global configuration file for SuSE?

I'm not talking about an inotify replacement, am I? Anyway, unless you know a kernel developer with some spare time, I'll get back here as soon as the feature is usable.

@Lubos: I have no idea what you are talking about. You only wrote "a kernel patch that would help with this problem". Since I could not think of a kernel patch that would help with the initial indexing problem, I thought you meant the monitoring of file operations. Care to spare a few details?

The idea is basically a kind of recursive mtime. When something changes, the flag propagates all the way up. So when checking a directory tree, recurse only into the parts where the flag is set. The idea was initially for kbuildsycoca but it should be usable e.g. for Strigi too.

I believe this bug report can be closed now thanks to Sebastian Trüg's reworking of the indexing infrastructure. See:
http://websvn.kde.org/?view=revision&revision=1104720
http://websvn.kde.org/?view=revision&revision=1104721

I agree that this can be closed. Can someone please confirm?

(In reply to comment #15)
> I agree that this can be closed. Can someone please confirm?
I don't have KDE installed from trunk, so I cannot confirm whether it works until KDE 4.5 is out. So I'd suggest changing the bug status to resolved and keeping it open until someone has confirmed it works.

Closing, as it can no longer be reproduced since 4.5.
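
The mtime argument in this thread can be illustrated with a small sketch. This is not code from Nepomuk, Strigi, or Amarok. The helper names (`snapshot_dir_mtimes`, `changed_dirs`, `demo`) are hypothetical and only serve the illustration. Under those assumptions it shows that a directory's mtime changes when an entry is added, but not when an existing file is rewritten in place. That is why a scanner that skips directories with unchanged mtimes finds new files yet can miss modified ones.

```python
import os
import tempfile

OLD = 1_000_000_000  # fixed timestamp in the past (seconds since the epoch)


def snapshot_dir_mtimes(root):
    """Map every directory under root to its mtime in nanoseconds."""
    return {d: os.stat(d).st_mtime_ns for d, _subdirs, _files in os.walk(root)}


def changed_dirs(root, previous):
    """Return directories that are new or whose mtime differs from the snapshot."""
    current = snapshot_dir_mtimes(root)
    return sorted(d for d, m in current.items() if previous.get(d) != m)


def demo():
    with tempfile.TemporaryDirectory() as root:
        music = os.path.join(root, "music")
        docs = os.path.join(root, "docs")
        os.mkdir(music)
        os.mkdir(docs)
        note = os.path.join(docs, "note.txt")
        with open(note, "w") as f:
            f.write("v1")

        # Backdate both directories so that any later update is unambiguous,
        # regardless of the filesystem's timestamp granularity.
        for d in (music, docs):
            os.utime(d, (OLD, OLD))
        before = snapshot_dir_mtimes(root)

        # Adding an entry updates the parent directory's mtime.
        with open(os.path.join(music, "new.ogg"), "w") as f:
            f.write("x")
        # Rewriting an existing file in place does not touch the directory.
        with open(note, "w") as f:
            f.write("v2")

        changed = changed_dirs(root, before)
        return music in changed, docs in changed
```

Calling `demo()` reports whether the directory that gained a file and the directory whose file was rewritten show up as changed. Programs that save by writing a temporary file and renaming it over the original do touch the directory, so in practice the directory-level check catches some modifications but not all of them.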