Bug 384880 - Dirwatcher inefficiency when using QFileSystemWatcher
Summary: Dirwatcher inefficiency when using QFileSystemWatcher
Status: VERIFIED UNMAINTAINED
Alias: None
Product: kdevplatform
Classification: Developer tools
Component: filemanager (show other bugs)
Version: unspecified
Platform: Compiled Sources All
: NOR normal
Target Milestone: ---
Assignee: kdevelop-bugs-null
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-20 12:46 UTC by RJVB
Modified: 2018-04-16 10:13 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
profiling overview on Mac, dirwatcher creation (148.43 KB, image/png)
2017-09-20 14:18 UTC, RJVB
Details
profiling overview on Mac, dirwatcher destruction (225.97 KB, image/png)
2017-09-20 14:19 UTC, RJVB
Details
on-demand set-up of dir-only dirwatching (15.20 KB, patch)
2017-09-25 18:33 UTC, RJVB
Details
on-demand set-up of dir-only dirwatching (18.54 KB, patch)
2017-09-25 19:02 UTC, RJVB
Details

Note You need to log in before you can comment on or make changes to this bug.
Description RJVB 2017-09-20 12:46:28 UTC
Creating dirwatcher is largely negligible in terms of cost on Linux when inotify is (can be used).

For instance, when importing a pre-existing project for the GCC 7.2.0 source tree on Linux (1.6Ghz N3150 CPU, ZFS on an SSHD), with dirwatcher creation deferred to after the initial project load and full-project parsing off:

kdevplatform.shell: Starting project import timer
kdevplatform.shell: All projects imported in 17.948 seconds
kdevplatform.filemanager: Project dir monitoring is enabled for project "gcc-7.2.0-MP"
kdevplatform.shell:     creating dir watchers and/or starting parse jobs took an additional 1.189 seconds
kdevplatform.filemanager: Deleting dir watcher took 0.024 seconds for project "gcc-7.2.0-MP"
31.069 user_cpu 5.016 kernel_cpu 0:34.10 total_time 105.7%CPU

When forcing the use of QFileSystemWatcher (using KDIRWATCH_METHOD=QFSWatch), those figures become

kdevplatform.shell: Starting project import timer
kdevplatform.shell: All projects imported in 17.701 seconds
kdevplatform.filemanager: Project dir monitoring is enabled for project "gcc-7.2.0-MP"
kdevplatform.shell:     creating dir watchers and/or starting parse jobs took an additional 179.013 seconds
kdevplatform.filemanager: Deleting dir watcher took 256.16 seconds for project "gcc-7.2.0-MP"
470.119 user_cpu 19.310 kernel_cpu 7:39.48 total_time 106.5%CPU

(On Mac OS X with a 2.7Ghz i7 and HFS on a regular 7200RPM HDD, creation:destruction take roughly 85:176 seconds)

These long wait times are spent entirely in KDirWatch method, blocking the thread's event loop (and the UI in this case) and running at 100%CPU (on Mac and Linux alike).

Deferring dirwatcher creation means the host OS has had opportunity to cache the directory structure, and indeed profiling the operation on OS X shows that the time is mostly spent in memory allocation and string comparison.

I think the extent of the overhead is due to an inefficiency in the way dirwatching is implemented. A project nothing but a directory which will usually contain other directories. Rather than doing actual dirwatching (watching the project directory for changes), KDevelop installs watchers on each and every item in the project tree. Notifications are only used for keeping the tree in the project toolview uptodate. The KDirWatch::created signal can lead to emitting a fileAdded signal, but that KDirWatch signal (quote) " is not emitted when creating a file is created in a watched directory". IOW and IIUC that signal is thus only emitted when a watch was added for a non-existent file, which KDevelop doesn't do.

All this to say that I think it should be possible to rewrite this implementation to watch only project directories for change, and invoke `eventuallyReadFolder()` when a change is detected. This is an automated version of how I have always worked with KDevelop on Mac (where dirwatching doesn't work), and I don't have the impression that I'm missing any features because of it.

Adding only project directories to the project dirwatcher will 1) speed up its creation on all platforms and 2) will help avoid file descriptor depletion on BSD variants (the kqueue-based QFSW backend uses 2 filedescriptors per watched item, if memory serves me well, and depletion is not just a theoretical possibility as I've observed with Qt4/KDevelop4 on Mac - OS X is a BSD variant too). And finally 3) it should be possible to make the non-recursive KDirWatch::addDir() call in eventuallyReadFolder or even in the FileManagerListJob instance it creates. That could improve efficiency even more by reducing the number of directory traversals.

NB: change detection in open documents does not depend on project dirwatching so will not be affected by the proposed change.
Comment 1 Milian Wolff 2017-09-20 13:29:33 UTC
Thanks for creating this report so we can track this properly.

Some notes:

a)
- kdev should only watch directories, CMakeLists.txt files and open files for changes
- the CMakeLists.txt used to be required, now that we use the cmake server it may not be required anymore, we'll have to check that again.
- so if you say all files are watched then this needs to be fixed - it shouldn't be the case

b)
- if using QFSWatch shows hotspots in memory allocation and string comparison, then those should be fixed in QFileSystemWatcher upstream. Writing a separate benchmark (standalone from kdevelop) for that purpose would help a lot. I imagine KDirWatch would contain a benchmark, if not, then one should be added that resembles the workload you are seeing. Similarly, such a benchmark would be helpful upstream in Qt itself

So I think a) is the actual bug you have uncovered and that needs to be fixed. b) can then also be done on top of that, to improve the situation even further
Comment 2 RJVB 2017-09-20 14:16:52 UTC
If CMakeLists.txt files are to be monitored that should be an additional action taken by the CMake project manager, not by a generic file manager. And that was once necessary it should not be removed until support for older cmake versions is officially and completely removed. (Which in turn shouldn't happen until cmake server mode works as reliably and efficiently as the json-based mode, IMHO).

The venom is in this call:

d->m_watchers[project]->addDir(project->path().toLocalFile(), KDirWatch::WatchSubDirs | KDirWatch:: WatchFiles );

Removing the KDirWatch::WatchFiles should already be an improvement, but I'd hope that doing the addDir() only for folders that are actually being loaded might at the same time avoid putting a monitor on excluded folders. Assuming that's OK of course.

> b)
Maybe. I've queried the Qt ML for feedback on this, and for the existence of a dedicated benchmark but I'm going to go with the null hypothesis that the QFSW API isn't new and has had ample opportunity for fine-tuning by Qt experts. We're just using it in a very intensive way here.

I think that the GCC source tree contains almost 8000 files in I don't know how many directories. It doesn't strike me as odd that memory allocation and string comparisons show up as hotspots when creating or deleting that many QFSW entries involve that kind of operation. I'm a bit surprised that "only" 8000 or so allocations and string comparisons can be that expensive in terms of processing time but not incredulous. It's really not something I want to dig into at the moment.
Comment 3 RJVB 2017-09-20 14:18:45 UTC
Created attachment 107915 [details]
profiling overview on Mac, dirwatcher creation
Comment 4 RJVB 2017-09-20 14:19:45 UTC
Created attachment 107916 [details]
profiling overview on Mac, dirwatcher destruction
Comment 5 Milian Wolff 2017-09-20 15:00:19 UTC
a) if my memory serves me right (I've written this a long time ago), I think the WatchFiles there may have been required to get notified about file deletion events. Anyhow, I agree that this needs to be improved somehow. Maybe this isn't needed anymore, maybe it was never required in the first place. Writing better tests for this would be the first way to fix this.

b) You said:

"I'm going to go with the null hypothesis that the QFSW API isn't new and has had ample opportunity for fine-tuning by Qt experts. We're just using it in a very intensive way here."

From my long time experience working with Qt, this shouldn't be your null hypothesis. I've found and fixed quite a few glaring performance issues in Qt over the years, and most simply arose when you pushed a bit more load at the public API in ways that apparently noone did so far. Maybe we do that here due to the misconfiguration in a). Nevertheless ruling out performance optimizations upstream is a very bad idea at this stage.

Just looking at the profile reports, QStringList pops up - a very bad idea if that's getting queries in O(N) all the time. Using a QSet or sorted vector could already dramatically improve the runtime performance.
Comment 6 RJVB 2017-09-20 15:27:55 UTC
I see that KDirWatch was first written in '98. If you wrote this code that long ago too, then it's probably past due that someone looks at it...

And the same can evidently be said about "well-preserved" APIs in Qt. Funny, really, in evolutionary biology that term is mostly seen as something positive :)
Comment 7 Milian Wolff 2017-09-21 09:45:19 UTC
I didn't mean I write KDirWatch, only the usage of that API in KDevelop is mostly my code nowadays. Still, I think I wrote that 5-7 years ago or so, while it was still only part of the generic manager.
Comment 8 RJVB 2017-09-21 10:43:09 UTC
Either way, KDirWatch *is* that old. Most code in it was last touched during a Jenkins commit in 2013 (transition to KF5?). Some work has been done on it recently but still it too checks each and every entry being added against a list of already added entries (and its contains() implementation looks like it might not be optimised at all). Optimising this code is closer to home than QFSW even if the main cost is clearly with QFSW.

Anyway I've been doing some tinkering. Curiously it seems that removing the KDirWatch::WatchFiles flag from the addDir() call *increases* the time spent in that method when the QFSW method is used (and by no small amount).

OTOH, using KDirWatch::WatchDirOnly (in eventuallyLoadDirectory()) and the dirty instead of the created signal gives a dirwatcher that seems to work as it should (for the toplevel folder and any folders I reload manually). It even works on Mac.

I found an unsent email in which I intended to ask if deferring feeding the dirwatcher to the FileManagerListJob class would be an acceptable approach. That's the class that apparently does the actual project loading, and I'd thus expect it to be the logical place to add individual project directories, on a backgrond thread and with minimal event loop blocking.
Comment 9 RJVB 2017-09-25 18:33:54 UTC
Created attachment 108017 [details]
on-demand set-up of dir-only dirwatching

This is a PoC implementation of my idea of deferring the dirwatcher feeding to the worker that does the actual project directory traversal (the FileManagerListJob class), for now coupled with the possibility to disable dirwatching for my own convenience.

Initial tests show that
- big project import is a *lot* faster
- dirwatching actually works on Mac.

I've used a QSet to avoid repetitive calls to KDirWatch::contains(). I haven't timed it, but judging from the latter's implementation the former should be considerably faster. 

I hope I haven't missed a place or two where entries have to be removed from the dirwatcher.

Let me know if this is mature enough to phab it?
Comment 10 RJVB 2017-09-25 19:02:17 UTC
Created attachment 108020 [details]
on-demand set-up of dir-only dirwatching
Comment 11 Milian Wolff 2017-09-25 20:37:13 UTC
If you want to have it reviewed, put it up on phabricator - attaching it here isn't helping anyone and just wastes our time.

Regarding QSet (without reading the code): Why not push that into KDirWatch to get the efficient lookup everywhere, not just in KDevelop? As I said, please upstream everything possible.
Comment 12 RJVB 2017-09-25 23:53:35 UTC
> Why not push that into KDirWatch to get the efficient lookup everywhere, not just in KDevelop?

And then what, let everyone who doesn't have the latest frameworks version stick with the non-optimised version?
Comment 13 Sven Brauch 2017-09-26 08:19:21 UTC
Yes. That is no sufficient reason to go with the technically inferior version in this case.
Comment 14 Milian Wolff 2017-09-26 08:31:09 UTC
Yes.
Comment 15 RJVB 2017-09-26 09:01:07 UTC
FWIW, none of my former employers (national scientific research institutions) would buy software from vendors with that kind of attitude.

I'll propose the approach upstream and probably the lock too. But I'm not going to allow myself to get side-tracked in a lengthy and costly review and optimisation process of a utility class that may well only benefit of it in the very specific case of working with big projects in KDevelop.
Comment 16 Sven Brauch 2017-09-26 09:02:26 UTC
Well, we do distribute the AppImage for a reason.
Comment 17 RJVB 2017-11-25 14:58:49 UTC
I think that the import timings obtained with the recent abstractfilemanagerpluginbenchmark show that there is a problem with the current dirwatcher implementation when it's not using the INotify backend, in addition to the fact that watching all project entries is inappropriate in that case.

Limitations on other platforms suggest that only project directories and their subdirectories should be watched and D7995 has shown that one can make a feature-complete dirwatching implementation based on that principle that also removes all risk of a UI lockup.

Anyone picking this up should however be aware of #387238 which will go from a hypothetical to a real potential issue.