Bug 417070

Summary: KRunner makes excessive writes during searching.
Product: [Plasma] krunner Reporter: Hakan Bayindir <hakan>
Component: generalAssignee: Alexander Lohnau <alexander.lohnau>
Status: RESOLVED FIXED    
Severity: major CC: alexander.lohnau, f4tmike, hakan, kdeu, nate, sgon00
Priority: NOR    
Version: 5.14.5   
Target Milestone: ---   
Platform: Debian testing   
OS: Linux   
Latest Commit: Version Fixed In: 5.20
Attachments: One day of iotop statistics, running in cumulative mode, sorted by disk write.
Detailed KRunner activity report.

Description Hakan Bayindir 2020-02-02 19:09:57 UTC
Created attachment 125623 [details]
One day of iotop statistics, running in cumulative mode, sorted by disk write.

SUMMARY
KRunner can write ~2.2GB of data to disk in a couple of searches.

STEPS TO REPRODUCE
1. Call KRunner with its shortcut.
2. Write something.

OBSERVED RESULT
KRunner makes a lot of disk calls and writes a lot of data, genrally to .cache folder. This wears down SSD disks pretty fast.

EXPECTED RESULT
A more milder disk access pattern (mostly reads, much less write)

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Debian Testing.
(available in About System)
KDE Plasma Version: 5.14.5
KDE Frameworks Version: 5.62.0
Qt Version: 5.12.5

ADDITIONAL INFORMATION

System I/O is monitored via fatrace and iotop. When running fatrace with -cC krunner --filter=CWD in the home directory of the user, the effects of triggering a search is becomes very visible. Keeping iotop open for a single day in accumulation mode and ordered by disk write reveals pretty shocking write statistics for a couple of KDE processes.
Comment 1 Kai Uwe Broulik 2020-02-03 08:57:42 UTC
Thanks for your analysis.
Can you try to figure out which plugin is causing this? E.g. try disabling all search modules in KRunner settings and then see if you spot a pattern?
I know the Bookmarks runner does something stupid with copying some file over, maybe that's already the culprit.
Comment 2 Hakan Bayindir 2020-02-03 09:23:11 UTC
You're welcome. Of course I can do the analysis. I think I can provide an answer towards the end of the week.
Comment 3 Hakan Bayindir 2020-02-04 20:07:55 UTC
I've done the testing. You're right. Bookmarks is the culprit. Reading and re-caching a lot of favicons and data. Creates ~100MB writes per search. Sometimes even bigger.

I'm attaching a detailed report.

From report:

## TL;DR

- **Bookmarks** plugin is killing your SSD. One search at a time.
- Lock files are written to disk. Why not to RAM?
- Software center may be optimized (if possible).
Comment 4 Hakan Bayindir 2020-02-04 20:08:34 UTC
Created attachment 125673 [details]
Detailed KRunner activity report.
Comment 5 Kai Uwe Broulik 2020-02-05 09:09:50 UTC
Wow, that is a lot more effort than I would have expected! So I was right, the stupid bookmarks runner it is.
I think it copies the favicon SQLite database around to avoid simultaneous access from the browser and the runner.

The config updates you're seeing when hitting return is it just adding the query to the history in the view and then the KRunner library noting that down to score the selected result higher in subsequent queries.
Comment 6 Kai Uwe Broulik 2020-02-05 09:14:31 UTC
Apparently SQLite 3 does allow concurrent connections. We only do a SELECT on the database, no modification, so I believe this copying to a temp file isn't necessary anymore.
Comment 7 Kai Uwe Broulik 2020-02-05 09:43:55 UTC
So yeah, the database file is completely locked and all my queries fail while the browser is open. Bummer :/
Comment 8 Hakan Bayindir 2020-02-05 12:34:22 UTC
I've also tried to access the DB with SQLite browser but, it's not possible as you've said.

Maybe we can raise the issue with Mozilla, so they can release the locking pressure on the .sqlite files.

It looks like Firefox is obtaining an exclusive lock and not giving it away until the end. Can this be also a bug on their end, since the SQLite locking document [0] tells that SQLite actively trying to avoid exclusive locking on its databases?

From the document:

> In order to maximize concurrency, SQLite works to minimize the amount of time that EXCLUSIVE locks are held.

[0]: https://www.sqlite.org/lockingv3.html
Comment 9 Kai Uwe Broulik 2020-02-05 13:44:59 UTC
Chrome excerts the same thing. We can still ask them I guess :)
Comment 10 Alexander Lohnau 2020-05-27 12:09:15 UTC
I already write a patch for this (without knowing of this bug):

https://phabricator.kde.org/D29726.

The there were two problems:
1: The files are deleted/copied for every match session, even if they didn't changed. They have to be copied because they are locked if the browser is open.
2: There was a bug that the prepare/teardown signal get called for each letter typed, this means that the files were copied for each letter typed (BUG 420311). This has also been fixed.
Comment 11 Alexander Lohnau 2020-05-27 13:37:54 UTC
Edit: I just noticed the fix won't be included in Plasma 5.19, because it got merged into master.

The commit is: https://invent.kde.org/plasma/plasma-workspace/-/commit/61b02b16f4205d2d9b212a239fa65fe7710acc30 

@broulik Should I just cherry-pick this on the Plasma/5.19 branch? I am not sure what the best way to handle this is.

Or could you be so nice and do that 😅
Comment 12 Hakan Bayindir 2020-05-31 12:56:20 UTC
Hello Alex,

Thanks for the patches and all the hard work. I have a question.

As far as understood, the locking in SQLite is as follows [0]:

1. In most cases many clients can have locks on a SQLite DB.
2. If a client is preparing to write, all other locks become read-only.
3. To be able to actually write, all other locks shall clear (i.e. other 
clients shall close connection). Otherwise writes hang until all other locks 
clear.

Because of 3., Firefox and Chrome claims exclusive locks on all SQLite DBs on 
startup. Considering the DBs may get extremely big (see my survey about writes 
in this), can a more sophisticated logic can be implemented? It shall work as 
follows:

1. If the DBs are not locked, sync the changes to local Bookmark DBs.
2. If the DBs are locked and KDE integration plugins are available, sync the 
changes over the plugin connection (since plugins can read the bookmarks).

This will further reduce the write load on system SSDs since the delta will be 
much smaller than copying a whole DB over and over.

Since SSDs getting cheaper and TBW values doesn't increase significantly over 
the years, minimizing writes looks like a wise ideal to pursue.

What's your opinion on this?

Regards,

Hakan

[0]: https://www.sqlite.org/lockingv3.html

On 27 Mayıs 2020 Çarşamba 15:09:15 +03 you wrote:
> https://bugs.kde.org/show_bug.cgi?id=417070
> 
> Alex <alexander.lohnau@gmx.de> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
> CC|                            |alexander.lohnau@gmx.de Resolution|---     
>                    |FIXED
>              Status|CONFIRMED                   |RESOLVED
> 
> --- Comment #10 from Alex <alexander.lohnau@gmx.de> ---
> I already write a patch for this (without knowing of this bug):
> 
> https://phabricator.kde.org/D29726.
> 
> The there were two problems:
> 1: The files are deleted/copied for every match session, even if they didn't
> changed. They have to be copied because they are locked if the browser is
> open. 2: There was a bug that the prepare/teardown signal get called for
> each letter typed, this means that the files were copied for each letter
> typed (BUG 420311). This has also been fixed.
Comment 13 Alexander Lohnau 2020-05-31 18:19:47 UTC
This sounds good!

I currently have another patch waiting to be reviewed which improves the overall API design of the runner plugin.

Another things that I would implement (brainstorming):
- The implementation if the setup/teardown methods doesn't make that much sense and is quite wasteful

- Reuse the database connections as long as the file isn't changed (depends issue above being improved)

- Some better mapping of the url to the favicons, if you have 3 bookmarks of the same domain with the same icon the icon is only once as a blob in database. But if you query the runner the icon will get written to disk for each entry.

- If the mapping between url and icon works well, maybe the favicon database doesn't need to be copied. We could just read it (for instance at startup when the browser isn't running) and create the icon files. This is similar to what you suggested.
Then we could also implement a logic which goes like: The already cached favicon files are used, as long as there is a favicon for every requested bookmark. Otherwise we will update the cached files.
This way don't copy the database if no favicon of bookmarks is affected (if you visit a website and you go to your browser history you will get the favicon displayed, but we don't need the favicon, if the link is not bookmarked).

- Also reusing the results from a previous query, if we already searched for `kru` and continue to type `krunner` letter for letter we don't need to make sql queryies for each letter typed.

This will reduce the disk write operations to a minimum and will improve the performance.
These are just my general ideas in regards to the bookmarks runner and they are going further that this bugfix. But because you also had good ideas on how to further improve this I thought it is worth mentioning^^.

And considering that this plugin is also enabled by default in the application launcher it seems worth to optimize it further :-).
Comment 14 Alexander Lohnau 2020-06-25 13:19:36 UTC
Git commit edc64d04a1e569d7032c41e6ee0ebf59833c26f2 by Alexander Lohnau.
Committed on 25/06/2020 at 13:18.
Pushed by alex into branch 'Plasma/5.19'.

Fix broken ENV variables for detailed settings
Related: bug 176650, bug 327757, bug 176650

M  +21   -10   startkde/startplasma.cpp

https://invent.kde.org/plasma/plasma-workspace/commit/edc64d04a1e569d7032c41e6ee0ebf59833c26f2
Comment 15 Alexander Lohnau 2020-07-13 13:30:22 UTC
Git commit 7efea66e1e059c6ce54435ae3832716e209b334f by Alexander Lohnau.
Committed on 12/07/2020 at 07:15.
Pushed by alex into branch 'master'.

Fix single runner mode

We have to pass the m_runner parameter to the runner manager,
otherwise it defaults to an empty string.

The m_runner is already correctly when we
activate the single runner mode using dbus.

M  +1    -1    lib/runnerresultsmodel.cpp

https://invent.kde.org/plasma/milou/commit/7efea66e1e059c6ce54435ae3832716e209b334f
Comment 16 Nate Graham 2020-07-15 19:33:54 UTC
We had to revert the commit that fixed this as it caused Bug 423995. We'll work on finding another way.
Comment 17 Lukas Ba. 2020-08-26 23:45:29 UTC
Does this issue also apply to the startmenu? You can also search and find bookmarks in the startmenu.
Comment 18 Alexander Lohnau 2020-08-26 23:48:31 UTC
(In reply to Lukas Ba. from comment #17)
> Does this issue also apply to the startmenu? You can also search and find
> bookmarks in the startmenu.

Oh, sorry just realized that I had pasted a wrong bugnumber once and thus this one got reopened. This is actually fixed and I recently made another improvement to the caching.

But you are absolutely right regarding the startmenu :-)
Comment 19 Lukas Ba. 2020-08-27 00:07:44 UTC
Is this *fixed* in the startmenu as well?
Comment 20 Nate Graham 2020-08-27 03:03:07 UTC
Yes; they both use the same KRunner runner.
Comment 21 sgon00 2020-08-28 02:25:23 UTC
Can anyone tell me which exact version of krunner has this bug? Thanks a lot.