Bug 364114 - moving a folder within one maildir resource is extremely slow and inefficient
Status: CONFIRMED
Alias: None
Product: Akonadi
Classification: Frameworks and Libraries
Component: Maildir Resource
Version: 5.2.0
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
Reported: 2016-06-08 20:04 UTC by Martin Steigerwald
Modified: 2018-01-31 23:06 UTC

Description Martin Steigerwald 2016-06-08 20:04:29 UTC
Today I dared to move .Computer.directory/mesa-dev-ml, with about 86730 mails, to .Linux.directory by drag and drop within KMail. After the move operation completed, KMail crashed.

The operation took at least one and a half hours on a ThinkPad T520 with an Intel Sandy Bridge i5-2520M dual-core CPU, dual SSD BTRFS RAID 1, and 16 GiB of RAM!

Reproducible: Always

Steps to Reproduce:
1. Have a large maildir folder inside another folder.
2. Move it to a different folder within the same resource.


Actual Results:  
1. Akonadi first moves all mails into the database or into file_db_data, depending on the threshold. This step is extremely slow, at a rate of about 20000-40000 mails per half hour. Meanwhile the mysqld database process frequently uses about 90-99% of one logical core. The step takes about 1.5 hours for 86000 mails.

martin@merkaba:~/.local/share/akonadi/file_db_data> date ; find -type f | wc -l
Mi 8. Jun 20:41:59 CEST 2016
30339
martin@merkaba:~/.local/share/akonadi/file_db_data> date ; find -type f | wc -l
Mi 8. Jun 20:42:07 CEST 2016
30473
martin@merkaba:~/.local/share/akonadi/file_db_data> date ; find -type f | wc -l
Mi 8. Jun 20:42:16 CEST 2016
30494
martin@merkaba:~/.local/share/akonadi/file_db_data> date ; find -type f | wc -l
Mi 8. Jun 20:42:32 CEST 2016
30708
martin@merkaba:~/.local/share/akonadi/file_db_data> date ; find -type f | wc -l
Mi 8. Jun 20:42:53 CEST 2016
31131
martin@merkaba:~/.local/share/akonadi/file_db_data> date ; find -type f | wc -l
Mi 8. Jun 20:45:58 CEST 2016
34562
martin@merkaba:~/.local/share/akonadi/file_db_data> date ; find -type f | wc -l
Mi 8. Jun 21:28:28 CEST 2016
67480

martin@merkaba:~/.local/share/akonadi/file_db_data> cd ..
martin@merkaba:~/.local/share/akonadi> date ; find file_db_data -type f | wc -l ; du -sh db_data 
Mi 8. Jun 21:28:58 CEST 2016
68099
4,9G    db_data
martin@merkaba:~/.local/share/akonadi> date ; find file_db_data -type f | wc -l ; du -sh db_data
Mi 8. Jun 21:33:16 CEST 2016
74039
5,0G    db_data
martin@merkaba:~/.local/share/akonadi> date ; find file_db_data -type f | wc -l ; du -sh db_data
Mi 8. Jun 21:45:13 CEST 2016
95529
5,1G    db_data


2. Then, after that move has completed, it copies the mails to the destination folder. This is quite quick. The mails still remain in file_db_data:

martin@merkaba:~/.local/share/akonadi> date ; find file_db_data -type f | wc -l ; du -sh db_data
Mi 8. Jun 21:53:26 CEST 2016
98737
5,1G    db_data

3. KMail crashes.

During the whole operation KMail does not respond to user input. It is completely blocked.
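
For reference, the average rate implied by the file_db_data samples above can be worked out with a short sketch. It uses only the first and the last sample from the transcript and counts only files in file_db_data, so mails stored directly in the database are not included. That makes the result a rough lower bound, not an exact measurement.

```python
from datetime import datetime

# (wall-clock time, file count) pairs copied from the file_db_data samples above.
SAMPLES = [
    ("20:41:59", 30339),
    ("20:42:07", 30473),
    ("20:42:16", 30494),
    ("20:42:32", 30708),
    ("20:42:53", 31131),
    ("20:45:58", 34562),
    ("21:28:28", 67480),
    ("21:28:58", 68099),
    ("21:33:16", 74039),
    ("21:45:13", 95529),
]


def files_per_second(samples):
    """Average growth rate between the first and the last sample."""
    fmt = "%H:%M:%S"
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    elapsed = (datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)).total_seconds()
    return (n1 - n0) / elapsed


rate = files_per_second(SAMPLES)
print(f"{rate:.1f} files/s, about {rate * 1800:.0f} files per half hour")
```

This comes out at roughly 17 files per second, which is in line with the 20000-40000 mails per half hour estimated in step 1.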

Expected Results:  
When moving a folder within a local maildir resource, Akonadi does the following:

1. Rename the folder.

2. Store the folder renaming within the database.

3. KMail remains responsive at all times.

In other words: the operation completes within 10 seconds, no matter how many mails are in the folder. Specifically, Akonadi does not move each mail once and then copy it.
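
To illustrate why a rename is expected to be cheap: on a single filesystem, moving a directory is one rename call, independent of the number of files inside. The following is a minimal sketch, not the maildir resource's actual code. The helper name move_folder and the temporary directory layout are made up for the demonstration, and a real implementation would also have to handle subfolder directories and update the Akonadi database.

```python
import os
import tempfile


def move_folder(src, dst_parent):
    """Move a folder with a single rename; assumes src and dst_parent share a filesystem."""
    dst = os.path.join(dst_parent, os.path.basename(src))
    os.rename(src, dst)  # one directory-entry update; the mail files are neither copied nor rewritten
    return dst


# Demo with throwaway directories whose names mirror the report.
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, ".Computer.directory", "mesa-dev-ml")
    dst_parent = os.path.join(root, ".Linux.directory")
    os.makedirs(os.path.join(src, "cur"))
    os.makedirs(dst_parent)
    for i in range(1000):
        with open(os.path.join(src, "cur", f"mail{i}"), "w") as f:
            f.write("Subject: test\n\nbody\n")

    moved = move_folder(src, dst_parent)
    moved_count = len(os.listdir(os.path.join(moved, "cur")))
    src_gone = not os.path.exists(src)
    print(moved_count, src_gone)
```

All 1000 test files end up under the new parent without any per-file work, which is the behaviour asked for in step 1 of the expected results.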
Comment 1 Martin Steigerwald 2016-06-23 10:56:16 UTC
Today I tried to be clever:

- I stopped Akonadi and waited until it was gone (including the mysqld process).
- Then I manually moved several large folders of LKML mails, one of them with 260000+ mails, within the maildir to another directory inside the very same local folders resource, a directory/folder I use for archival purposes.

But again, KMail is unresponsive when I ask it to display the contents of a mail. It can display the list of mails in a folder, but displaying a mail's content does not work. Meanwhile Akonadi is busy in the background, with about 50-120% CPU usage for the mysqld process and about 120-140 MiB (in words: more than a hundred MiB) of writes to the dual SSD BTRFS RAID 1 every 10 seconds.

So for the simple action of moving a folder it *easily* creates write I/O in excess of several GiB. Sorry, but this is just broken in my eyes.
Comment 2 Martin Steigerwald 2016-06-23 11:00:01 UTC
And yeah, I bet I know what it's doing: it's removing all the stale mail entries from the mysqld database and then adding them again at the new location. Also, KMail still showed the folders at their old location for minutes. Now it doesn't display them there anymore, but at the new location they have no unread mail count; maybe it is indexing those folders now.

Honestly, I'd expect Akonadi to reference the folder a mail is in by some *id*. And if I ask to move the folder elsewhere, it notes in some mail folder table that the folder is elsewhere now and *is* done with it.

Or put another way: as a user I expect a folder move operation to be *instant* or *almost* instant, and to cause *next* to no disk I/O at all.
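
The id-based idea from this comment can be sketched with a toy schema. This is not Akonadi's real database layout: the table and column names (folders, mails, parent_id, folder_id, remote_id) are invented for illustration. The point is only that when mails reference their folder by id, moving the folder touches one row, no matter how many mails it holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folders (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    CREATE TABLE mails   (id INTEGER PRIMARY KEY, folder_id INTEGER, remote_id TEXT);
""")
conn.executemany(
    "INSERT INTO folders VALUES (?, ?, ?)",
    [(1, None, "Computer"), (2, None, "Linux"), (3, 1, "mesa-dev-ml")],
)
conn.executemany(
    "INSERT INTO mails (folder_id, remote_id) VALUES (?, ?)",
    [(3, f"msg{i}") for i in range(1000)],
)


def move_folder(conn, folder_id, new_parent_id):
    """Reparent a folder and return the number of rows that were changed."""
    cur = conn.execute(
        "UPDATE folders SET parent_id = ? WHERE id = ?",
        (new_parent_id, folder_id),
    )
    conn.commit()
    return cur.rowcount


changed = move_folder(conn, 3, 2)
new_parent = conn.execute("SELECT parent_id FROM folders WHERE id = 3").fetchone()[0]
mails_in_folder = conn.execute(
    "SELECT COUNT(*) FROM mails WHERE folder_id = 3"
).fetchone()[0]
print(changed, new_parent, mails_in_folder)  # prints: 1 2 1000
```

The 1000 mail rows are untouched by the move; only the folder's parent reference changes.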
Comment 3 Martin Steigerwald 2016-06-23 11:07:55 UTC
Okay, it took about 10 minutes and several GiB of disk traffic, but it seems to be done now. Folders at their new location do not yet display the number of unread mails, and I expect Akonadi to create even more I/O and CPU usage when I click on a folder there, so for now I just won't and will call it a day.
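
As a rough cross-check of the "several GiB" figure: if the write rate observed in comment 1 (120-140 MiB every 10 seconds) had stayed constant over the roughly 10 minutes mentioned here, the total would be as computed below. The constant rate is an assumption; the report only gives a spot observation.

```python
# Figures from the comments above: 120-140 MiB written per 10 s, for about 10 minutes.
MIB_PER_INTERVAL = (120, 140)
INTERVAL_S = 10
DURATION_S = 10 * 60


def total_gib(mib_per_interval, interval_s=INTERVAL_S, duration_s=DURATION_S):
    """Total data written in GiB if the rate were constant for the whole duration."""
    return mib_per_interval * (duration_s / interval_s) / 1024


low, high = (total_gib(m) for m in MIB_PER_INTERVAL)
print(f"{low:.1f} to {high:.1f} GiB")  # prints: 7.0 to 8.2 GiB
```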
Comment 4 Martin Steigerwald 2016-06-23 11:22:19 UTC
I see how it is challenging to solve this within the current Akonadi design. If it used a folder ID that is stored only in the database, it wouldn't detect manual moves. But it could write a .folderinfo file into each folder containing a unique hash for the folder. It would then store in the database that folder xyz has this internal Akonadi ID and this hash and is currently located at this path.

I wonder whether this changes the consequences of a database loss, but I don't think so. I think it could even provide a path to making filter handling much more robust: if the filter rules store the hash of the folder, the Akonadi mail filter agent can ask Akonadi for the internal database ID of the folder. Thus, as long as the user does not remove the .folderinfo file (or whatever it is called) from the folder, the filter rules would still work after a complete database loss. (Of course that doesn't solve this issue for IMAP accounts, but it would be a start.)

For the individual mail items in the database, Akonadi can still use an internal ID, because on database loss this information is lost as well, so it will have to reindex all the folders anyway.
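
A minimal sketch of the .folderinfo proposal, under stated assumptions: the file name comes from this comment, a random UUID stands in for the "unique hash", and the helpers ensure_folder_id and scan are hypothetical names. Nothing here reflects existing Akonadi code. It shows how a scan can find a folder again after it was moved by hand while the service was stopped.

```python
import os
import tempfile
import uuid

INFO_NAME = ".folderinfo"  # hypothetical marker file name, taken from the comment above


def ensure_folder_id(folder):
    """Return the folder's persistent id, creating the marker file if it is missing."""
    path = os.path.join(folder, INFO_NAME)
    if os.path.exists(path):
        with open(path) as f:
            return f.read().strip()
    folder_id = uuid.uuid4().hex  # random id standing in for the proposed hash
    with open(path, "w") as f:
        f.write(folder_id + "\n")
    return folder_id


def scan(root):
    """Map folder id -> current path for every folder that carries a marker file."""
    found = {}
    for dirpath, _dirs, files in os.walk(root):
        if INFO_NAME in files:
            with open(os.path.join(dirpath, INFO_NAME)) as f:
                found[f.read().strip()] = dirpath
    return found


with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "Computer", "mesa-dev-ml")
    os.makedirs(old)
    os.makedirs(os.path.join(root, "Linux"))
    fid = ensure_folder_id(old)

    new = os.path.join(root, "Linux", "mesa-dev-ml")
    os.rename(old, new)  # manual move while the service is stopped

    relocated = scan(root)[fid] == new
    print(relocated)  # prints: True
```

Because the id travels with the directory, a database entry keyed on that id would only need its stored path updated after such a scan. The mail entries themselves would not have to be removed and added again.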
Comment 5 Martin Steigerwald 2016-07-24 08:35:29 UTC
https://phabricator.kde.org/T630

is somewhat related.
Comment 6 Martin Steigerwald 2017-05-06 09:15:28 UTC
I just confirmed this again with KMail 5.2.3, KDEPIM and Akonadi 16.04.3. It is still moving that btrfs mailing list folder with 50000+ mails around and has been working on that for way more than half an hour. I hope it will eventually complete the job during the day. Then I will move it back using the workaround mentioned in comment #1. A move completes much faster with that workaround, although Akonadi still does way too much needless work there.
Comment 7 Martin Steigerwald 2017-05-06 09:42:17 UTC
Okay, I have now stopped Akonadi after more than one hour of it trying to move that folder around. So far it hasn't moved a single mail file. They are all still here:

martin@merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory> find btrfs-ml -type f | wc -l 
63720

martin@merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory> du -sh btrfs-ml
521M    btrfs-ml

Additionally, it didn't even create the target directory.

This is *just* moving a folder!

So my guess is again that it was still occupied with moving all those mails into the cache, i.e. into the mysqld database or, if the threshold is exceeded, into file_db_data.

Well sure enough it was:

martin@merkaba:~/.local/share/akonadi> find file_db_data -type f | wc -l
46846
martin@merkaba:~/.local/share/akonadi> du -sh file_db_data
462M    file_db_data

I bet some are also in the database:

martin@merkaba:~/.local/share/akonadi> du -sh db_data 
4,2G    db_data

Luckily, after the restart Akonadi forgot that it was supposed to move that large folder. So… I will just fsck and then vacuum the whole thing back to some sanity again.
Comment 8 Martin Steigerwald 2017-05-06 09:51:39 UTC
During akonadictl fsck:

Moved 15029 unreferenced files to lost+found.

Also, it found only about 21000 files in there in total (the output scrolled out of the buffer, but I bet it was 15029 + 6371 = 21400).

On the second run it now reports "Found 6371 external files." and this now seems to be correct:

martin@merkaba:~/.local/share/akonadi> find file_db_data -type f | wc -l
6371

Yet it only moved 15029 files to lost+found:

martin@merkaba:~/.local/share/akonadi> find file_lost+found -type f | wc -l 
15029

I seriously have no idea what it did with the others:

martin@merkaba:~/.local/share/akonadi> find file_db_data -type f | wc -l
46846
(see previous comment #7)

46846 - 15029 = 31817 files, of which 6371 are still in file_db_data, leaving 25446 unaccounted for.

However, as I think they were related to that aborted move operation, I don't mind them being gone. Still, that output is not really trustworthy.
Comment 9 Martin Steigerwald 2018-01-31 23:06:46 UTC
Bug 389638 - When moving imported maildir folders, before they are completely written to disk, they arrive empty, losing all the mails.

may be related.