Bug 341884 - dozens of duplicate mails in ~/.local/share/akonadi/file_db_data
Status: RESOLVED FIXED
Alias: None
Product: Akonadi
Classification: Frameworks and Libraries
Component: general
Version: 1.13.0
Platform: Kubuntu Linux
Importance: NOR major with 40 votes
Target Milestone: ---
Assignee: kdepim bugs
Duplicates: 344937
Reported: 2014-12-14 16:28 UTC by m.eik michalke
Modified: 2016-01-24 20:50 UTC
CC List: 8 users

Description m.eik michalke 2014-12-14 16:28:54 UTC
i noticed that in my ~/.local/share/akonadi/file_db_data folder there were more than 150000 (150k) mails. a lot of them were duplicates of the same mail, sometimes 70 copies and more: they all had the same base name/number and ended in _r0 to _r7x, and diff revealed they are all identical copies.

Reproducible: Always

Steps to Reproduce:
it's a standard kubuntu setup of kmail, using the PPA backports packages (4.14.2) with two IMAP resources.

Actual Results:  
~/.local/share/akonadi/file_db_data is crowded with duplicate mail copies.

Expected Results:  
i get it that ~/.local/share/akonadi/file_db_data should just be a cache, but akonadi should not copy the same mail over and over again.

if akonadi is aware of some mail already being present (the appended numbering demonstrates that), it could simply use the existing copy for whatever it is doing, or at least clean up afterwards. this is an immense waste of disk space. whatever the circumstances, this clearly shouldn't happen.
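
The naming pattern described above (one base name, with suffixes counting up from _r0) can be tallied with a short script. This is only a minimal Python sketch, assuming every payload file is named "<base>_r<revision>"; the helper names `count_revisions` and `bases_with_multiple_revisions` are hypothetical and not part of Akonadi.

```python
import os
import re
from collections import defaultdict

# Assumption: payload files are named "<base>_r<revision>", e.g. "12345_r0".
REVISION_RE = re.compile(r"^(?P<base>.+)_r(?P<rev>\d+)$")
DEFAULT_DIR = os.path.expanduser("~/.local/share/akonadi/file_db_data")


def count_revisions(filenames):
    """Map each base name to the sorted list of revision numbers seen."""
    revisions = defaultdict(list)
    for name in filenames:
        match = REVISION_RE.match(name)
        if match:  # names that do not follow the pattern are ignored
            revisions[match.group("base")].append(int(match.group("rev")))
    return {base: sorted(revs) for base, revs in revisions.items()}


def bases_with_multiple_revisions(directory):
    """Return only the base names that have more than one revision on disk."""
    counts = count_revisions(os.listdir(directory))
    return {base: revs for base, revs in counts.items() if len(revs) > 1}


if __name__ == "__main__" and os.path.isdir(DEFAULT_DIR):
    for base, revs in sorted(bases_with_multiple_revisions(DEFAULT_DIR).items()):
        print(f"{base}: {len(revs)} revisions (r{revs[0]}..r{revs[-1]})")
```

The script only reads file names; it does not compare contents, so it shows how many revisions exist per base name rather than proving they are identical.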
Comment 1 m.eik michalke 2014-12-14 22:57:42 UTC
i manually called "akonadictl fsck". after some hours, it had moved over 110k of those duplicated mails to a new folder called "file_lost+found". the total disk usage for this unused stuff is 14.5GB. i call that room for improvement ;-)
Comment 2 Martin Steigerwald 2015-01-21 09:42:10 UTC
Dan wrote several times that you can safely remove things in file_lost+found. If you are unsure, maybe wait for a confirmation from him. On an Ext3/4 filesystem I would check its lost+found folder before removing things in there, but it seems that for Akonadi it may be different.

I think Akonadi shouldn't be in need of a lost+found folder, as it is supposed to be "just" a cache, but maybe that's something for Akonadi Next.
Comment 3 Martin Steigerwald 2015-01-21 09:45:32 UTC
Okay, I do have some duplicates, but not all that many it seems. Then again, I ran akonadictl fsck just half an hour ago:

ms@merkaba:~/.local/share/akonadi/file_db_data> fdupes -m .
1736 duplicate files (in 1729 sets), occupying 8.6 megabytes

Can you install fdupes (same package name on Debian) and check on your file_db_data?

Thanks.
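
Where fdupes is not available, a roughly comparable summary can be computed by hashing file contents. This is a minimal Python sketch, not a reimplementation of fdupes: the function names are hypothetical, only regular files directly inside the directory are examined, and files are treated as duplicates when their size and SHA-256 digest match.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def summarize_duplicates(directory):
    """Return (duplicate_files, duplicate_sets, wasted_bytes) for one directory."""
    groups = defaultdict(list)
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            size = entry.stat(follow_symlinks=False).st_size
            groups[(size, file_digest(entry.path))].append(entry.path)

    duplicate_files = duplicate_sets = wasted_bytes = 0
    for (size, _digest), paths in groups.items():
        extra = len(paths) - 1  # every copy beyond the first is redundant
        if extra > 0:
            duplicate_sets += 1
            duplicate_files += extra
            wasted_bytes += extra * size
    return duplicate_files, duplicate_sets, wasted_bytes
```

Hashing every file is slower than necessary on a large cache; grouping by size first and hashing only files that share a size would be an obvious refinement.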
Comment 4 m.eik michalke 2015-01-21 10:59:07 UTC
thanks for the notice, i already cleared the lost+found folder right away then.

here's my duplicate statistics:

m@wurst ...ocal/share/akonadi/file_db_data $ fdupes -m .
6018 duplicate files (in 3955 sets), occupying 685.6 megabytes

cleaning up:

m@wurst ...ocal/share/akonadi/file_db_data $ akonadictl fsck
m@wurst ...ocal/share/akonadi/file_db_data $ fdupes -m .
307 duplicate files (in 291 sets), occupying 5.8 megabytes

so this is about one CD-ROM worth of junk in one month for me (since i last cleaned up a month ago and removed the 14.5GB of duplicates which had accumulated). the "akonadictl fsck" went really quick this time, the resulting new lost+found folder contained 762MB in 9155 files.
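
Figures such as "762MB in 9155 files" can be gathered with a few lines of Python. This is a minimal sketch with hypothetical helper names; it sums apparent file sizes, so the result may differ somewhat from what du reports, since du counts allocated disk blocks.

```python
import os


def directory_usage(directory):
    """Return (file_count, total_bytes) for all regular files below `directory`."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so that only real files in this tree are counted.
            if os.path.isfile(path) and not os.path.islink(path):
                file_count += 1
                total_bytes += os.path.getsize(path)
    return file_count, total_bytes


def format_megabytes(total_bytes):
    """Format a byte count in megabytes (10**6 bytes) with one decimal place."""
    return f"{total_bytes / 1_000_000:.1f} MB"
```

Running `directory_usage` on file_lost+found before deleting it gives a record of how much had accumulated since the last cleanup.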
Comment 5 Martin Steigerwald 2015-01-21 11:27:00 UTC
Wow! Okay, I think that's a clear bug.

1) There are duplicates at all.

2) Akonadi does not clean up after itself.
Comment 6 Bernhard Jungk 2015-05-26 22:32:09 UTC
This problem is becoming unbearable for me. :-(

~/.local/share/akonadi/file_db_data$ fdupes -m .
84860 duplicate files (in 3857 sets), occupying 13738.0 megabytes
Comment 7 Martin Steigerwald 2015-05-27 09:25:40 UTC
Bernhard, did you try anything to mitigate or work around the issue? I suggest at least a run of "akonadictl fsck" as I show below. You may also try setting SizeThreshold=32768 to mitigate the duplicates issue at least a bit. I am not 100% positive that it helps, but I think it does, because it causes payloads smaller than 32 KiB to go into the database and thus will in general reduce the number of files in file_db_data. By this I think it will reduce the chance of duplicates being created by a confused Akonadi server.

I thought I had added it to this bug report, but I think there are at least two open bugs regarding file_db_data.

Ah, I mentioned in the other bug:

https://bugs.kde.org/show_bug.cgi?id=338402#c9
https://bugs.kde.org/show_bug.cgi?id=338402#c11

(These are really two different issues: one is about duplicates, the other is about the hugeness of that cache and probably also leftover files or just too long a cache hold duration, since akonadictl fsck often moves lots of the files to file_lost+found)


On my work account, a huge Exchange IMAP account:

I still do have duplicates, but not nearly as many after setting SizeThreshold=32768 in [%General] of .config/akonadiserverrc
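
The configuration change mentioned above can also be applied by script. This is a minimal Python sketch that edits the INI text directly; the function `set_ini_value` is hypothetical, and the key spelling `SizeThreshold` used in the usage note is an assumption about the name Akonadi actually reads, so check it against your own akonadiserverrc before relying on it.

```python
def set_ini_value(text, section, key, value):
    """Return `text` with `key=value` set inside `[section]`.

    An existing assignment is replaced; a missing key is added right below
    the section header; a missing section is appended at the end.
    """
    header = f"[{section}]"
    new_line = f"{key}={value}"
    lines = text.splitlines()

    header_index = next(
        (i for i, line in enumerate(lines) if line.strip() == header), None
    )
    if header_index is None:
        if lines and lines[-1].strip():
            lines.append("")  # keep a blank line before the new section
        lines += [header, new_line]
        return "\n".join(lines) + "\n"

    start = header_index + 1
    # The section ends at the next "[...]" header or at the end of the file.
    end = next(
        (i for i in range(start, len(lines)) if lines[i].strip().startswith("[")),
        len(lines),
    )
    for i in range(start, end):
        if lines[i].split("=", 1)[0].strip() == key:
            lines[i] = new_line  # replace the existing assignment
            break
    else:
        lines.insert(start, new_line)  # key missing: add it below the header
    return "\n".join(lines) + "\n"
```

For example, `set_ini_value(old_text, "%General", "SizeThreshold", 32768)` returns the new file content, which can then be written back after making a backup copy of the original file.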

ms@merkaba:~/.local/share/akonadi/file_db_data> fdupes -m .
3753 duplicate files (in 1010 sets), occupying 2495.1 megabytes

Wow, but still, that is 2,4 GiB of larger files then.

Total figures:

ms@merkaba:~/.local/share/akonadi> du -sh *db_data
9,1G    db_data
2,9G    file_db_data



On my private account (POP3 and a smaller IMAP) things seem sane:

martin@merkaba:~/.local/share/akonadi> du -sh *db_data
2,7G    db_data
3,5M    file_db_data

ms@merkaba:~/.local/share/akonadi/file_lost+found> fdupes -m .
2916 duplicate files (in 686 sets), occupying 2206.8 megabytes

Now

ms@merkaba:~/.local/share/akonadi> rm -r file_lost+found 
ms@merkaba:~/.local/share/akonadi>

and more free space. :)

Daniel has said several times that it is safe to remove file_lost+found.


Let's fsck the work one. Ok, there we go:

ms@merkaba:~/.local/share/akonadi/file_db_data> fdupes -m .
60 duplicate files (in 52 sets), occupying 8.9 megabytes


See also:
https://lists.debian.org/debian-kde/2015/01/msg00055.html
Comment 8 Martin Steigerwald 2015-05-27 09:27:02 UTC
The akonadictl fsck should go before the size analysis and the rm -r file_lost+found, sorry.
Comment 9 Martin Steigerwald 2015-05-27 09:29:16 UTC
I set this to confirmed and to major importance, as wasting 13 GiB IMHO is major.
Comment 10 Bernhard Jungk 2015-05-27 18:07:02 UTC
Thanks for the advice, indeed it seems to be better now. I'll keep an eye on that folder for a while.
Comment 11 Daniel Vrátil 2015-06-29 21:08:28 UTC
commit 9c0dc6b3f0826d32eac310b2e7ecd858ca3df681
Author: Dan Vrátil <dvratil@redhat.com>
Date:   Mon Jun 29 22:45:11 2015 +0200

    Don't leak old external payload files

    Actually delete old payload files after we increase the payload revision or
    switch from external to internal payload. This caused ~/.local/share/akonadi/file_db_data
    to grow insanely for all users, leaving them with many duplicated files (just with
    different revisions).

    It is recommended that users run akonadictl fsck to clean up the leaked payload
    files.

    Note that there won't be any more releases of Akonadi 1.13 (and this has been
    fixed in master already), so I strongly recommend distributions to pick this
    patch into their packaging.
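
The leak described in the commit message can be illustrated with a toy model. This is a hypothetical Python sketch and not Akonadi's actual C++ code: the class `PayloadStore`, its methods, the `delete_old` switch, and the "<id>_r<revision>" file naming are assumptions made for illustration, and only the revision bump is modelled, not the switch from external to internal payload.

```python
import os


class PayloadStore:
    """Toy model of external payload files named "<part_id>_r<revision>"."""

    def __init__(self, directory):
        self.directory = directory
        self.revisions = {}  # part_id -> current revision number

    def _path(self, part_id, revision):
        return os.path.join(self.directory, f"{part_id}_r{revision}")

    def write(self, part_id, data, delete_old=True):
        """Store `data` as a new revision; optionally remove the previous file."""
        old_revision = self.revisions.get(part_id)
        new_revision = 0 if old_revision is None else old_revision + 1
        with open(self._path(part_id, new_revision), "wb") as handle:
            handle.write(data)
        self.revisions[part_id] = new_revision
        # Skipping this step (delete_old=False) leaves every older revision
        # on disk, which is the kind of leak the commit message describes.
        if delete_old and old_revision is not None:
            os.remove(self._path(part_id, old_revision))
        return new_revision
```

Writing the same payload three times with `delete_old=False` leaves three files (_r0, _r1, _r2) behind, while the default leaves only the latest revision.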
Comment 12 Martin Steigerwald 2015-06-30 07:10:26 UTC
Thanks a lot, Daniel. I notified Debian Qt/KDE team about it.
Comment 13 graham 2015-08-18 15:49:14 UTC
Daniel, *THANK YOU* for fixing this.

I only _just_ installed the fix on my Fedora-21 machine, and after running "akonadictl fsck" saw that it moved out just shy of 90k files from my "file_db_data" directory (leaving me with just under 5k left).  And that's just what's accumulated since I last ran "akonadictl fsck" back on June 30th.
Comment 14 Scarlett Moore 2015-10-08 18:07:05 UTC
I am working on the *ubuntu family packaging for this patch today.
Scarlett
Comment 15 Daniel Vrátil 2016-01-24 20:50:50 UTC
*** Bug 344937 has been marked as a duplicate of this bug. ***