Bug 344937

Summary: Akonadi keeps multiple revisions of the same message in .local/share/akonadi/file_db_data
Product: [Frameworks and Libraries] Akonadi Reporter: Jan K <jprofesorek>
Component: IMAP resource Assignee: Christian Mollekopf <chrigi_1>
Status: RESOLVED DUPLICATE    
Severity: normal CC: dvratil, farengi, glua, jprofesorek, kdepim-bugs, mail, vkrause
Priority: NOR    
Version: 1.13.0   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
Latest Commit: Version Fixed In:

Description Jan K 2015-03-07 11:47:39 UTC
For some reason, in .local/share/akonadi/file_db_data I can see multiple copies of the same message, e.g.: 1983_r0 1983_r1 1983_r10 1983_r11 1983_r12 1983_r13 1983_r14 1983_r15 1983_r2 1983_r3 1983_r4 1983_r5 1983_r6 1983_r7 1983_r8 1983_r9
The files are identical (at least according to cmp) and are ordinary, separate files (not links).

This caused the akonadi share folder to grow to some 7G!

I have no idea what caused this. I did some not-so-wise things to diagnose it, and I still have no idea what causes it.

I do have (and want to keep) "keep local copies" checked in my IMAP config dialog, but I want one local copy of each message, not 16.

To work around this, I did:
===========
cd $HOME/.local/share/akonadi/file_db_data
ls -1 > /tmp/filellist
pushd /tmp
sort -n filellist | sed 's/_.*//' | uniq -c | awk '$1 > 1 {print $2}' > dups
popd
for x in `cat /tmp/dups`; do read f n <<< `echo ${x}_r*`; echo "$f : $n"; for y in `echo $n`; do rm $y; ln $f $y; done; done
===========
This helped me to go from 
# /dev/mapper/vg_ssd-lv_jasiu         9,8G  9,3G  475M  96% /home
to
# /dev/mapper/vg_ssd-lv_jasiu         9,8G  2,7G  7,1G  28% /home

Does anyone have an idea why I ended up with multiple copies? Does this happen to everyone, or just to me?


Reproducible: Always
Comment 1 Harald Frießnegger 2015-04-07 16:30:31 UTC
same problem here.
the total size of all _r0 files (`du -hsc *_r0`) is 7.4GB.
the total size of all files in `~/.local/share/akonadi/file_db_data` is 25GB.
this means 17.6GB of useless? data on my disk.
i did not check that the revisions of all files are the same, but a random test on 4 different files showed an empty diff for all of the revisions.
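
A minimal sketch for checking every item rather than a random sample, assuming the file names follow the <id>_r<N> pattern seen in this report. The function name find_differing_revisions is hypothetical; it prints each ID whose revision files are not all byte-identical according to cmp, and it does not modify anything.

```shell
# Hypothetical helper: print the IDs whose revision files are NOT all identical.
# Assumes file names of the form <id>_r<N> with a numeric <id>.
find_differing_revisions() {
    dir=$1
    # Distinct item IDs: the part of each name before "_r<N>".
    for id in $(ls "$dir" | sed -n 's/_r[0-9][0-9]*$//p' | sort -u); do
        first=
        for f in "$dir/${id}"_r*; do
            if [ -z "$first" ]; then
                first=$f                    # reference copy for this ID
            elif ! cmp -s "$first" "$f"; then
                echo "$id"                  # at least one revision differs
                break
            fi
        done
    done
}
```

Usage: find_differing_revisions ~/.local/share/akonadi/file_db_data
Empty output means every revision matched the other revisions of the same ID.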

running kmail 4:4.14.2-0ubuntu1~ubuntu14.10~ppa2
Comment 2 Harald Frießnegger 2015-04-15 12:09:30 UTC
i'm constantly running out of disk space on my ssd because of this issue.

looking at the file_db_data folder today (one week later), it has already grown to 27GB.
all _r0 files together are still at 7.5GB

are there any files that can be manually deleted as long as this bug is not solved?

which one of these could be deleted for example:

404860 Feb 11 11:30 730406_r0
404860 Feb 16 12:20 730406_r1
404860 Mär 18 09:40 730406_r10
404860 Mär 18 13:50 730406_r11
404860 Mär 19 09:22 730406_r12
404860 Mär 23 18:09 730406_r13
404860 Mär 28 00:45 730406_r14
404860 Apr  1 21:22 730406_r15
404860 Apr  2 10:44 730406_r16
404860 Apr  7 16:14 730406_r17
404860 Apr  8 10:46 730406_r18
404860 Apr  9 10:25 730406_r19
404860 Feb 19 08:18 730406_r2
404860 Apr 13 12:00 730406_r20
404860 Apr 14 10:09 730406_r21
404860 Feb 23 10:06 730406_r3
404860 Feb 23 22:40 730406_r4
404860 Feb 24 12:24 730406_r5
404860 Feb 25 16:24 730406_r6
404860 Mär  6 09:18 730406_r7
404860 Mär  9 09:49 730406_r8
404860 Mär  9 15:25 730406_r9

_r0 - _r20 and just keep _r21?
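
Plain ls sorts these names as text, which is why _r10 appears before _r2 in the listing above. A small sketch that orders the revisions of one item numerically, assuming the <id>_r<N> naming shown here; list_revisions is a hypothetical helper name. It only orders the files and says nothing about which revision the Akonadi database actually references.

```shell
# Hypothetical helper: print the revision files of one item in numeric order.
# Assumes names of the form <id>_r<N> with a numeric <id>.
list_revisions() {
    dir=$1
    id=$2
    # Split each name at the letter "r" and sort numerically on what follows.
    ls "$dir" | grep "^${id}_r[0-9][0-9]*\$" | sort -t r -k 2 -n
}
```

Usage: list_revisions ~/.local/share/akonadi/file_db_data 730406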
Comment 3 Jan K 2015-04-18 19:26:33 UTC
Ok, so it's not just me, there are at least two of us ;-)

> are there any files that can be manually deleted as long as this bug is not solved?
> _r0 - _r20 and just keep _r21?
Don't do it.
For me it seems that only low revisions (no higher than the third) are in fact used (I checked the database with akonadiconsole).
You can just run the script I provided to create hard links - it's a safe but rough workaround.

Now, time for some research...

Finding the part of code that creates the file names is easy:
http://quickgit.kde.org/?p=akonadi.git&a=blob&f=src%2Fserver%2Fstorage%2Fparthelper.cpp (line 297).

So, it may have arrived with the idea introduced with this commit(↓) and subsequent commits
http://quickgit.kde.org/?p=akonadi.git&a=commit&h=422c9112d95e0c168c829bd19fed17b467090de1

Let's move on:
That function is called from
http://quickgit.kde.org/?p=akonadi.git&a=blob&f=src%2Fserver%2Fstorage%2Fpartstreamer.cpp (line 189)
to create the name for a new file revision and send that name to the client (in line 230), which then fills the file with data.

Now, the key part is when this function is called. It's called whenever the part is invalid (that is: part.isValid() is false).
The part comes from line 284 of partstreamer.cpp.
This is an object of 'Part' class. A class for which I could not locate any definition. And the instance is conjured miraculously from the database.

What I cannot find:
* where is the definition of class Akonadi::Server::Part (or the right Part class)
* when is a valid part considered invalid
* why the database de facto has (for instance) revision 0 and not some of the recent revisions from the file storage dir
* (extra:) is there a routine to purge redundant revisions?
Can anyone help me there? I don't feel hardcore enough to debug live akonadi during mail fetch.

I can provide any details needed to tackle this, I just don't know which ones matter…


Sorry if I'm breaking netiquette by this: I'm adding to the CC list the person annotated for most commits over there.
Comment 4 Harald Frießnegger 2015-04-27 12:05:59 UTC
Today my /home partition was nearly full again (after I freed 3GB a week ago) and I felt lucky enough to use Jan's script that generates links for those revision files (the luxury version would compare files using md5sum or diff before removing them and creating a hard link)

my filelist had 220552 entries, the whole process took nearly an hour
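
A rough sketch of what such a "luxury version" could look like, under the assumption that file names follow the <id>_r<N> pattern from this report. The function name link_identical_revisions is hypothetical. For each ID with more than one revision it keeps the first file found, replaces byte-identical siblings (checked with cmp) with hard links to it, and leaves any file that differs untouched. This is not an official Akonadi tool; try it on a copy of the directory first.

```shell
# Hypothetical helper: hard-link identical revisions of each item together.
# Assumes names of the form <id>_r<N> with a numeric <id>.
link_identical_revisions() {
    dir=$1
    # uniq -d keeps only the IDs that occur more than once.
    for id in $(ls "$dir" | sed -n 's/_r[0-9][0-9]*$//p' | sort | uniq -d); do
        keep=
        for f in "$dir/${id}"_r*; do
            if [ -z "$keep" ]; then
                keep=$f                          # first revision found is kept
            elif [ "$f" -ef "$keep" ]; then
                continue                         # already the same inode
            elif cmp -s "$keep" "$f"; then
                ln -f "$keep" "$f"               # replace the copy with a hard link
            else
                echo "differs, left alone: $f" >&2
            fi
        done
    done
}
```

Usage: link_identical_revisions ~/.local/share/akonadi/file_db_data
Comparing before linking costs extra reads, but it means a revision with different content is never overwritten.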
frisi@mbp-frisi:/tmp$ cat filellist | wc -l
220552
disk space used by the file_db_data directory went down from 29G to 7.6GB
 

i think this bug affects all kmail users (my workmate using fedora has the same problem and CCed himself) but only those using imap in combination with "offline storage" and a lot of messages/attachments really notice the problem when their free disk space shrinks.
Comment 5 Daniel Vrátil 2016-01-24 20:50:49 UTC

*** This bug has been marked as a duplicate of bug 341884 ***