Bug 319208 - KMail unresponsive after POP3 retrieval and filtering with high MySQL load for minutes
Summary: KMail unresponsive after POP3 retrieval and filtering with high MySQL load fo...
Status: RESOLVED FIXED
Alias: None
Product: Akonadi
Classification: Frameworks and Libraries
Component: general (show other bugs)
Version: 4.10
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-02 09:06 UTC by Martin Steigerwald
Modified: 2015-03-12 12:22 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
atop raw file (347.55 KB, application/octet-stream)
2013-05-02 09:15 UTC, Martin Steigerwald
Details
A sample snapshot of atop text ui with high MySQL load (11.20 KB, text/plain)
2013-05-02 09:18 UTC, Martin Steigerwald
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Steigerwald 2013-05-02 09:06:08 UTC
I set component to general as I do not know which compoment fits.

After my fifth migration attempt seemed to be successful, I still have performance issues. For this migration attempt, I put my old KMail-1 maildir contents (no mbox files in there) into ~/.local/share/local-mail and started with a fresh KMail configuration and newly created filter rules (no import). This works so far without any data losses as I experienced in my attempts with mutiple maildir resources (see bug #318444 and bug #318227, bug #318290, all on BTRFS,  bug #319175 on Ext4). All Akonadi and Nepomuk data is stored on Ext 4 except for contacts and std.ics calender.

The performance issue is as follows: KMail becomes unresponsive for minutes after a POP3 retrieval and mail filtering actions. I see mysql at 100-200% for minutes. I have been a bit late at pointing mytop at it, but I intend to collect some more data.

This happens on a ThinkPad T520 with a dualcore Intel i5-2520M with 2,50 GHz, that the ThinkPad easily overclocks to up to 3,2 GHz for minutes, 8 GiB of RAM, and a 300 GiB Intel SSD 320. This is a hardware were I consider such responsiveness issues in a mail program as bug.

Reproducible: Always

Steps to Reproduce:
1. Create a pop account.
2. Stuff some 5 GB messages in 100 folders and some about 30 filter rules at it
3. Have quite some mailinglists flow into it, including high volume ones like LKML, debian-user and such.
4. Wait a night.
5. Press button to receive mails.
Actual Results:  
KMail becomes unresponsive for minutes. It took at least a minute to actually filter all mails out of the inbox, but then also remains unresponsive for at least five more minutes.

Clicking on a mail yields nothing or just the blue wait folder. During that MySQL uses up 100-200% of CPU time consistently:

merkaba:~> atopsar -O -b 10:20 -e 12:00 

merkaba  3.9.0-tp520  #7 SMP PREEMPT Mon Apr 29 15:01:05 CEST 2013  x86_64  2013/05/02

-------------------------- analysis date: 2013/05/02 --------------------------

10:23:30    pid command  cpu% |   pid command  cpu% |   pid command  cpu%_top3_
10:33:30   2459 mysqld   110% |  2500 akonadi_  17% |  2758 kmail      8%
10:43:30   2459 mysqld   116% |  2500 akonadi_  23% | 32672 icewease   6%
10:53:30   2459 mysqld     8% |  2500 akonadi_   6% |  1851 Xorg       5%


MySQL was also a heavy disk user during that time:

merkaba  3.9.0-tp520  #7 SMP PREEMPT Mon Apr 29 15:01:05 CEST 2013  x86_64  2013/05/02

-------------------------- analysis date: 2013/05/02 --------------------------

10:23:30    pid command  dsk% |   pid command  dsk% |   pid command  dsk%_top3_
10:33:30   2459 mysqld    59% | 32428 apt        5% |  5874 akregato   4%
10:43:30   2459 mysqld    32% | 32672 icewease  30% |  2500 akonadi_  12%
10:53:30      1 systemd   88% |  2459 mysqld     6% |   352 btrfs-tr   1%



atop created averages in its 10 minutes recordings.

I will include the atop data file again which is suitable for step by step going through this again to look at whats happening.

Expected Results:  
- KMail remains responsive during POP3 retrieval and filtering
- The MySQL CPU usage and I/O usage remains within bounds that make sense for this kind of workload

I know, what makes sense needs to be evaluated. I admit, I have a huge setup, but I do think that the current resource usage for it is way over top.

What I will try:

1) Just selecting POP3 retrieval for this one main POP3 account in the menu. Cause if I do press the icon in the toolbar, KMail also synchronized that huge sized local maildir resource again, which IMHO doesn´t make any sense to do in that case.

2) Raising some innodb buffer size limit from the IMHO very small 80 MiB to huge 500 MiB to see whether it makes a difference. If it does, I will lower it gradually until maybe about 200 MiB.

martin@merkaba:~/.local/share/akonadi> diff -u mysql.conf.orig-2013-05-02 mysql.conf
--- mysql.conf.orig-2013-05-02  2013-05-01 16:06:58.180288216 +0200
+++ mysql.conf  2013-05-02 11:01:45.024564563 +0200
@@ -42,7 +42,8 @@
 
 # memory buffer InnoDB uses to cache data and indexes of its tables (default:128M)
 # Larger values means less I/O
-innodb_buffer_pool_size=80M
+# HINT: Raised from 80 MiB to a huge 500 MiB to see whether it makes a difference, 2.5.2013
+innodb_buffer_pool_size=500M
 
 # Create a .ibd file for each table (default:0)
 innodb_file_per_table=1


I will try pressing the icon to have both happening at same time and just triggering POP3 retrievel of the main account with the 500 MiB setting. It will take some time, till I have more data.
Comment 1 Martin Steigerwald 2013-05-02 09:15:59 UTC
Created attachment 79617 [details]
atop raw file

You can single step through it with atop -r atop_20130502 and using t/T keys. Only 10 minute resolution tough. I can create a larger file with higher resolution
Comment 2 Martin Steigerwald 2013-05-02 09:18:20 UTC
Created attachment 79618 [details]
A sample snapshot of atop text ui with high MySQL load

Also demonstrates that MySQL seems to use the innodb_buffer_pool_size=500M buffer size quickly. MySQL just had about 120 MiB mem usage before:

 2459       -     4849         7   10664K     3564K   427.5M      136K   466.7M   118.8M      260K     384K     4224K   martin      martin       2%    mysqld

(all in the atop raw file I included)
Comment 3 Martin Steigerwald 2013-05-02 09:32:56 UTC
Just tried to just synchronize the maildir resource, by selecting to look for new mails from it from the pull down menu.

The time of the high CPU usage seems to shorter, but it still pikes to 200% and well it has been less mails since last download.



Mytop shows:

MySQL on localhost (5.5.30-1.1)                         up 0+00:09:33 [11:20:02]
 Queries: 5.0     qps:    0 Slow:     0.0         Se/In/Up/De(%):    00/00/00/00
              qps now:    0 Slow qps: 0.0  Threads:   29 (   3/   0) 00/00/00/00
 Key Efficiency: 100.0%  Bps in/out:   0.3/ 35.4   Now in/out:   8.4/ 2.0k

        Id      User         Host/IP         DB      Time    Cmd Query or State                                                                                      
        --      ----         -------         --      ----    --- --------------                                                                                      
         3    martin       localhost    akonadi         0 Execut SELECT PartTable.id, PartTable.pimItemId, PartTable.name, PartTable.data, PartTable.datasize, PartTa
        30    martin       localhost    akonadi         0 Execut SELECT PimItemTable.id, FlagTable.name FROM PimItemTable INNER JOIN PimItemFlagRelation ON ( PimItem
        31      root       localhost                    0  Query show full processlist                                                                               
         8    martin       localhost    akonadi        79  Sleep                                                                                                     
[… about 20 more sleeping connections …]


MySQL on localhost (5.5.30-1.1)                         up 0+00:09:33 [11:20:02]
 Queries: 5.0     qps:    0 Slow:     0.0         Se/In/Up/De(%):    00/00/00/00
              qps now:    0 Slow qps: 0.0  Threads:   29 (   3/   0) 00/00/00/00
 Key Efficiency: 100.0%  Bps in/out:   0.3/ 35.4   Now in/out:   8.4/ 2.0k

        Id      User         Host/IP         DB      Time    Cmd Query or State                                                                                      
        --      ----         -------         --      ----    --- --------------                                                                                      
         3    martin       localhost    akonadi         0 Execut SELECT PartTable.id, PartTable.pimItemId, PartTable.name, PartTable.data, PartTable.datasize, PartTa
        30    martin       localhost    akonadi         0 Execut SELECT PimItemTable.id, FlagTable.name FROM PimItemTable INNER JOIN PimItemFlagRelation ON ( PimItem
        31      root       localhost                    0  Query show full processlist                                                                               
         8    martin       localhost    akonadi        79  Sleep                                                                                                     
[… about 20 more sleeping connections …]


MySQL usage is high after apparently the POP3 download completed already and KMail was telling it was synchronizing folders.
So I think thats its the local folder synchronization process that clicking on the mail retrieval icon triggers causing issues here.
But then after POP3 I also saw maildir resource synchronizing, but maybe just the folders that have been filtered into? Well, I stop
here with speculation, as I do not really understand what the maildir synchronisation is doing after POP3 download. I thought
the maildir resource would know where the POP3 resource placed new mails, so no need to synchronize…



Well on subsequent tries to just trigger a look for mails in the maildir resource, MySQL CPU usage is down to about 20%, sometimes a bit more, sometimes a bit less:

ATOP - merkaba                                     2013/05/02  11:26:55                                     ---------                                     10s elapsed
PRC | sys    3.82s  | user  13.68s  | #proc    252  | #trun      4  | #tslpi   525  | #tslpu     0  | #zombie    0  | clones     1  |               | #exit      0  |
CPU | sys      34%  | user    139%  | irq       0%  | idle    227%  | wait      0%  |               | steal     0%  | guest     0%  | curf    ?MHz  | curscal   ?%  |
cpu | sys      19%  | user     67%  | irq       0%  | idle     14%  | cpu002 w  0%  |               | steal     0%  | guest     0%  | curf    ?MHz  | curscal   ?%  |
cpu | sys       9%  | user     53%  | irq       0%  | idle     38%  | cpu003 w  0%  |               | steal     0%  | guest     0%  | curf    ?MHz  | curscal   ?%  |
cpu | sys       3%  | user     12%  | irq       0%  | idle     85%  | cpu000 w  0%  |               | steal     0%  | guest     0%  | curf    ?MHz  | curscal   ?%  |
cpu | sys       3%  | user      8%  | irq       0%  | idle     90%  | cpu001 w  0%  |               | steal     0%  | guest     0%  | curf    ?MHz  | curscal   ?%  |
CPL | avg1    0.66  | avg5    1.00  |               | avg15   0.83  |               | csw   155438  | intr   27726  |               |               | numcpu     4  |
MEM | tot     7.6G  | free    1.4G  | cache   2.3G  | dirty   0.2M  | buff   89.6M  | slab  775.1M  | slrec 730.9M  | shmem 227.0M  | shrss  70.4M  | shswp   1.6M  |
SWP | tot    12.0G  | free   11.4G  |               |               |               |               |               |               | vmcom   5.4G  | vmlim  15.8G  |
LVM | erkaba-home2  | busy      0%  | read       0  | write     58  | KiB/r      0  | KiB/w     54  | MBr/s   0.00  | MBw/s   0.31  | avq     9.85  | avio 0.34 ms  |
LVM | merkaba-home  | busy      0%  | read       2  | write     75  | KiB/r     16  | KiB/w     29  | MBr/s   0.00  | MBw/s   0.22  | avq    15.21  | avio 0.18 ms  |
LVM | rkaba-debian  | busy      0%  | read       0  | write     33  | KiB/r      0  | KiB/w     16  | MBr/s   0.00  | MBw/s   0.05  | avq     5.75  | avio 0.12 ms  |
DSK |          sda  | busy      0%  | read       2  | write    159  | KiB/r     16  | KiB/w     37  | MBr/s   0.00  | MBw/s   0.58  | avq    10.95  | avio 0.23 ms  |

  PID      TID    RUID         EUID         THR    SYSCPU     USRCPU     VGROW      RGROW     RDDSK     WRDSK     ST    EXC    S     CPUNR     CPU     CMD        1/2
 1029        -    martin       martin         2     2.04s      6.76s        0K         0K        0K        0K     --      -    R         2     90%     akonadi_agent_
  979        -    martin       martin        30     0.91s      5.32s        0K         0K        0K        0K     --      -    S         1     64%     akonadiserver
  987        -    martin       martin        45     0.03s      0.72s        0K         0K        0K     3140K     --      -    S         1      8%     mysqld



That far for now. I take some more time to see how it goes with the larger innodb buffer setting.
Comment 4 Martin Steigerwald 2013-05-03 18:58:28 UTC
I think raising the limit to 500M helped quite a bit, but what helped even more is:

Go to account settings and then set the maildir resource to update on new start instead of manual check. This way when I click the retrieve mail button just the POP3 resource is triggered. And the maildir resource just takes the mail, but doesn´t start a full synchronisation.

I wonder why the maildir resource is set to synchronize by manual trigger at all.


I wonder whether the default setting to trigger on each manual check is suitable for POP3 configurations.
Comment 5 Martin Steigerwald 2015-03-12 12:22:28 UTC
I didn´t see this since quite some time. Especially not after using the MySQL performance fixed in current Akonadi 1.13 branch. Thus closing.

Thank you for Dan and Milian for improving MySQL performance in recent Akonadi 1.13.