Bug 361737 - slowdown on pop3 download until it crawls to an halt on saving messages with leave mails on server enabled
Summary: slowdown on pop3 download until it crawls to an halt on saving messages with ...
Status: REOPENED
Alias: None
Product: Akonadi
Classification: Frameworks and Libraries
Component: POP3 Resource (show other bugs)
Version: GIT (master)
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-04-13 21:42 UTC by Martin Steigerwald
Modified: 2016-05-11 13:58 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
excessive duplicates in akonadi_pop3_resource_0rc (1.71 KB, application/x-xz)
2016-04-28 20:11 UTC, Martin Steigerwald
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Steigerwald 2016-04-13 21:42:03 UTC
This happened after I downgraded Akonadi from git master to Applications/16.04 due to a build failure for messagelib and thus kmail last weekend. It persists after upgrading to git master again.

Basically in one POP3 resource, my main mail account akonadi POP3 resource duplicates the same value in SeenUidList and SeenUidTimeList in ~/.config/akonadi_pop3_resource_0rc exceeds 100 KiB or even 500 KiB. And it uses up one core of an Sandybridge i5 mobile cpu for minutes on saving a few hundred messages to the inbox in batches of maybe 2-5 messages a time.

Reproducible: Always

Steps to Reproduce:
I am not sure what triggered this behaviour. It happened first after the downgrade to Applications/16.04 from git master as of two weeks ago or so.

Actual Results:  
Extremely slow and CPU consuming POP3 mail download.

Expected Results:  
Each item just once in SeenUidList + SendUidTimeList. Fast POP3 download.

Here is an example file:

[General]
host=[…]
intervalCheckEnabled=true
intervalCheckInterval=120
leaveOnServer=true
leaveOnServerDays=2
login=[…]
pipelining=true
targetCollection=361
useTLS=true

[LeaveOnServer]
seenUidList=00315ba8446ce738,00315ba9446ce738,00315baa446ce738,00315bab446ce738,00315bac446ce738,00315bad446ce738,00315bae446ce738,00315baf446ce738,00315bb0446ce738,00315bb1446ce738,00315bb2446ce738,00315bb3446ce738,00315bb4446ce738,00315bb5446ce738,00315bb6446ce738,00315bb7446ce738,00315bb8446ce738,00315bb9446ce738,00315bba446ce738,00315bbb446ce738
seenUidTimeList=1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,1460582993

It always replicated the same value. I have a file with 100 KiB here in case you need it.

As a work-around I disabled leaving mails on server so that it deletes them immediately. The above config is after removing seenUidList and seenUidTimeList manually from config file and having it run with delete all downloaded mail on server again and it just starts duplicating the values again.

The work-around of delete all mails on server didn´t work either, as at least once each mail appeared twice in inbox. So I enabled leaving mails on server again, but this time only for 2 days, but then it again brought the size of this config file that actually mixes *state* into config to more than 100 KiB and slowling down message download considerable just within 1-2 days.
Comment 1 Albert Astals Cid 2016-04-25 23:09:44 UTC
Where is the duplicate? I can't see any in the example you pasted
Comment 2 Laurent Montel 2016-04-26 05:01:07 UTC
I confirm that pop3 is very slow this last week.
And it took 100% of cpu each time.

When I look at paste I see that it's duplicate:
see:
seenUidTimeList=1460582993,1460582993,1460582993,1460582993,1460582993,1460582993,14605829 :)
Comment 3 Laurent Montel 2016-04-26 05:01:57 UTC
Same for me it's duplicate:
seenUidTimeList=1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456260920,1456263334,1456263334,1456263334,1456263334,1456263334,1456263334,1456263334,1456263334,1456263334,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456265040,1456292814,1456292814,1456292814,1456292814,1456292814,1456292814,1456292814,1456292814,1456292814
Comment 4 Albert Astals Cid 2016-04-26 22:47:30 UTC
A duplicate seenUidTimeList is not a problem is it?

My rework to the pop3resource made it a bit slower when leaving messages on server, but i was not expecting "for minutes"

I mean you have only around 55 items in seenUidTimeList it should not be that horribly slow.

Can you check if reverting https://github.com/KDE/kdepim-runtime/commit/e7ba619bef10fed4c1e623c61822da505b270d2a makes it faster again?
Comment 5 Laurent Montel 2016-04-27 05:09:43 UTC
It's faster when I revert it.
I store 30 days on server for kde.org mail... so it's not just 55 items for me and it's very slow with master pop3 resource.
And pop3 doesn't take now 100% of cpu.
Comment 6 Albert Astals Cid 2016-04-27 22:21:02 UTC
Laurent, can you check if https://git.reviewboard.kde.org/r/127775/ helps?
Comment 7 Laurent Montel 2016-04-28 05:34:23 UTC
It helps for me.
Perhaps that Martin Steigerwald can test it too.

I test all the day.
Comment 8 Martin Steigerwald 2016-04-28 20:11:09 UTC
Created attachment 98669 [details]
excessive duplicates in akonadi_pop3_resource_0rc

Albert, the duplicates easily get excessive. Here an example of it with a retention time of just 2 days.
Comment 9 Martin Steigerwald 2016-04-28 20:12:20 UTC
Hi Laurent, I can´t test it, as I do not have Qt 5.6 on my Debian Sid/Experimental setup and I gathered that KDEPIM requires it since a week or so. I do have kdesrc-build and while a recompile of KDEPIM is easy enough, I don´t know how easy it is to also build Qt 5.6 from source and set it up so that it is preferred over the distro Qt, so I decided to wait for Qt 5.6.1 to arrive in Debian, before I recompile KDEPIM.
Comment 10 Albert Astals Cid 2016-04-28 21:00:22 UTC
> Albert, the duplicates easily get excessive. Here an example of it with a retention time of just 2 days.

It is not duplication, it's the time each of the ids of the messages was seen/saved, it just happens that they were seen/saved, that is not a bug and is always how it has worked, unless you know how the code works and you can prove this is a bug i don't see why it would be a bug.
Comment 11 Albert Astals Cid 2016-04-28 21:01:56 UTC
Git commit a961e34924647fb1aa06a6abd0d8fb5147b6f624 by Albert Astals Cid.
Committed on 28/04/2016 at 21:01.
Pushed by aacid into branch 'Applications/16.04'.

Cache idsToSave

Seems doing all the work each time shouldDeleteId is called is not as fast as i thught, so cache it until the next list job finishes (i.e. we're checking email again).

Acked by Laurent

REVIEW: 127775

M  +52   -46   resources/pop3/pop3resource.cpp
M  +4    -0    resources/pop3/pop3resource.h

http://commits.kde.org/kdepim-runtime/a961e34924647fb1aa06a6abd0d8fb5147b6f624
Comment 12 Martin Steigerwald 2016-04-28 21:54:12 UTC
Albert,

1) It is not me against you. I want to help. If I offended you somehow I am sorry, but I do not get how I might have done so.

2) In the attached akonadi resource rc I see the same value 00314d67446ce738 in seenUidList list repeated a ton of times – easily 1000 times or more – and the same value in 1460469054 in seenUidTimeList repeated a ton of times as well. I am no programming expert, I don´t know the code, but I just don´t get on how this is not a duplicate and how it can be necessary to save the same value over and over and over again to produce a 100 KiB file. If that displays proper functioning of the POP3 resource, feel free to close this bug again, but for me it does not look like efficiently handling this issue. If it would be a list of different values I would not say anything, but it is the exact same value over and over and over again which creates a very long line (and which makes a nice pattern when displaying the file contents with a fixed width font).
Comment 13 Albert Astals Cid 2016-04-28 22:26:07 UTC
Ok, i got confused by Laurent pointing to seenUidTimeList, so the problem is duplicated seenUidList values, yes that's not good.

Is this something that keeps happening? Or is it something that broke at some point and now works?

Because the code is pretty explicit in doing 

        if (!seenUIDs.contains(uid))
            seenUIDs.append(uid);

So it seems quite strange to me that this would break.

Also are you compiling things for yourself? Which branch?
Comment 14 Martin Steigerwald 2016-04-30 10:46:02 UTC
Albert, seenUidTimeList also contains duplicated values, but if its the value in which it saw the uid in question and it saw them all at the same time, it may be correct. I don´t know how the code works. I would have expected to see different time stamps there as well.

As I first saw this happening, I removed the config file, and had it redownload everything – working manually to remove all duplicated mails. It again filled the config file with duplicate values. Then I worked around this by disabling leaving mails on server. As I had some duplicates then still for whatever reason, I enabled leaving mails on server again and it again filled the config file with duplicate values. Then I retried to disable leaving mails on server and this works okayish since then.

So I am pretty confident with the git master branch I self-compiled KF5 + KDEPIM on top of rest from Debian Sid/Experimental packages from, this is repeatable. I currently use:

- bf3b97e3e00706f6f743f186766f6e54152783c1 of kdepim
- 3badd8e0c8c4acd80c8d15068e9bb68670366459 of kdepim-runtime
- 77fc3fff200f2647b01a91e996e080beb1d9125c of akonadi

I intent to update everything once I have a working copy of Qt 5.6 + dev files on my system. I intend to wait for distro packages of 5.6.1, which the Qt/KDE team of Debian hopefully prepares beginning of May as I do not want to take the extra effort and time to work out how to self-compile Qt and configure to use self-compiled Qt as well in addition to self-compiling KF5 + KDEPIM at the moment.
Comment 15 Albert Astals Cid 2016-05-04 22:31:13 UTC
Could you try to use Applications/15.06 instead of master to see if it's a bug introduced in master?
Comment 16 Martin Steigerwald 2016-05-05 09:50:03 UTC
Hi Albert, from reading about branch and branch-group in kdesrc-build doc¹ I do not get what to tell kdesrc build to build Applications/15.06 for me instead of master.

In branch-group I can specify what modules to build, but not the branch (i.e. version). In "branch"  I can specify what branch to build, but not the versions to use. How do I turn

  1 global
  2 
  3     #qtdir /path/to/custom/qt          # Uncomment if you have your own qt build
  4     source-dir  /home/kde/sources
  5     build-dir  /home/kde/build
  6     kdedir  /home/kde/install
  7     log-dir  /home/kde/logs
  8 
  9     git-repository-base     kde-projects kde:                                                                                                                           
 10     branch-group            kf5-qt5
 11     
 12     cmake-options -DCMAKE_BUILD_TYPE:STRING=debug
 13     
 14     cxxflags -pipe -DQT_STRICT_ITERATORS -DQURL_NO_CAST_FROM_STRING -DQT_NO_HTTP -DQT_NO_FTP -Wformat -Werror=format-security -Werror=return-type -Wno-variadic-macros -    Wlogical-op -Wmissing-include-dirs
 15 
 16     # If you want to use ninja instead of make (it's faster!), add -GNinja to cmake-options above
 17     # and uncomment the next line
 18     #custom-build-command ninja
 19     
 20     # Adjust to the number of  CPU cores
 21     make-options -j4
 22     
 23     ignore-kde-structure    true        # Downloads all modules directly into the source folder instead of subdirectories
 24     #stop-on-failure         true        # Stop kdesrc-build when a build fails.
 25     
 26 end global
 27 
 28 include /home/kde/sources/kdesrc-build/kf5-frameworks-build-include
 29 #include /home/kde/sources/kdesrc-build/kf5-workspace-build-include
 30 
 31 #Uncomment the next two lines to build application and PIM modules
 32 #include /home/kde/sources/kdesrc-build/kf5-applications-build-include
 33 include /home/kde/sources/kdesrc-build/kf5-kdepim-build-include

into building Applications/15.06 for me?

I did

diff --git a/kf5-kdepim-build-include b/kf5-kdepim-build-include
index 114d746..7242b02 100644
--- a/kf5-kdepim-build-include
+++ b/kf5-kdepim-build-include
@@ -27,6 +27,7 @@ module-set kde-pimlibs
 
     # The PIM Baloo code is now in akonadi-search
     use-modules kde/pim prison kdepimlibs akonadi-search libkgapi
+    branch Applications/15.06
 end module-set
 
 module libkolab
@@ -37,5 +38,6 @@ end module
 module-set kde-pim
     repository kde-projects
     use-modules kde/kdepim kdepim-runtime
+    branch Applications/15.06
 end module-set


but I get:

martin@merkaba:/home/kde/sources/kdesrc-build#1> ./kdesrc-build 
Updating kde-build-metadata (to branch master)
 * Downloading projects.kde.org project database...
 * Module kde/pim is apparently XML-based, but contains no
active modules to build!
        Although no active modules are available, there were
        72 inactive modules. Perhaps the git modules are not ready?
 * Module prison is apparently XML-based, but contains no
active modules to build!
        Although no active modules are available, there were
        1 inactive modules. Perhaps the git modules are not ready?
 * Module kdepimlibs is apparently XML-based, but contains no
active modules to build!
        Although no active modules are available, there were
        1 inactive modules. Perhaps the git modules are not ready?
 * Module akonadi-search is apparently XML-based, but contains no
active modules to build!
        Although no active modules are available, there were
        1 inactive modules. Perhaps the git modules are not ready?
 * Module libkgapi is apparently XML-based, but contains no
active modules to build!
        Although no active modules are available, there were
        1 inactive modules. Perhaps the git modules are not ready?
No modules were defined for the module-set kde-pimlibs
You should use the use-modules option to make the module-set useful.
 * Module kde/kdepim is apparently XML-based, but contains no
active modules to build!
        Although no active modules are available, there were
        1 inactive modules. Perhaps the git modules are not ready?
 * Module kdepim-runtime is apparently XML-based, but contains no
active modules to build!
        Although no active modules are available, there were
        1 inactive modules. Perhaps the git modules are not ready?

So I have no idea on how to accomplish the necessary build for now.
Comment 17 Albert Astals Cid 2016-05-09 22:00:04 UTC
Sorry the proper branch name is Applications/15.04 :D

Can you try if fixing that helps? I've no idea on how to use kdesrc-build so can't really help there.
Comment 18 Albert Astals Cid 2016-05-09 22:41:11 UTC
Lol i'm stupid and can't write

Applications/16.04 

That's the right branch name
Comment 19 Michael Pyne 2016-05-09 22:48:02 UTC
Also, you can try "branch-group stable-kf5-qt5" in the global section if you don't mind being off the bleeding edge a little bit for all those git modules.

Alternately, "branch-group stable-kf5-qt5" can be placed within the KDE PIM-related module-sets (in place of the branch "Application/16.04" you would otherwise add).

Just don't forget to run kdesrc-build with --refresh-build after you make that change to avoid needless CMake confusion.
Comment 20 Martin Steigerwald 2016-05-10 08:31:12 UTC
Thanks for the support, Michael. I never was clear on what to put into branch-group to get the desired result. I somehow managed to have kdesrc-build to build from an Applications branch.

Now rebuilding with globally with stable-kf-qt5 to have consistent versions for everything. Will obviously take a while to complete even on Sandybridge i5 dualcore, 16 GiB RAM and Dual SSD BTRFS RAID 1 (or especially as the BTRFS filesystem may not be running at optimal performance for being near full).
Comment 21 Martin Steigerwald 2016-05-10 12:50:54 UTC
This still happens with Applications/16.04. Since starting Akonadi + KMail 16.04 + setting to leave mail on server for one day it received six mails. This is the result:

martin@merkaba:~/.config> cat akonadi_pop3_resource_0rc
[General]
host=[…]
intervalCheckEnabled=true
intervalCheckInterval=120
leaveOnServer=true
leaveOnServerDays=1
login=martin
pipelining=true
targetCollection=361
useTLS=true

[LeaveOnServer]
seenUidList=00320035446ce738,00320036446ce738,00320037446ce738,00320038446ce738,00320039446ce738,0032003a446ce738
seenUidTimeList=1462884349,1462884349,1462884349,1462884349,1462884392,1462884437

I´d expect the Uids and Times to be *different* for each item, but they are the same again.

I checked it and for each mail it receives is just adds the same value to each of the lists again, which doesn´t make any sense to me.
Comment 22 Martin Steigerwald 2016-05-10 12:52:49 UTC
Hmmm, no, the times are different. Oh, they have always been, just the same of each retrieval attempt it seems. I completely missed that all the time cause they looked all the same, as I usually have several hundred mails on each download since I switched it to a 2 hour interval. Seems it gets the time once for each "download mail" operation. But the uids are all the same.
Comment 23 Albert Astals Cid 2016-05-10 22:34:24 UTC
[LeaveOnServer] seenUidList=00320035446ce738,00320036446ce738,00320037446ce738,00320038446ce738,00320039446ce738,0032003a446ce738 seenUidTimeList=1462884349,1462884349,1462884349,1462884349,1462884392,1462884437 I´d expect the Uids and Times to be *different* for each item, but they are the same again. 


As i told you, the times don't matter at all, and the uids are different, so no idea what you're complaining about.

This is your list of uids
00320035446ce738
00320036446ce738
00320037446ce738
00320038446ce738
00320039446ce738
0032003a446ce738
Comment 24 Martin Steigerwald 2016-05-11 13:58:36 UTC
You are completely right, Albert. The uid values are different. From the *beginning*. I.e. even in the pop3 resource config file I uploaded. I didn´t see this all the time as the beginning and the end has been the same. The brain can read through a book where the middle letters of a word are mixed up while the beginning and end letters are still right… seems I successfully tricked myself into perceiving the uids were duplicate all the time while they were not.

Sorry for all the noise about the duplicates. I just didn´t recognize that there never where duplicates. Still the slowdown was for real. So retitling. I am now enabling leave mails on server for 2 days again. If that works, I extend to 7 days and observe performance. Maybe the performance regression has been implemented after Applications/16.04.  I still cannot compile from master as even tough I have Qt 5.6.0 now on my Debian box, the qt web engine cmake files and headers are not yet packaged it seems, at least configuring messagelib, kdepim-runtime and kdepim fails due to these missing.