Summary: | IMAP resource can't handle temporary network connection losses | ||
---|---|---|---|
Product: | [Frameworks and Libraries] Akonadi | Reporter: | Michi <woskimi> |
Component: | IMAP resource | Assignee: | Christian Mollekopf <chrigi_1> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | ach, asem.arafa, asn, asturm, bethovenathome, faure, fest.in, franz.trischberger, hus, itsef-admin, kdepim-bugs, kdudka, lacsilva, mollekopf, sven.burmeister, vkrause, wstephenson |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | openSUSE | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: |
Description
Michi
2010-11-23 18:36:22 UTC
Confirm that bug. On network loss, akonadi imap gets stuck on mail check. KMail can't check mail anymore, even after the network is back again. Restarting akonadi solves the issue.

Still present in akonadi-server-1.5.3/kde-4.6.3/kdepim-4.6-rc1.

Here we have a quite unstable connection; several times per day the connection crashes. Every time this happens, akonadi needs to be restarted. PLEASE fix this. This is for me one of the most annoying bugs I have to live with. The other two are the buggy addressbook-completion in the kmail-composer and the dvd-drive always spinning up when the places-sidebar gets shown.

Confirm. Very annoying. Cheers, Georg

*** Bug 247061 has been marked as a duplicate of this bug. ***

*** This bug has been confirmed by popular vote. ***

This happened to me once and since then the resource is always stuck in connection established. coolo ran into the same issue last week. I've traced the traffic with wireshark. I can see that it authenticates, exchanges a lot of information and then gets stuck. The resource lacks debug messages. I'm a C developer and the source code was a bit confusing when I looked at it. But if someone can help me debug, we can do it in an IRC or mumble session.

*** Bug 272660 has been marked as a duplicate of this bug. ***

*** Bug 282486 has been marked as a duplicate of this bug. ***

I noticed that if I leave akonadi enough time, it somehow reestablishes a working connection again. I can't say when and how, but I need to wait several minutes after the connection loss. BUT: in akonadi-1.6.1 (I think) there must have been a change that somehow reacts to the state the connection loss leaves behind and tries to roll back or something like that. The effect: some - not all - already read and fully downloaded mails get marked as unread and removed from the cache. This means a) confusion about the current state of correspondence (re-reading mails until the obvious gets realised) and b) re-downloading mails.
The last one is the really bad thing here: we have 45kb/s, and larger mails (which arrive quite often) really take a lot of time to download! Re-downloading means loss of time.

This problem is really so annoying. I don't know what to do! Up to more than 30 (!) internet reconnects a day over the last weeks, and every time akonadi-imap refuses to work. akonadictl restart took 20-30 seconds, after that it worked again. But now I need to re-download megabytes again and again. If there won't be a fix, I need to change email client - sorry :\ Akonadi looked and looks quite promising, kmail integrates so nicely into the desktop. But this problem makes daily email a gamble.

Maybe http://git.reviewboard.kde.org/r/103112/ will help.

(In reply to comment #10)
> Maybe http://git.reviewboard.kde.org/r/103112/ will help.
Did not fix anything here, sorry :( setNeedsNetwork(needsNetwork()) is done already in reconnect(), which gets triggered in quite some places...

May be resolved by Sergio's commits to kdepim-runtime: b62d1bf7e5e856a5845140aaf2e740bb71e50956 on the 4.7 branch and 5d343b97144dbe2a371837111f950dfbdf1fe9cb on master.

I'm adding this patch to openSUSE KDE:Distro:Stable kdepim4-runtime to test.

Git commit 679a05cb8ee30c81ea207b6cfc0e4131fd1a0c78 by David Faure.
Committed on 06/01/2012 at 13:21.
Pushed by dfaure into branch 'master'.

Restore the default timeout to 30s, after the kdelibs+kdepimlibs fixes.

The timeout doesn't trigger by mistake anymore on large emails, so it can be restored to 30s, in order to fix the issues with network disconnections, suspend/resume etc.

Related: bug 258378, bug 258271, bug 286047

M +1 -1 resources/imap/imapresource.kcfg

http://commits.kde.org/kdepim-runtime/679a05cb8ee30c81ea207b6cfc0e4131fd1a0c78

Does it still happen? If yes, please reopen, but David's latest patches should have done the trick.

(In reply to comment #14)
> Does it still happen? If yes, please reopen, but David's latest patches should have
> done the trick.
Still the same here :/ When the network goes down while kmail is running, imap accounts don't work anymore, even if the network comes back (if I leave kmail alone, it seems to fix itself when I don't touch it for quite some time, half an hour or so). Also, large mails are not shown when the download has finished (the other bug I have with my IMAP accounts).

I ensured I have a fixed version: kdepim-runtime-4.8.0 ships with "30" as default for the session-timeout. I am running kde-4.8.0 since it was released, Qt-4.8.0, linux-3.2.5.

I did a new installation some months ago, and there it happened exactly once that I got a system notification saying something like "Network connection lost, IMAP-account will go offline". No message when the connection was reestablished. Since then I never got this notification again. I did NOT install networkmanager (wired connection, static IP -> no need for something like a networkmanager).

I really would like to help fix this. Just tell me what I should do!

Sorry, my bad, I was unclear. I meant the latest patches in the upcoming 4.8.1. It should be out soonish, please test with that one when it is available.

Installed 4.8.1 today - nothing changed. Started kontact, let it idle, ran into a disconnect, waited for reconnection, clicked an unread mail - nothing happens, I just get the "Retrieving Folder Contents" page. If there are any special steps to help fix this - I am willing to help!

I thought about this again. In the early days of kmail2 this problem was not present. I think it was introduced together with Push-IMAP. Could that be?

Could you please provide us the task list and the change notification log when your resource is stuck? It can now be done from akonadiconsole: just right click on the resource and use "Show task list" and "Show notification log".
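For anyone collecting the requested data: the "Show task list" output is plain text, and the dumps pasted later in this report all follow the same shape. Below is a minimal Python sketch, assuming that shape, which pulls out the scheduler state and any queued startConnect tasks. The function names are made up for illustration and are not part of Akonadi or akonadiconsole.

```python
import re


def summarize_task_list(dump: str) -> dict:
    """Summarise one task-list dump of the shape pasted in this report.

    Assumption: the dump contains "ResourceScheduler: <state>",
    "current task: <id> <description>" and then the "queue N ..." entries.
    Whitespace may be spaces or newlines.
    """
    state = re.search(r"ResourceScheduler:\s*(Online|Offline)", dump)
    current = re.search(r"current task:\s*\d+\s+(.+?)\s+queue 0", dump, re.S)
    connects = re.findall(r"(\d+)\s+Custom\s+startConnect", dump)
    return {
        "state": state.group(1) if state else None,
        "current_task": current.group(1).strip() if current else None,
        "pending_start_connect": [int(task_id) for task_id in connects],
    }


def looks_stuck(summary: dict) -> bool:
    """Heuristic only: offline, nothing running, startConnect still queued."""
    return (
        summary["state"] == "Offline"
        and summary["current_task"] == "Invalid (no task)"
        and len(summary["pending_start_connect"]) > 0
    )


SAMPLE = (
    "ResourceScheduler: Offline current task: 407 Invalid (no task) "
    "queue 0 is empty queue 1 is empty queue 2 is empty queue 3 is empty "
    "queue 4 5 tasks: 401 Custom startConnect 398 Custom startConnect "
    "387 SyncCollectionTree 388 SyncCollection collection 40 "
    "389 SyncCollection collection 42"
)

if __name__ == "__main__":
    summary = summarize_task_list(SAMPLE)
    print(summary, "stuck?", looks_stuck(summary))
```

Treat a positive result as a hint, not a diagnosis: a resource can sit in this state for a while simply because it is still waiting for the 30 second session timeout.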
Resource 1:
ResourceScheduler: Offline
current task: 2379 Invalid (no task)
queue 0 is empty
queue 1 is empty
queue 2 is empty
queue 3 is empty
queue 4 10 tasks:
  2376 Custom startConnect
  2373 Custom startConnect
  2317 SyncCollectionTree
  2318 SyncCollection collection 33
  2319 SyncCollection collection 34
  2320 SyncCollection collection 35
  2321 SyncCollection collection 36
  2322 SyncCollection collection 37
  2323 SyncCollection collection 38
  2324 SyncCollection collection 39
notification log is empty.

Resource 2:
ResourceScheduler: Offline
current task: 407 Invalid (no task)
queue 0 is empty
queue 1 is empty
queue 2 is empty
queue 3 is empty
queue 4 5 tasks:
  401 Custom startConnect
  398 Custom startConnect
  387 SyncCollectionTree
  388 SyncCollection collection 40
  389 SyncCollection collection 42
notification log is empty.

resource_1:
ResourceScheduler: Online
current task: 137 FetchItem item 16204
queue 0 is empty
queue 1 is empty
queue 2 is empty
queue 3 is empty
queue 4 is empty

resource_0:
ResourceScheduler: Online
current task: 96 SyncCollection collection 98
queue 0 is empty
queue 1 is empty
queue 2 is empty
queue 3 1 tasks:
  98 FetchItem item 16228
queue 4 1 tasks:
  97 Custom triggerCollectionExtraInfoJobs

For both resources the notification log was empty.

OK, indeed a startConnect which doesn't complete, and for some reason the LoginJob is probably not timing out... reopening then.

David, any idea for guidance to debug further?

All I have in mind is to attach gdb to the resource and check if/where the KIMAP::LoginJob is stuck. Definitely not a trivial task, and at the same time I cannot reproduce here.

OK, for me it seems to be mostly fixed. The resources in my last post were not "stuck"; they were waiting for the 30 seconds timeout (I think). KMail just didn't show the message (probably related to https://bugs.kde.org/show_bug.cgi?id=266429). But I can still get at least my web.de account stuck: when the connection is lost, go to the inbox, select some messages.
When the connection comes up again, the resource stays Offline. There were several possibilities after that:

1) 30 seconds timeout -> Connection established -> Ready -> not working:

ResourceScheduler: Offline
current task: 20 Invalid (no task)
queue 0 is empty
queue 1 is empty
queue 2 is empty
queue 3 is empty
queue 4 2 tasks:
  15 SyncCollection collection 69
  16 Custom triggerCollectionExtraInfoJobs

AgentBase(akonadi_imap_resource_1): Cannot fetch item in offline mode.
AgentBase(akonadi_imap_resource_1): Cannot fetch item in offline mode.
AgentBase(akonadi_imap_resource_1): Cannot fetch item in offline mode.
AgentBase(akonadi_imap_resource_1): Cannot fetch item in offline mode.
AgentBase(akonadi_imap_resource_1): Cannot fetch item in offline mode.
AgentBase(akonadi_imap_resource_1): Cannot fetch item in offline mode.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_0): Connection lost
AgentBase(akonadi_imap_resource_0): Connection to server lost.
NotificationManager::notify ( Collection (98, /INBOX) in collection 94 modified parts (timestamp) )
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
NotificationManager::notify ( Collection (69, /INBOX) in collection 12 modified parts (timestamp) )
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.
AgentBase(akonadi_imap_resource_1): There is currently no session to the IMAP server available.

(resource_0 was working, as I did not touch it while the connection was down)
Then I have to restart akonadi (akonadictl restart); it does not help to just turn the resource offline and online again.

2) It just stays "Offline", even after waiting 30 seconds.
Manually turning it Online:
2.1) nothing happens (trying to connect without any success)
2.2) It connects, says "Connection established" but does not turn "Ready"
2.3) -> see 1)

For 2) I don't have a task list, as it happened while I was testing what I can do and how to reproduce certain behaviour. In any case, the web.de resource stays unusable (reproducible!) when I start using it while there is no connection. If I don't touch kmail while the connection is down, everything works great after the connection is reestablished.

I will look further into whether I can get task lists for 2). But it is more difficult ATM. Some weeks ago I had dozens of connection losses a day, sometimes 10 within half an hour (in the evening). But since I started trying to find out how it behaves, things got better connection-wise; I get 3-5 connection losses per day. :( I know, I should be happy :D

OK, on the other PC I got this:

ResourceScheduler: Offline
current task: 7 Invalid (no task)
queue 0 1 tasks:
  4 Custom startConnect
queue 1 1 tasks:
  2 ChangeReplay
queue 2 is empty
queue 3 is empty
queue 4 2 tasks:
  3 Custom startConnect
  5 SyncCollection collection 87

Notification list:
session=plasma-desktop-495252308 type=Item operation=Add uid=17677 remoteId= resource=akonadi_imap_resource_7 parentCollection=103 parentDestCollection=-1 mimeType=text/x-vnd.akonadi.note itemParts=
session=plasma-desktop-495252308 type=Item operation=Add uid=17678 remoteId= resource=akonadi_imap_resource_7 parentCollection=103 parentDestCollection=-1 mimeType=text/x-vnd.akonadi.note itemParts=

gdb attached to the resource-process
(gdb) bt
#0 0x00007f17f76d51d3 in poll () from /lib64/libc.so.6
#1 0x00007f17f64165b8 in g_main_context_iterate.isra.21 () from /usr/lib64/libglib-2.0.so.0
#2 0x00007f17f6416a3b in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
#3 0x00007f17fb1fb636 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/qt4/libQtCore.so.4
#4 0x00007f17fa5f4326 in QGuiEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/qt4/libQtGui.so.4
#5 0x00007f17fb1cbb12 in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/qt4/libQtCore.so.4
#6 0x00007f17fb1cbd97 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/qt4/libQtCore.so.4
#7 0x00007f17fb1d08a5 in QCoreApplication::exec() () from /usr/lib64/qt4/libQtCore.so.4
#8 0x00007f17fb90ce03 in Akonadi::ResourceBase::init(Akonadi::ResourceBase*) () from /usr/lib64/libakonadi-kde.so.4
#9 0x000000000041cb36 in int Akonadi::ResourceBase::init<ImapResource>(int, char**) ()
#10 0x00007f17f762742d in __libc_start_main () from /lib64/libc.so.6
#11 0x00000000004173f5 in _start ()

This happens to ALL IMAP resources. The same web.de account that I described in the last post is used there as well. But here, triggering "Toggle Online/Offline" immediately starts the resource and makes it usable!

What I can see: there are two "Custom startConnect" tasks. S.Burmeister also had two. Does "startConnect" connect to the server? Probably the server just can't/won't handle two simultaneous connects?

Hm, I might have a theory... Could you tell me which auth scheme you're using? If that's not the clear text authentication, could you try it for a while and tell me if you can reproduce?

At that point I'm thinking about a LoginJob getting stuck, and since here we're not in full control for some of the authentication modes (that is, all of them except clear text), it could be that it needs extra care... The 30 seconds timeout we set on the session might not be enough to detect the inactivity.

(In reply to comment #25)
> Hm, I might have a theory... Could you tell me which auth scheme you're
> using? If that's not the clear text authentication, could you try it for a
> while and tell me if you can reproduce?
OK, was set to PLAIN. Set them to Clear Text now.
BTW.: Changing the authentication scheme for one account will change it for all others, too. Is that intended?

Nope, sounds wrong to me. I don't remember seeing that happening though... definitely strange.

(In reply to comment #27)
> Nope, sounds wrong to me. I don't remember seeing that happening though...
> definitely strange.
Probably an issue with my akonadi configs from kde-4.7/4.6? Because now I manually changed the settings, saved them, and it seems I can't reproduce it. If I can, I will open a separate bug report.

I have turned all resources to Clear Text authentication. Now I got this after a connection loss:

ResourceScheduler: Offline
current task: 71 Invalid (no task)
queue 0 is empty
queue 1 is empty
queue 2 is empty
queue 3 is empty
queue 4 1 tasks:
  64 Custom triggerCollectionExtraInfoJobs

I had to manually toggle Online through akonadiconsole, then it became usable again.

OK, thanks for trying. Kind of confirms my theory... Now the difficult point will be to find out why the other auth schemes create such an issue and how to work around it...

BTW.: Resources still hang sometimes, with this message printed every time I trigger an action that requires a connection to the server:
AgentBase(akonadi_imap_resource_0): There is currently no session to the IMAP server available.
although the resource says "Ready" and the icon shows "connected". I need to restart akonadi in order to make it work again.

With authentication set to Clear Text I sometimes had a funny issue: akonadiconsole prints "Ready", but the icon says "disconnected". How could that be?

Argh, got this now with all IMAP accounts, still all with Clear Text:

ResourceScheduler: Offline
current task: 144 Invalid (no task)
queue 0 1 tasks:
  141 Custom startConnect
queue 1 is empty
queue 2 is empty
queue 3 is empty
queue 4 2 tasks:
  140 Custom startConnect
  142 SyncCollection collection 69

I started kontact while the connection was down, probably lost seconds before.
Manually turning each resource online makes them usable again. I viewed the task list while the connection was down, and the Custom startConnects were already there. I waited several minutes for a possible timeout, but nothing happened, so really stuck.

KDE SC 4.8.2 with auth scheme PLAIN:

ResourceScheduler: Offline
current task: 2667 Invalid (no task)
queue 0 is empty
queue 1 is empty
queue 2 is empty
queue 3 is empty
queue 4 9 tasks:
  2664 Custom startConnect
  2656 SyncCollectionTree
  2657 SyncCollection collection 8
  2658 SyncCollection collection 9
  2659 SyncCollection collection 10
  2660 SyncCollection collection 11
  2661 SyncCollection collection 12
  2662 SyncCollection collection 13
  2663 SyncCollection collection 14

Still valid for 1.8.0. Another culprit could be that it has to "re-connect" as well if e.g. a new vpn connection was established.

Still there in 1.8.1 (in fact, this has never worked for me since the introduction of akonadi). If there is a network problem that doesn't affect the local connection, akonadi IMAP resources go permanently offline. Status usually is "Offline, Running (0%)". Dis- and reconnecting the local network interface restarts all IMAP connections at once, as does clicking "Check Mail" in KMail. A closed wallet is not a problem. Setting authentication to "Clear text" and encryption to "None" doesn't seem to make any difference. Perhaps the "running" part of the status keeps the resource from restarting?

Just hit the same bug with 1.9.1, it seems. First it was stuck forever on fetching a folder, now it is stuck forever trying to abort.

The IMAP resource has a new maintainer, reassigning to him.

I think this is fixed in master. I fixed the bug where kimap got into a reconnect frenzy on connection loss in 4.11.4, and switching to a vpn and back works for me. Let me know if there are other issues.

I wonder whether the fix you mention also has an impact on https://bugs.kde.org/show_bug.cgi?id=327513...
If you think that is the case: which version of akonadi should this be fixed in, then? On the download page for 4.11.4 there seems to be no source for akonadi. I'd be willing to build my own packages for kontact (what else? akonadi, I presume?) so I can test this, as it will probably be a while before Kubuntu offers a full update to 4.11.4 for Kubuntu 13.10 (they haven't even released 4.11.3 yet, as far as I can see).

I'm already at akonadi 1.10.80. The reconnect frenzy was kdepimlibs 4b8adabf. I also fixed a bunch of crashes in the imap resource in the master branch of kdepim-runtime that typically occurred during disconnects, so that might fix your problems regarding 327513, but since I don't know what's happening there I can't tell for sure.
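The thread does not say how the kdepimlibs change stops the "reconnect frenzy", so the following is not that fix. It is only a generic Python sketch of one common way to keep reconnect attempts from piling up after a connection loss: retry with a growing, capped delay. The names `backoff_delays` and `reconnect`, and all default values, are hypothetical.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0) -> Iterator[float]:
    """Yield wait times that grow by `factor` and never exceed `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect(connect: Callable[[], bool], max_attempts: int = 8,
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Call connect() until it returns True, waiting between attempts.

    Returns the number of attempts used. Raises ConnectionError once
    max_attempts have failed. `connect` stands in for whatever actually
    opens the session; it is an assumption of this sketch.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        if connect():
            return attempt
        if attempt < max_attempts:
            # Wait longer after each failure instead of retrying at once.
            sleep(next(delays))
    raise ConnectionError(f"gave up after {max_attempts} attempts")
```

Passing `sleep` in as a parameter keeps the sketch testable without real waiting; a real resource would more likely schedule the retry on its event loop than block.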