Bug 472282 - Kolab Now reports too many open connections
Summary: Kolab Now reports too many open connections
Status: RESOLVED UPSTREAM
Alias: None
Product: Akonadi
Classification: Frameworks and Libraries
Component: IMAP resource
Version: 5.23.2
Platform: Slackware Linux
Importance: NOR major
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-07-15 14:38 UTC by Heinz Wiesinger
Modified: 2024-02-04 13:39 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Screenshot of the error message (92.40 KB, image/png)
2023-07-15 14:38 UTC, Heinz Wiesinger

Description Heinz Wiesinger 2023-07-15 14:38:46 UTC
Created attachment 160308
Screenshot of the error message

SUMMARY

Kolab Now performed some server-side changes on Friday, June 30th, 2023:
https://blogs.kolabnow.com/2023/06/16/announcing-service-window-friday-23rd-of-june-the-everlasting-race-to-get-rid-of-old-infrastructure
https://blogs.kolabnow.com/2023/06/23/service-window-postponed-due-diligence

Ever since then, the mail synchronization process in Akonadi has been acting up and running into problems, the most visible of which is an error popup saying that login was denied because of "too many open connections".

I collected a log with akonadiconsole, which highlights what might be going wrong. I'll post what I think are the relevant excerpts here. The full log contains a bit too many personal details for my liking, but I can provide it privately if you need it :)

The synchronization process actually works fine. Login is successful, new emails are synced, etc. However, at the end of the synchronization process of the first folder I see this in the log:

```
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap KIMAP::LoginJob(0x2890c50)
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap Connecting to IMAP server with no proxy
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap connectToHost "imap.kolabnow.com" 143
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap KIMAP::LoginJob(0x28d4500)
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap Connecting to IMAP server with no proxy
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap Connection to server lost  QAbstractSocket::UnknownSocketError
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap close
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap connectToHost "imap.kolabnow.com" 143
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.akonadiagentbase Deferring task "19530 Custom startConnect "
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.imapresource online= false
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap KIMAP::LoginJob(0x2890c50)
[2023-07-15T13:28:43.381Z] akonadi_imap_resource org.kde.pim.kimap close
[2023-07-15T13:28:43.383Z] akonadi_imap_resource org.kde.pim.kimap Connection to server lost  QAbstractSocket::UnknownSocketError
[2023-07-15T13:28:43.384Z] akonadi_imap_resource org.kde.pim.akonadiagentbase Deferring task "19530 Custom startConnect "
[2023-07-15T13:28:43.384Z] akonadi_imap_resource org.kde.pim.kimap KIMAP::LoginJob(0x28d4500)
[2023-07-15T13:28:43.385Z] kmail org.kde.pim.kmail Last resource finished syncing, mail check done
[2023-07-15T13:28:43.385Z] akonadi_imap_resource org.kde.pim.kimap Connecting to IMAP server with no proxy
[2023-07-15T13:28:43.385Z] akonadi_imap_resource org.kde.pim.kimap connectToHost "imap.kolabnow.com" 143
[2023-07-15T13:28:43.385Z] akonadi_imap_resource org.kde.pim.kimap KIMAP::LoginJob(0x2873850)
[2023-07-15T13:28:44.286Z] akonadi_imap_resource org.kde.pim.kimap Received BYE:  "LOGOUT received "
[2023-07-15T13:28:44.287Z] akonadi_imap_resource org.kde.pim.kimap close
```

For later folders I see:
```
[2023-07-15T13:31:27.908Z] akonadi_imap_resource org.kde.pim.akonadicore Received:  0 Removed:  0 In total:  0  Wanted:  127
[2023-07-15T13:31:27.908Z] akonadi_imap_resource org.kde.pim.imapresource Retrieval complete. Elapsed(ms):  547
[2023-07-15T13:31:27.908Z] akonadi_imap_resource org.kde.pim.akonadicore ItemSync of collection 406 finished
[2023-07-15T13:31:27.909Z] akonadi_imap_resource org.kde.pim.imapresource.trace RetrieveCollectionMetadataTask
[2023-07-15T13:31:27.909Z] akonadi_imap_resource org.kde.pim.imapresource.trace starting:  RetrieveCollectionMetadataTask
[2023-07-15T13:31:27.909Z] akonadi_imap_resource org.kde.pim.imapresource "/folder1"
[2023-07-15T13:31:35.468Z] kmail org.kde.pim.kmail Last resource finished syncing, mail check done
[2023-07-15T13:31:38.467Z] kmail kf.xmlgui Index  23  is not within range (0 -  21 )
[2023-07-15T13:31:57.470Z] akonadi_imap_resource org.kde.pim.kimap Socket timeout!
[2023-07-15T13:31:57.470Z] akonadi_imap_resource org.kde.pim.kimap close
[2023-07-15T13:31:57.471Z] akonadi_imap_resource org.kde.pim.imapresource online= false
```

This happens for every folder until eventually the error pops up.

Now, I'm not saying this is an Akonadi bug. I just lack the insight to figure out where things are going wrong, and Kolab Now's support is telling me (and, as I heard, also others) to change an Akonadi setting that, as far as I can tell, doesn't exist anymore ("Max number of cached server connections"). So I hope this provides some insight into figuring out what the actual problem here is.
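
Not part of the original report, but one way to verify the "too many open connections" symptom independently is to count established TCP connections to the IMAP host while the resource syncs. A minimal sketch, assuming psutil is installed (Kolab Now may have several A records, so a single resolved address can undercount):

```python
import socket
import psutil

# Resolve one address for the IMAP host named in the report; if the
# hostname maps to several addresses, connections to the others are missed.
imap_ip = socket.gethostbyname("imap.kolabnow.com")

open_conns = [
    c for c in psutil.net_connections(kind="tcp")
    if c.raddr and c.raddr.ip == imap_ip
    and c.raddr.port in (143, 993)          # plain IMAP and IMAPS
    and c.status == psutil.CONN_ESTABLISHED
]
print(f"established IMAP connections to {imap_ip}: {len(open_conns)}")
```

Running this repeatedly during a mail check would show whether the client really is piling up connections or whether the server is counting stale ones.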

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Slackware64-current
(available in About System)
KDE Plasma Version: 5.27.6
KDE Frameworks Version: 5.107.0
Qt Version: 5.15.10 (d2861d63)
Comment 1 Andreas Cord-Landwehr 2023-08-06 16:25:41 UTC
I had a slightly deeper look into this issue because I can also reproduce this behavior with the Kolabnow.com server:
Using the KIMAP_LOGFILE logging option when restarting Akonadi, I monitored the socket disconnects, and in every single case a "GETMETADATA (DEPTH infinity)" command is the last operation before the connection is lost. Here is one example log tail:

[...]
S: * OK Ok [ ANNOTATIONS 65536  ]
S: A000010 OK Completed [ READ-WRITE  ]
C: A000011 UID FETCH 1:26193 (FLAGS UID) (CHANGEDSINCE 39247)
S: * 24503 FETCH ( FLAGS (\Seen) UID 26164 MODSEQ (39273) )
S: * 24504 FETCH ( FLAGS (\Seen) UID 26165 MODSEQ (39271) )
S: * 24505 FETCH ( FLAGS (\Seen) UID 26166 MODSEQ (39272) )
S: * 24506 FETCH ( FLAGS (\Seen) UID 26167 MODSEQ (39274) )
S: * 24507 FETCH ( FLAGS (\Seen) UID 26168 MODSEQ (39270) )
S: * 24508 FETCH ( FLAGS (\Seen) UID 26169 MODSEQ (39269) )
S: * 24509 FETCH ( FLAGS (\Seen) UID 26170 MODSEQ (39267) )
S: * 24510 FETCH ( FLAGS (\Seen) UID 26171 MODSEQ (39268) )
S: * 24511 FETCH ( FLAGS (\Seen) UID 26172 MODSEQ (39266) )
S: * 24512 FETCH ( FLAGS (\Seen) UID 26173 MODSEQ (39276) )
S: * 24513 FETCH ( FLAGS (\Seen) UID 26174 MODSEQ (39260) )
S: * 24514 FETCH ( FLAGS (\Seen) UID 26175 MODSEQ (39264) )
S: * 24515 FETCH ( FLAGS (\Seen) UID 26176 MODSEQ (39275) )
S: * 24516 FETCH ( FLAGS (\Seen) UID 26177 MODSEQ (39265) )
S: * 24517 FETCH ( FLAGS (\Seen) UID 26178 MODSEQ (39257) )
S: * 24518 FETCH ( FLAGS (\Seen) UID 26179 MODSEQ (39263) )
S: * 24519 FETCH ( FLAGS (\Seen) UID 26180 MODSEQ (39261) )
S: * 24520 FETCH ( FLAGS (\Seen) UID 26181 MODSEQ (39258) )
S: * 24521 FETCH ( FLAGS (\Seen) UID 26182 MODSEQ (39259) )
S: * 24522 FETCH ( FLAGS (\Seen) UID 26183 MODSEQ (39256) )
S: * 24523 FETCH ( FLAGS (\Seen) UID 26184 MODSEQ (39262) )
S: * 24524 FETCH ( FLAGS (\Seen) UID 26185 MODSEQ (39255) )
S: * 24525 FETCH ( FLAGS (\Seen) UID 26186 MODSEQ (39250) )
S: * 24526 FETCH ( FLAGS (\Seen) UID 26187 MODSEQ (39251) )
S: * 24527 FETCH ( FLAGS (\Seen) UID 26188 MODSEQ (39252) )
S: * 24528 FETCH ( FLAGS (\Seen) UID 26189 MODSEQ (39253) )
S: * 24529 FETCH ( FLAGS (\Seen) UID 26190 MODSEQ (39248) )
S: * 24530 FETCH ( FLAGS (\Seen) UID 26191 MODSEQ (39254) )
S: * 24531 FETCH ( FLAGS (\Seen) UID 26192 MODSEQ (39249) )
S: A000011 OK Completed ( 0.001 sec )
C: A000012 GETMETADATA (DEPTH infinity) "KDE/Devel" (/shared)
X
Comment 2 Andreas Cord-Landwehr 2023-08-06 17:45:16 UTC
After some more debug output, essentially uncommenting the GOT/SENT debug lines in getmetadatajob.cpp and raising the kimap debug level to "debug", I am very sure that the Kolab server behaves in a way where some calls to "GETMETADATA (DEPTH infinity)" are never answered and thus the socket timeout is hit. What is quite interesting: within the same IMAP session the exact same call, e.g.

C: A000008 GETMETADATA (DEPTH infinity) "KDE/Community" (/shared)
S: * METADATA KDE/Community ( /shared/vendor/cmu/cyrus-imapd/server imapb036.mykolab.com )
S: A000008 OK Completed

at one point succeeds and a few seconds later

C: A000015 GETMETADATA (DEPTH infinity) "KDE/Community" (/shared)

is never answered (I modified getmetadatajob so that the timeout is set to infinity, and now, after waiting 15 minutes, I am very sure that the server will never answer).
@heinz, do you see the same behavior?
With my limited IMAP knowledge, as far as I can see there is not much that can be done on the client (Akonadi) side other than some crude hack that retries the getmetadatajob once a socket timeout is reached instead of tearing down the full connection.
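
An independent reproduction sketch, not from the report itself: send the same raw GETMETADATA command twice in one session and check whether the second call ever gets a tagged reply. The host is from the report; the folder name and credentials are placeholders:

```python
import imaplib

conn = imaplib.IMAP4_SSL("imap.kolabnow.com", 993)
conn.login("user@example.com", "secret")   # placeholder credentials
conn.socket().settimeout(30)               # roughly the socket timeout seen in the log

for attempt in (1, 2):
    try:
        # xatom() sends an arbitrary extension command; GETMETADATA is RFC 5464.
        typ, data = conn.xatom("GETMETADATA", '(DEPTH infinity) "INBOX" (/shared)')
        print(f"attempt {attempt}: {typ} {data}")
    except (OSError, imaplib.IMAP4.abort):
        # No tagged reply within the timeout -- the same hang described above;
        # the session is unusable after this, so stop here.
        print(f"attempt {attempt}: no answer within 30 s")
        break
else:
    conn.logout()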
Comment 3 Andreas Cord-Landwehr 2023-08-07 18:42:11 UTC
OK, I found a horrible workaround that avoids triggering this error: I made a local patch that filters the "METADATA" capability out of the announced IMAP server capabilities (before handing them over to KMail), and it works well in the sense that requests no longer wait forever for a server answer. The downside, obviously, is that this IMAP feature is no longer functional.
The important insight, though, is that this further pins the problem to the server-side handling of GETMETADATA requests.
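
Conceptually, the workaround makes the client behave as if the server never advertised METADATA. A client that gates GETMETADATA on the announced capability list skips the call entirely; a minimal illustration of that guard (not the actual KIMAP patch; placeholder credentials and folder name):

```python
import imaplib

conn = imaplib.IMAP4_SSL("imap.kolabnow.com", 993)
conn.login("user@example.com", "secret")   # placeholder credentials

# imaplib exposes the capabilities announced at connect time; only issue
# GETMETADATA (RFC 5464) when the server claims to support it.
if "METADATA" in conn.capabilities:
    typ, data = conn.xatom("GETMETADATA", '(DEPTH infinity) "INBOX" (/shared)')
    print(typ, data)
else:
    print("METADATA not advertised; skipping metadata sync")
conn.logout()
```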
Comment 4 Andreas Cord-Landwehr 2024-02-04 11:52:15 UTC
@heinz, yesterday at FOSDEM I ran into Christoph Molenkopf. On the Kolab side they were able to reproduce a crash in their IMAP servers, triggered by the specific command sequence sent by Kontact/Akonadi, and apparently they could fix it. On my side I no longer see this issue and thus would close the defect. Please reopen if you see it differently.
Comment 5 Heinz Wiesinger 2024-02-04 13:39:59 UTC
I can confirm. It started working a couple of weeks ago; I just hadn't gotten around to updating the relevant bug reports yet.