Bug 451620 - akonadictl; sqlite: Error moving orphan items to collection 242 : Expression tree is too large (maximum depth 1000)
Summary: akonadictl; sqlite: Error moving orphan items to collection 242 : Expression ...
Status: REPORTED
Alias: None
Product: Akonadi
Classification: Frameworks and Libraries
Component: general (show other bugs)
Version: 5.24.5
Platform: Gentoo Packages Linux
: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-03-17 12:54 UTC by Erik Quaeghebeur
Modified: 2024-11-01 07:24 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Erik Quaeghebeur 2022-03-17 12:54:57 UTC
SUMMARY
I use akonadi with the sqlite backend. Running akonadictl results in an error that seems to result from akonadi giving sqlite a query that is too long:

---
$ akonadictl fsck
[…]
Checking collection tree consistency...
Looking for items not belonging to a valid collection...
Found 110689 orphan items.
Error moving orphan items to collection 242 : Expression tree is too large (maximum depth 1000) Kon statement niet uitvoeren
[…]
---
("Kon statement niet uitvoeren" means "Could not execute statement")

STEPS TO REPRODUCE
1. run akonadictl fsck again

OBSERVED RESULT
Error moving orphan items to collection 242 : Expression tree is too large (maximum depth 1000) Kon statement niet uitvoeren

EXPECTED RESULT
orphans are cleaned up

SOFTWARE/OS VERSIONS
KDE Plasma Version: 5.23.5
KDE Frameworks Version: 5.90.0
Qt Version: 5.15.2 (with https://community.kde.org/Qt5PatchCollection)

ADDITIONAL INFORMATION
I think this is sqlite-specific, given some search results on the web
Comment 1 Erik Quaeghebeur 2022-03-20 12:37:32 UTC
I have investigated a bit further. Using a GUI for sqlite (sqlitebrowser), I familiarized myself a bit with the database and then looked into what was going on here:

---
select distinct "collectionId" as "id" from "PimItemTable"
except
select distinct "id" from "CollectionTable"
order by "id";
---
This returned 52 collections that where referenced by items, but apparently do not exist anymore.

---
select "id" as "itemId", "collectionId" from "PimItemTable"
where "collectionId" not in (select distinct "id" from "CollectionTable")
order by "collectionId";
---
This returned 110689 items, the same as reported by akonadictl, so indeed these were the orphans mentioned.

---
delete from "PimItemTable"
where "collectionId" not in (select distinct "id" from "CollectionTable");
---
To fix the issue, I just removed them rather than creating a lost+found(?) collection (with id 242?) and updating the collectionId of the items to 242. After vacuuming (‘compress database’ under ‘Extra’ menu in sqlitebrowser), my database shrunk to less than half the size it had before (600+ to ~250 MB).

So what seems to go wrong is that in the code *the list of orphans* resulting from a first query is used to build a second query to change their collectionId. Because this list is absurdly long, sqlite bails out. A possible fix would be to first find out if there are orphan collections (my first query listed) and if so, create as needed the lost+found collection and update the corresponding items in the way done by my last query above.

Next, I dived into the code. I think the function in question can be found at https://invent.kde.org/pim/akonadi/-/blob/master/src/server/storagejanitor.cpp#L277. There, one can indeed see that a list of orphans is created on the C++ side (in the variable imapIds) and subsequently used to create the query to clean up the orphans. While the Qt functions for creating queries are mostly gibberish to me, what I can deduce, is that this is done in a roundabout, inefficient way, as compared to what my queries above do. Is there a reason for this?

Finally, a point that must be addressed is why such orphans can even exist, as in the database schema for the creation of PimItemTable, I see:

…
collectionId BIGINT
…
CONSTRAINT PimItemTablecollectionId_Collectionid_fk FOREIGN KEY (collectionId) REFERENCES CollectionTable(id) ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
…

which should in principle cascade the deletion of the collection to the deletion of items within that collection. I do not know what could have gone wrong, but it is worrying.

N.B.: The above column definition+constraint can more compactly be done as

collectionId BIGINT REFERENCES CollectionTable(id) ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED

using the concept of column constraint: https://www.sqlite.org/syntax/column-constraint.html.
Comment 2 Erik Quaeghebeur 2024-04-13 14:03:30 UTC
This is still an issue in latest KDE 5 versions. (I again had to apply my manual SQL fix.)

I believe this issue should be taken seriously. It points to a a sometimes very inefficient approach to using SQL in Akonadi's code. Namely, that things which are easy/efficient to do in SQL are instead done inefficiently in C++.
Comment 3 Christophe Marin 2024-04-17 06:59:53 UTC
(In reply to Erik Quaeghebeur from comment #2)
> This is still an issue in latest KDE 5 versions. (I again had to apply my
> manual SQL fix.)

If you mean akonadi < 24.02, it won't get the fix. This branch is closed, you need to backport the change manually.
Comment 4 Erik Quaeghebeur 2024-04-17 08:35:57 UTC
(In reply to Christophe Marin from comment #3)
> (In reply to Erik Quaeghebeur from comment #2)
> > This is still an issue in latest KDE 5 versions. (I again had to apply my manual SQL fix.)
> 
> If you mean akonadi < 24.02, it won't get the fix. This branch is closed,
> you need to backport the change manually.
Ah, nice, it seems the great commit https://invent.kde.org/pim/akonadi/-/commit/fcc37ce297df718430f83f7feb669573c9b135f2?page=3#995e689d409482037d49b6b0893848ff42e9f64f likely fixed the issue. (The function is now at https://invent.kde.org/pim/akonadi/-/blob/master/src/server/storagejanitor.cpp#L301.)

AFAIU, It does seem though that the fix still passes through a C++ data structure and uses a two-step approach. Namely, first a list of orphans is queried, it is copied to a QList (https://invent.kde.org/pim/akonadi/-/blob/master/src/server/storagejanitor.cpp#L325), which is fed back into a second query that sets the orphaned items' collectionId to the lost+found one. This might be done more efficiently by directly selecting anything that qualifies as an orphan item using SQL directly in the WHERE of the UPDATE statement (cf. https://www.sqlite.org/lang_update.html, like in the DELETE statement in my Comment #1).
Comment 5 Andreas Sturmlechner 2024-07-15 10:58:29 UTC
Erik, did you have a chance to test this with akonadi-24.05.2 and can we consider this fixed?
Comment 6 Erik Quaeghebeur 2024-10-13 13:18:22 UTC
(In reply to Andreas Sturmlechner from comment #5)
> Erik, did you have a chance to test this with akonadi-24.05.2 and can we
> consider this fixed?

No, it is not fixed, I hit it again with 24.08.1.

I again delved into the code. The function is at https://invent.kde.org/pim/akonadi/-/blob/master/src/server/storagejanitor.cpp#L300. There, as before, a list of orphan items is created, at https://invent.kde.org/pim/akonadi/-/blob/master/src/server/storagejanitor.cpp#L309. Then this list is used to create a second query in which the possibly too large list is injected, at https://invent.kde.org/pim/akonadi/-/blob/master/src/server/storagejanitor.cpp#L322.

The right approach is, AFAIU:
1. create a query to count how many orphans there are;
2.  there are orphans, to create a second query again selecting the orphans and assigns them to the/a lost+found collection.

Do not save the orphans in a list. SQL engines will be far more efficient by using direct queries whenever you can and not passing through one's own data structures.