Bug 389607 - akonadictl fsck: Show more helpful messages
Status: CONFIRMED
Alias: None
Product: Akonadi
Classification: Frameworks and Libraries
Component: general
Version: unspecified
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
Reported: 2018-01-29 16:38 UTC by Martin Steigerwald
Modified: 2019-04-27 18:00 UTC

Description Martin Steigerwald 2018-01-29 16:38:46 UTC
Currently the error messages of akonadictl fsck do not really indicate the issue at hand to a user who does not understand the internals of Akonadi.

In addition to optionally showing some metadata about the affected items, so the user can get an idea of how important they are, I think it's also important to reword the messages so the user can understand them more easily.

Of course, even better would be if akonadictl fsck could fix the issues it finds.



Some suggestions to improve the situation:

Item "1289323" has no RID. =>

Item "1289323" only stored in database, not locally in XYZ resource.

XYZ should be the name of the affected resource.


Item "1360766" has RID and is dirty. =>

Item "1360766" has changes that are only stored in the database, not locally in XYZ resource



Checking qt-kde-ml
Found duplicates 1424447870.R861.merkaba
Found duplicates 1424453278.R177.merkaba
Found duplicates 1424453279.R44.merkaba
Found duplicates 1424455081.R806.merkaba
Found duplicates 1424455081.R860.merkaba
Found duplicates 1435497488.R146.merkaba
Found duplicates 1435497488.R303.merkaba:2,S
Found duplicates 1435499290.R326.merkaba:2,S
Found duplicates 1435499291.R292.merkaba
Found duplicates 1435499291.R683.merkaba
Found duplicates 1435499291.R865.merkaba
Found duplicates 1435499291.R886.merkaba:2,S

=> Tell exactly which files are identical, to give the user the chance to remove duplicate files. It is not obvious from the above:

~/.local/share/local-mail/[…]/.Debian.directory/qt-kde-ml/new> ls -l  1424447870.R861.merkaba 1424453278.R177.merkaba 1424453279.R44.merkaba 1424455081.R806.merkaba 1424455081.R806.merkaba 1424455081.R806.merkaba 1424455081.R860.merkaba 1435497488.R146.merkaba 1435497488.R303.merkaba:2,S 1435499290.R326.merkaba:2,S 1435499291.R292.merkaba 1435499291.R683.merkaba 1435499291.R865.merkaba 1435499291.R886.merkaba:2,S
-rw-r--r-- 1 martin martin 6560 Feb 20  2015 1424447870.R861.merkaba
-rw-r--r-- 1 martin martin 6085 Feb 20  2015 1424453278.R177.merkaba
-rw-r--r-- 1 martin martin 4232 Feb 20  2015 1424453279.R44.merkaba
-rw-r--r-- 1 martin martin 4175 Feb 20  2015 1424455081.R806.merkaba
-rw-r--r-- 1 martin martin 4175 Feb 20  2015 1424455081.R806.merkaba
-rw-r--r-- 1 martin martin 4175 Feb 20  2015 1424455081.R806.merkaba
-rw-r--r-- 1 martin martin 4348 Feb 20  2015 1424455081.R860.merkaba
-rw-r--r-- 1 martin martin 4309 Jun 28  2015 1435497488.R146.merkaba
-rw-r--r-- 1 martin martin 4327 Jun 28  2015 1435497488.R303.merkaba:2,S
-rw-r--r-- 1 martin martin 4351 Jun 28  2015 1435499290.R326.merkaba:2,S
-rw-r--r-- 1 martin martin 4314 Jun 28  2015 1435499291.R292.merkaba
-rw-r--r-- 1 martin martin 8027 Jun 28  2015 1435499291.R683.merkaba
-rw-r--r-- 1 martin martin 4349 Jun 28  2015 1435499291.R865.merkaba
-rw-r--r-- 1 martin martin 7910 Jun 28  2015 1435499291.R886.merkaba:2,S
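
A minimal sketch of the check done by eye above, assuming the goal is to find files with byte-identical content in one maildir directory. This is not what akonadictl fsck does; the function name `find_identical_files` is made up for illustration, and comparing file sizes with ls -l can only hint at what a content hash confirms.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_identical_files(directory):
    """Group regular files in `directory` by the SHA-256 of their content.

    Returns a sorted list of groups; each group is a sorted list of file
    names whose content is byte-for-byte identical. Files with unique
    content are not reported.
    """
    by_digest = defaultdict(list)
    for path in Path(directory).iterdir():
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.name)
    return sorted(sorted(names) for names in by_digest.values() if len(names) > 1)
```

Calling `find_identical_files` on the `new` directory shown above would list each group of identical files on its own, which is the information the "Found duplicates" lines do not give.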
Comment 1 Martin Steigerwald 2018-01-29 16:40:00 UTC
That is with Akonadi 17.08.3
Comment 2 Daniel Vrátil 2018-01-29 20:14:33 UTC
> Of course, even better would be if akonadictl fsck could fix the issues it finds.

Yes, but not everything is an issue.

> Some suggestions to improve the situation:
> Item "1289323" has no RID. =>
>
> Item "1289323" only stored in database, not locally in XYZ resource.
>
> XYZ should be the name of the affected resource.

There are two problems:
1) The item can be owned by an online Resource (e.g. IMAP) and the Item may have no RID simply because the Resource wasn't online yet. In such a case the situation above is not an error.

2) Your message is not precise. "locally in XYZ resource" is wrong, again in the case of an online resource. I don't want fsck to contain misleading messages. In its own way, the current message (Item has no RID) is perfect, because it describes the situation precisely. I think the best way would be to just add a link to the Wiki where the issue is described in detail.

> Item "1360766" has RID and is dirty. =>
> 
> Item "1360766" has changes that are only stored in the database, not locally in XYZ resource

Again, the case can simply be because the change couldn't be replayed by an online Resource yet because it's offline. Same solution as above would apply.

> => Tell exactly which files are identical, to give the user the chance to remove duplicate files. It is not obvious from the above:

The "duplicate" refers to duplicate RID, meaning there are multiple Items stored in the database that belong to the same Collection and have the same RID. That in no way implies there's a duplicate in the backing storage (maildir or IMAP) and in most cases there's not even any actual file to refer to (e.g. IMAP).
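
To illustrate what "duplicate RID" means here, a rough sketch follows. It is not the actual fsck implementation: it uses an in-memory SQLite table as a stand-in, borrows only the names `pimitemtable`, `collectionId` and `remoteId` from the `describe pimitemtable` output quoted in comment 8, and the helper names and demo rows are made up.

```python
import sqlite3


def find_rid_duplicates(conn):
    """Return (collectionId, remoteId, count) for every RID that is used
    by more than one item within the same collection."""
    return conn.execute(
        """
        SELECT collectionId, remoteId, COUNT(*)
        FROM pimitemtable
        WHERE remoteId IS NOT NULL
        GROUP BY collectionId, remoteId
        HAVING COUNT(*) > 1
        ORDER BY collectionId, remoteId
        """
    ).fetchall()


def make_duplicate_demo_db():
    """Build a tiny, simplified stand-in table (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pimitemtable ("
        "id INTEGER PRIMARY KEY, collectionId INTEGER, remoteId TEXT)"
    )
    conn.executemany(
        "INSERT INTO pimitemtable VALUES (?, ?, ?)",
        [
            (1, 10, "1424447870.R861.merkaba"),
            (2, 10, "1424447870.R861.merkaba"),  # same RID, same collection
            (3, 10, "1424453278.R177.merkaba"),
            (4, 11, "1424447870.R861.merkaba"),  # same RID, other collection
            (5, 10, None),                       # no RID at all
        ],
    )
    return conn
```

In the demo data only items 1 and 2 count as duplicates: they share both the collection and the RID. Nothing in the query looks at files on disk, which matches the point that a duplicate RID says nothing about the backing storage.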
Comment 3 Martin Steigerwald 2018-01-29 20:36:06 UTC
Thank you Dan, for your prompt responses. So I still misunderstood some of the error messages despite your explanations. I think this hints at the underlying issue: there is an akonadictl fsck, it often does not really fix the issue the user sees, yet it reports some kind of probable inconsistencies and leaves the user without anything sensible to do about them. So I wonder about its purpose. Aside from the migration work it does, is it a developer tool? Whatever, no need to discuss it here. I reported what I think I understood from your answer on the mailing list. If there are different actions to be taken than the ones I suggest, feel free to close some of the reports I made. I just wanted to help move this forward.
Comment 4 Richard Bos 2018-08-05 09:04:01 UTC
I agree with this.

In my case I get:

# akonadictl --verbose fsck        
Looking for resources in the DB not matching a configured resource...
Looking for collections not belonging to a valid resource...
Checking collection tree consistency...
Looking for items not belonging to a valid collection...
Looking for item parts not belonging to a valid item...
Looking for item flags not belonging to a valid item...
Looking for overlapping external parts...
Verifying external parts...
Found 165 external files.
Found 165 external parts.
Found no unreferenced external files.
Checking size treshold changes...
Found 0 parts to be moved to external files
Found 0 parts to be moved to database
Looking for dirty objects...
Collection "Search" (id: 1) has no RID.
Collection "OpenInvitations" (id: 1416) has no RID.
Collection "DeclinedInvitations" (id: 1417) has no RID.
Found 3 collections without RID.
Item "243448" has no RID.
Item "243531" has no RID.
Item "243813" has no RID.
Item "243868" has no RID.
Item "243869" has no RID.
Item "243986" has no RID.
Item "244023" has no RID.
Item "244026" has no RID.
Item "244111" has no RID.
Item "244176" has no RID.
Item "244379" has no RID.
Item "244393" has no RID.
Item "244399" has no RID.
Item "319051" has no RID.
Item "328336" has no RID.
Item "328379" has no RID.
Found 16 items without RID.
Item "232461" has RID and is dirty.
Item "239145" has RID and is dirty.
Item "243363" has RID and is dirty.
Item "328334" has RID and is dirty.
Found 4 dirty items.
Looking for rid-duplicates not matching the content mime-type of the parent collection
Checking Notes
Checking Notities
Checking Search
Checking akonadi_davgroupware_resource_1
Checking akonadi_vcarddir_resource_10
Checking akonadi_vcarddir_resource_11
Checking akonadi_vcarddir_resource_12
Checking akonadi_vcarddir_resource_8
Checking akonadi_vcarddir_resource_9
Checking DeclinedInvitations
Checking OpenInvitations
Checking https://<server>/remote.php/caldav/calendars/<user>/contact_birthdays/
Checking https://<server>/remote.php/caldav/calendars/<user>/default calendar/
Checking https://<server>/remote.php/carddav/addressbooks/<user>/default/

___checking of all folders___

Migrating parts to new cache hierarchy...

CRASH of akonadi!

This happens now every time.  It worked before.

Now how to make akonadictl fsck work again???!

BTW: creating a bug report (with stack backtrace) does not work :(

Would be nice / good to give this attention at Akademy
Comment 5 Martin Steigerwald 2018-08-05 09:18:23 UTC
Dear Richard.

(In reply to Richard Bos from comment #4)
> I agree with this.

Thanks for confirming the report.

[…]
> Migrating parts to new cache hierarchy...
> 
> CRASH of akonadi!
> 
> This happens now every time.  It worked before.
> 
> Now how to make akonadictl fsck work again???!

However please do not use this bug report to report a crash of akonadictl fsck.

Also please do not use this bug report for user support questions.

Why? Because this was the reason for many bugs having so many comments that they are not useful for a developer anymore. Please help to keep this bug report clean, simple and about *one* issue, please help to keep it actionable. Thank you.

Please file a different bug report.

> BTW: creating a bug report (with stack backtrace) does not work :(

Then please use kdepim-users or another appropriate support channel to get help with that. Please refrain from further comments that are not related to this bug's issue.
Comment 6 Nick 2019-04-24 18:47:16 UTC
Similar issues here.
akonadi fsck / vacuum report many issues and messages but do not give the user any hints on how to solve them.

What about having an option akonadictl fix [ noRID | dirtyRID ...etc ] ?
Such a tool would reduce the number of bugs reported by akonadictl fsck.

Also, what about telling the users:
1) procedures for fixing issues
2) procedures for getting rid of unnecessary messages and/or
3) procedures for cleaning up the akonadi db of "dead" records
and above all
4) how the user should access the akonadi mysql db, which is handled [ start/stop ] by akonadi.

See bug 406856, recently opened.

Thanks.
Nick
Comment 7 Martin Steigerwald 2019-04-27 08:47:25 UTC
Just to confirm that this still happens with Akonadi 5.9.3 (KDEPIM/Akonadi 18.08).

Here are some examples of an akonadictl fsck run after switching to PostgreSQL database backend just a few weeks ago:

Looking for resources in the DB not matching a configured resource...
Looking for collections not belonging to a valid resource...
Checking collection tree consistency...
Looking for items not belonging to a valid collection...
Looking for item parts not belonging to a valid item...
Looking for item flags not belonging to a valid item...
Looking for overlapping external parts...
Verifying external parts...
Found 8951 external files.
Found 2069 external parts.
Found unreferenced external file: /home/martin/.local/share/akonadi/file_db_data/27/231727_r1
Found unreferenced external file: /home/martin/.local/share/akonadi/file_db_data/37/85637_r2
Found unreferenced external file: /home/martin/.local/share/akonadi/file_db_data/69/230869_r1
Found unreferenced external file: /home/martin/.local/share/akonadi/file_db_data/44/216944_r1
[… >6000 more of those …]
Moved 6882 unreferenced files to lost+found.
Checking size treshold changes...
Found 0 parts to be moved to external files
Found 0 parts to be moved to database
Looking for dirty objects...
Collection "Search" (id: 1) has no RID.
Collection "OpenInvitations" (id: 360) has no RID.
Collection "DeclinedInvitations" (id: 361) has no RID.
Found 3 collections without RID.
Item "1144154" in collection "266" has no RID.
Item "1144155" in collection "256" has no RID.
Item "1144156" in collection "256" has no RID.
Item "1144157" in collection "256" has no RID.
Item "1144158" in collection "256" has no RID.
Item "1144159" in collection "256" has no RID.
Item "1144160" in collection "152" has no RID.
Item "1144161" in collection "180" has no RID.
Item "1144163" in collection "266" has no RID.
Item "1144164" in collection "256" has no RID.
Item "1144165" in collection "256" has no RID.
[… more of those …]
Found 43 items without RID.
Found 0 dirty items.
Looking for rid-duplicates not matching the content mime-type of the parent collection
Checking Search
Checking Notizen
[…]


Related bugs:
- Bug 406958 - Unreferenced files in file_db_data
- Bug 406087 - akonadictl fsck incorrectly reports success when file_lost+found folder is absent 
- Bug 406856 - Found 3734 items without RID.; Item "30024" has RID and is dirty.
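
Until fsck itself offers friendlier output, a user can at least condense a long run into a summary. The sketch below assumes only the three message shapes quoted in this report (with and without the `in collection "…"` part); other Akonadi versions may word them differently, and the function name `summarize_fsck` is made up.

```python
import re
from collections import Counter

# Patterns for the message shapes quoted in this report.
COLLECTION_NO_RID = re.compile(r'^Collection "(.+)" \(id: (\d+)\) has no RID\.$')
ITEM_NO_RID = re.compile(r'^Item "(\d+)"(?: in collection "(\d+)")? has no RID\.$')
ITEM_DIRTY = re.compile(r'^Item "(\d+)" has RID and is dirty\.$')


def summarize_fsck(lines):
    """Collect the ids mentioned in fsck output; unknown lines are ignored."""
    summary = {
        "collections_without_rid": [],      # (name, id) pairs
        "items_without_rid": [],            # item ids
        "dirty_items": [],                  # item ids
        "no_rid_per_collection": Counter(), # collection id -> item count
    }
    for raw in lines:
        line = raw.strip()
        match = COLLECTION_NO_RID.match(line)
        if match:
            summary["collections_without_rid"].append(
                (match.group(1), int(match.group(2)))
            )
            continue
        match = ITEM_NO_RID.match(line)
        if match:
            summary["items_without_rid"].append(int(match.group(1)))
            if match.group(2) is not None:
                summary["no_rid_per_collection"][int(match.group(2))] += 1
            continue
        match = ITEM_DIRTY.match(line)
        if match:
            summary["dirty_items"].append(int(match.group(1)))
    return summary
```

Feeding it the lines of a saved fsck log gives per-collection counts, which makes it easier to see whether the items without RID cluster in one folder.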
Comment 8 Nick 2019-04-27 18:00:40 UTC
Hello Martin,
Thank you for your email and your comments.
I agree that some emails may be lost. I took the time to try to find some correlations with what the user sees.
-----
First of all, I converted to IMAP as per IP provider's tech support.
So the database contains records from the "old" [maildir?] system and the new "IMAP" system. Also I had to move the database from my old machine to a new one [in fact the entire /home from the old machine was transferred to the new machine].
So the same userid had access to the emails immediately after openSUSE Leap 15.0 was installed on the new machine.
I do not know if these changes have any impact [the hostname was changed].
-----

Based on the way akonadictl presents its results, it (akonadictl) is NOT a user-friendly tool. On the contrary, the current reporting scares the user with messages such as "no RID" or "dirty" or "records removed from database and moved to lost-and-found", without at least asking the user whether [s]he agrees with that move.

I looked closely at pimitemtable and it seems that the column remoteId is nullable [i.e. accepts an undefined/unidentified/unknown value].

I believe that the root of all these issues is not the database server but akonadi, because switching between MariaDB and PostgreSQL does not stop the issues from happening. These problems in the kmail database are inherited from the time when kmail was using sqlite.

In my humble opinion, the database server does what it is told/commanded by the interface between the user and the database, that is, akonadi.

It is strange that such an important item as remoteId was translated into a data model that accepts it as nullable.
That allows any event that is not controlled by the code to insert/update a record with remoteId NULL, and from this come all the users' complaints.

This assumption does not match the reality. I would like to know how it comes that receiving an email, which has a sender and a receiver, is recorded in the database with a remoteId of NULL.

I provide below some info collected from my machine before cleaning up the records with RID NULL in pimitemtable.
You can notice that the record has a valid collectionId but an invalid remoteId. I checked that many other emails with a valid remoteId point to the same collectionId.

I tend to agree with you that ..... 
``
However it also could be that Akonadi just messes up big time like in storing the items into remote storage, but somehow failing to store the remote ID or whatever. 
`` 

I believe that KDE development should have a look at whether the kmail database data model matches the technical requirements and/or whether akonadi handles the email requests properly.

Also it is very difficult for the user to find any item that is wrong using kmail, without digging in database tables. That's the role of akonadictl and/or any other USER ORIENTED tools. I believe that akonadiconsole is too powerful for the casual user. At the same time akonadi & kmail seem to keep their cards close to the vest, making it easy for developers to point to user errors.

The users have been asking and asking and asking for too long, and nothing has been done.

Thank you,
Nick
========================= info from my machine =========================
--------------
describe pimitemtable
--------------

Field	Type	Null	Key	Default	Extra
id	bigint(20)	NO	PRI	NULL	auto_increment
rev	int(11)	NO		0	
remoteId	varbinary(255)	YES	MUL	NULL	
remoteRevision	varbinary(255)	YES		NULL	
gid	varbinary(255)	YES	MUL	NULL	
collectionId	bigint(20)	YES	MUL	NULL	
mimeTypeId	bigint(20)	YES	MUL	NULL	
datetime	timestamp	NO		current_timestamp()	
atime	timestamp	NO		current_timestamp()	
dirty	tinyint(1)	YES		NULL	
size	bigint(20)	NO		0	
--------------
select count(*) from pimitemtable
--------------

count(*)
8691

Record without RID as reported by akonadictl fsck [ 38264 item ... RID ] 
38264	5	NULL	NULL	NULL	123	4	2019-04-05 01:12:16	2019-04-05 01:12:16	1	81206

collectionId 123 is one of the directories under "Local Folders".



--------------
describe pimitemflagrelation
--------------

Field	Type	Null	Key	Default	Extra
PimItem_id	bigint(20)	NO	PRI	NULL	
Flag_id	bigint(20)	NO	PRI	NULL	
--------------
select count(*) from pimitemflagrelation
--------------

count(*)
13859

Records associated with id 38264 [ from pimitemtable ]
38264	1
38264	5
38264	12
38264	13

--------------
describe parttable
--------------

Field	Type	Null	Key	Default	Extra
id	bigint(20)	NO	PRI	NULL	auto_increment
pimItemId	bigint(20)	NO	MUL	NULL	
partTypeId	bigint(20)	NO	MUL	NULL	
data	longblob	YES		NULL	
datasize	bigint(20)	NO		NULL	
version	int(11)	YES		0	
storage	tinyint(4)	YES		0	
--------------
select count(*) from parttable
--------------

count(*)
30569


Records associated with id 38264 [ from pimitemtable ]

131077	38264	5	131077_r2	80258	1	1
131078	38264	6	X-Virus-Flag: no\nX-Virus-Flag: no\nFrom: ......\nTo: .......\nSubject: Fwd: Your ...... .	497	1	0
131079	38264	7	\0\0\0\0\0....	451	2	0
131080	38264	8	\0\0\0*\0n\0d\0o\0r\0d....	104	0	0
131081	38264	9	immediately	11	0	0
131082	38264	10	\0\0\0˰\0\0\0˰\0\0\0\0˰\0\0\0˰\02\0\0\0˰\0\0\0\0\0\0\0�X	32	0	0
131083	38264	11	moveTo5	7	0	0
131084	38264	12	97585928	8	0	0
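
The manual lookup above (find the items with NULL remoteId, then count their parts) can be sketched as a single read-only query. This is only an illustration under stated assumptions: it runs against an in-memory SQLite stand-in, not the real Akonadi MySQL/PostgreSQL database, the tables are reduced to the columns used here, the helper names are made up, and the second demo item with its RID is invented. The values for item 38264 are the ones quoted above.

```python
import sqlite3


def items_without_rid(conn):
    """Return (id, collectionId, dirty, size, part_count) for every item
    whose remoteId is NULL. The query only reads; it changes nothing."""
    return conn.execute(
        """
        SELECT p.id, p.collectionId, p.dirty, p.size, COUNT(t.id)
        FROM pimitemtable AS p
        LEFT JOIN parttable AS t ON t.pimItemId = p.id
        WHERE p.remoteId IS NULL
        GROUP BY p.id
        ORDER BY p.id
        """
    ).fetchall()


def make_null_rid_demo_db():
    """Build simplified stand-ins for pimitemtable and parttable."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE pimitemtable (
            id INTEGER PRIMARY KEY, remoteId TEXT,
            collectionId INTEGER, dirty INTEGER, size INTEGER);
        CREATE TABLE parttable (
            id INTEGER PRIMARY KEY, pimItemId INTEGER NOT NULL);
        """
    )
    # Item quoted in this comment: no RID, collection 123, dirty, 81206 bytes.
    conn.execute("INSERT INTO pimitemtable VALUES (38264, NULL, 123, 1, 81206)")
    # Invented item with a RID, to show that it is not reported.
    conn.execute("INSERT INTO pimitemtable VALUES (1, 'made-up-rid', 123, 0, 100)")
    # Part ids 131077..131084 as listed above.
    conn.executemany(
        "INSERT INTO parttable VALUES (?, 38264)",
        [(part_id,) for part_id in range(131077, 131085)],
    )
    return conn
```

A report built this way would let a tool show the collection and size next to each item id, instead of the bare id that fsck prints.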