Bug 421946 - Baloo indexing anomalies
Summary: Baloo indexing anomalies
Status: RESOLVED NOT A BUG
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: general
Version: 5.68.0
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Stefan Brüns
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-23 02:10 UTC by Scott
Modified: 2023-07-07 17:23 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Terminal output of balooctl monitor (572.99 KB, text/plain)
2020-05-23 02:10 UTC, Scott

Description Scott 2020-05-23 02:10:41 UTC
Created attachment 128703
Terminal output of balooctl monitor

SUMMARY
The baloo database reports a higher number of entries than there are files on the disks.

STEPS TO REPRODUCE
1. balooctl monitor
2. balooctl purge
3. 

OBSERVED RESULT
Please see the attached output of balooctl monitor, showing all files indexed immediately after a balooctl purge command is issued.

EXPECTED RESULT
That the result of balooctl status (5,778) would be identical to the number of items displayed in balooctl monitor (5,317) and to the total number of files on the disks as reported by Dolphin > Properties (5,316). The variance of 1 is discussed below.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
KDE Plasma Version: 5.18.5
KDE Frameworks Version: 5.68.0
Qt Version: 5.12.8
Baloo Version: 5.68.0 

ADDITIONAL INFORMATION
1/ I have performed a full reconciliation of the balooctl monitor output (attached) against the entire database and found only 1 entry which, based on the entries in baloofilerc, should not have been indexed:
Indexing: /media/data/disk01/snapraid.content.lock: Ok
2/ In addition to the above reconciliation I have reconciled those with the Plex database used to play the files and it reconciles as well.
3/ Dolphin Properties of the entire array returns 5,316 files. The difference of 1 is the above file (/media/data/disk01/snapraid.content.lock), which is 0 bytes.
4/ I am not aware of how to reconcile what the database actually contains with anything, because I cannot find a way to see the database contents, so I can't account for the 8% difference (462 files).
5/ The entire array (11 disks), including hidden files, contains only media files, apart from a lost & found folder on each disk, 1 trash folder, and a content file on every 2nd disk. So there just are not as many files on the disk array as balooctl status reports as indexed in the baloo database.
Comment 1 Stefan Brüns 2020-05-24 00:43:34 UTC
Dolphin counts hidden files.

Baloo does not index hidden files by default.

It's your job to find out *which* files are not indexed. Just stating that the numbers are not what you expect them to be is pointless.

*Iff* there is a file which should be indexed but is not, come back.
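
One way to find out which files differ, rather than only how many, is to compare the paths in a saved balooctl monitor log against the files actually on disk. The following is a minimal sketch, not baloo's own logic. It assumes the log lines have the form quoted in the description ("Indexing: <path>: Ok"), it treats "hidden" as "name starts with a dot", and it ignores any exclude filters set in baloofilerc. The function names are hypothetical.

```python
import os
import re

# Line format taken from the monitor output quoted in the description:
#   Indexing: /media/data/disk01/snapraid.content.lock: Ok
# Lines with any other shape are skipped (assumption).
INDEXING_LINE = re.compile(r"^Indexing: (?P<path>.+): Ok\s*$")


def indexed_paths(monitor_lines):
    """Collect the paths reported as indexed in a monitor log."""
    paths = set()
    for line in monitor_lines:
        match = INDEXING_LINE.match(line)
        if match:
            paths.add(match.group("path"))
    return paths


def visible_files(root):
    """Collect non-hidden files under root, skipping hidden folders."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune dot-folders in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if not name.startswith("."):
                found.add(os.path.join(dirpath, name))
    return found


def reconcile(monitor_lines, root):
    """Return (on disk but not in log, in log but not on disk)."""
    indexed = indexed_paths(monitor_lines)
    on_disk = visible_files(root)
    return sorted(on_disk - indexed), sorted(indexed - on_disk)
```

Calling reconcile() with the lines of the attached log and a root such as /media/data would give two lists of concrete paths to report, instead of a bare count.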
Comment 2 tagwerk19 2020-05-24 08:03:44 UTC
If I was to make a guess...

The result from "balooctl status" includes the number of directories looked at whereas the stream from "balooctl monitor" gives you the files opened and indexed.

Try creating a new empty folder, purge and reindex and see if the "balooctl status" goes up one.
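
The guess can also be checked without purging, by counting folders and files separately. In the report, balooctl status shows 5,778 and balooctl monitor shows 5,317, a gap of 461, so if the guess holds, the folder count should land close to that. This is a rough sketch only: it assumes hidden entries are those whose names start with a dot, it does not apply baloofilerc filters, and whether baloo also counts the top-level folder itself is not known here, so the root is left out of the count. The function name is hypothetical.

```python
import os


def count_entries(root):
    """Count non-hidden files and non-hidden subfolders under root.

    The root folder itself is not counted (assumption; adjust by one
    if the indexer turns out to include it).
    """
    files = 0
    dirs = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune dot-folders in place so they are neither counted nor entered.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        dirs += len(dirnames)
        files += sum(1 for name in filenames if not name.startswith("."))
    return files, dirs
```

If files + dirs from count_entries() comes out near the balooctl status figure while files alone matches the monitor output, the difference is explained by folders.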
Comment 3 Scott 2020-05-24 23:54:31 UTC
(In reply to tagwerk19 from comment #2)
> If I was to make a guess...
> 
> The result from "balooctl status" includes the number of directories looked
> at whereas the stream from "balooctl monitor" gives you the files opened and
> indexed.
> 
> Try creating a new empty folder, purge and reindex and see if the "balooctl
> status" goes up one.

That's the difference, thank you. Case closed.