Bug 515572 - Baloo is not able to create a reliable database
Status: REPORTED
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: Baloo File Daemon (other bugs)
Version First Reported In: 5.115.0
Platform: Ubuntu Linux
Importance: NOR normal
Assignee: baloo-bugs-null
Reported: 2026-02-05 20:17 UTC by Daniel
Modified: 2026-02-06 11:35 UTC

Description Daniel 2026-02-05 20:17:23 UTC
SUMMARY
Files can be added to the index, but querying a file returns data that belongs to a different file. The whole database appears to be scrambled.

STEPS TO REPRODUCE
1. Setup (Baloo configuration):
    exclude folders[$e]=$HOME/
    folders[$e]=
    index contents=false
    only basic indexing=false

2. Add single files, for example:
    balooctl index "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
    or whole folders:
    find "$DIR_NAME" -type f -regex '.*\.\(mp4\|mkv\|avi\|mp3\|jpg\|jpeg\|png\|MP4\|MKV\|AVI\|MP3\|JPG\|JPEG\|PNG\)' -exec balooctl index {} +

3. Query the result, for example:
    balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"

OBSERVED RESULT
balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"

The document IDs of the Baloo DB and the filesystem are different:
Url: /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv
ID:       878226761127958 (DB) <-> 537330911877142 (FS)
Inode:    204478 (DB) <-> 125107 (FS)
DeviceID: 438376470 (DB) == 438376470 (FS)
1e8b31a211816 438376470 125107 /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv [/home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3]
        Mtime: 1516553273 2018-01-21T17:47:53
        Ctime: 1516553273 2018-01-21T17:47:53
        Cached properties:
                Bitrate: 160000
                Channels: 2
                Duration: 295
                Genre: Classical
                Sample Rate: 44100
                Track Number: 4
                Release Year: 1998
                Comment: D107
                Artist: Czech Philharmonic
                Album: Rusalka
                Album Artist: Vaclav Neumann
                Composer: Dvořák
                Title: 04 He comes here frequently
                Disc Number: 1
                ReplayGain Album Peak: 0.999969
                ReplayGain Album Gain: 5.38
                ReplayGain Track Peak: 0.652557
                ReplayGain Track Gain: 4.71

Internal Info
File Name Terms: F04 F1 Fcomes Fczech Ffrequently Fhe Fhere Fmp3 Fphilharmonic
XAttr Terms:
Plain Text Terms: 04 classical comes czech dvorak frequently he here neumann philharmonic rusalka vaclav 
Property Terms: Maudio Mmpeg T2 X1-160000 X10-rusalka X11-neumann X11-vaclav X12-dvorak X15-04 X15-comes X15-frequently X15-he X15-here X2-2 X3-295 X4-classical X5-44100 X6-4 X62-1 X7-1998 X74-0.999969 X75-5.38 X76-0.652557 X77-4.71 X8-d107 X9-czech X9-philharmonic 
replayGainAlbumPeak: 0.999969
replayGainAlbumGain: 5.38
channels: 2
duration: 295
bitRate: 160000
trackNumber: 4
replayGainTrackPeak: 0.652557
releaseYear: 1998
replayGainTrackGain: 4.71
genre: classical
sampleRate: 44100
album: rusalka
albumArtist: neumann vaclav
comment: d107
artist: czech philharmonic
title: 04 comes frequently he here
composer: dvorak
discNumber: 1

Next, querying the file that the previous call reported in brackets:
balooshow -x "/home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3"

The document IDs of the Baloo DB and the filesystem are different:
Url: /home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3
ID:       537330911877142 (DB) <-> 378198078593046 (FS)
Inode:    125107 (DB) <-> 88056 (FS)
DeviceID: 438376470 (DB) == 438376470 (FS)
157f81a211816 438376470 88056 /home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3 [/home/sd/Media/Musik/M/Meg Myers/Take Me to the Disco/10 Little Black Death (Meg Myers).mp3]
        Mtime: 1538768000 2018-10-05T21:33:20
        Ctime: 1538768000 2018-10-05T21:33:20
        Cached properties:
                Bitrate: 320000
                Channels: 2
                Duration: 242
                Genre: Rock
                Sample Rate: 44100
                Track Number: 10
                Release Year: 2018
                Artist: Meg Myers
                Album: Take Me to the Disco
                Title: Little Black Death
                Copyright: 2018 300 Entertainment
                Publisher: 300 Entertainment
                Label: 300 Entertainment
                ReplayGain Album Peak: 1.097838
                ReplayGain Album Gain: -11.63
                ReplayGain Track Peak: 1.079415
                ReplayGain Track Gain: -11.13

Internal Info
File Name Terms: F10 Fblack Fdeath Flittle Fmeg Fmp3 Fmyers
XAttr Terms:
Plain Text Terms: 300 black death disco entertainment little me meg myers rock take the to 
Property Terms: Maudio Mmpeg T2 X1-320000 X10-disco X10-me X10-take X10-the X10-to X15-black X15-death X15-little X2-2 X22-2018 X22-300 X22-entertainment X23-300 X23-entertainment X3-242 X4-rock X5-44100 X6-10 X69-300 X69-entertainment X7-2018 X74-1.097838 X75-11.63 X76-1.079415 X77-11.13 X9-meg X9-myers 
publisher: 300 entertainment
trackNumber: 10
replayGainAlbumPeak: 1.097838
releaseYear: 2018
replayGainAlbumGain: 11.63
genre: rock
sampleRate: 44100
album: disco me take the to
replayGainTrackPeak: 1.079415
replayGainTrackGain: 11.13
artist: meg myers
title: black death little
channels: 2
duration: 242
bitRate: 320000
copyright: 2018 300 entertainment
label: 300 entertainment
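
The ID, Inode, and DeviceID values in the two outputs above are consistent with each other if the document ID is assumed to hold the inode in its upper 32 bits and the device ID in its lower 32 bits. That layout is inferred from the numbers in this report, not taken from Baloo's documentation. Under that assumption, a minimal sketch (the function name `split_doc_id` is made up for illustration) decodes an ID as follows:

```shell
#!/bin/sh
# Assumption (inferred from the report): document ID = (inode << 32) | device ID.
split_doc_id() {
    # Prints "<inode> <device id>" for a decimal document ID.
    printf '%d %d\n' "$(( $1 >> 32 ))" "$(( $1 & 4294967295 ))"
}

split_doc_id 878226761127958   # ID stored in the DB for the .mkv path
split_doc_id 537330911877142   # ID computed from the file system for the .mkv path
```

The two calls print `204478 438376470` and `125107 438376470`, which match the Inode and DeviceID lines of the first output. The hexadecimal prefix `1e8b31a211816` on the following line is the same file-system ID written in base 16.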

balooctl status

Baloo File Indexer is running
Indexer state: Idle
Total files indexed: 211,871
Files waiting for content indexing: 0
Files failed to index: 0
Current size of index is 338.82 MiB


EXPECTED RESULT
The command balooshow -x "file" should return the information for that specific file, not for any other file.

SOFTWARE/OS VERSIONS
Operating System: Ubuntu Studio 24.04
KDE Plasma Version: 5.27.12
KDE Frameworks Version: 5.115.0
Qt Version: 5.15.13
Kernel Version: 6.14.0-37-generic (64-bit)
Graphics Platform: X11
Processors: 16 × 13th Gen Intel® Core™ i7-13700K
Memory: 62.5 GiB of RAM
Graphics Processor: NVIDIA GeForce RTX 4060 Ti/PCIe/SSE2
Manufacturer: ASUS

ADDITIONAL INFORMATION

I would expect the data for every file to be correct. If there is a problem reading a file, the indexer should simply not create an index entry. It looks as if, when a file has bad or missing data, the indexer keeps the entry open and fills in the data of the next file. The cause may be something else, but either way the result is unusable.

I need the index information to get values for duration, width, and height so that I can compare media files in Dolphin. With the data in its current state, that is impossible. This is a drawback for Linux, because I have no trouble getting this data in Windows Explorer. I am currently trying to switch from Windows to Linux, but with problems like this I can't really do it.
Comment 1 Daniel 2026-02-06 01:03:38 UTC
I have now found one file (out of over 200,000) that had a '\009' character in its name. The character is invisible, and when the file is downloaded or copied with a file manager, nothing indicates that the name contains a problematic character. I guess this is what caused the database to break. A single file name should not break the whole database, but it did.
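
One way to look for other file names with invisible control characters is sketched below. It assumes a `find` that supports POSIX character classes in `-name` patterns (GNU find does) and a `cat` that accepts `-vt` (GNU and BSD do); the function name `find_control_char_names` is hypothetical.

```shell
#!/bin/sh
# List paths whose file name contains a control character (tab, newline, ...).
# cat -vt makes invisible bytes visible: a tab is shown as ^I.
# Side effect: non-ASCII characters such as umlauts appear in M- notation.
find_control_char_names() {
    LC_ALL=C find "$1" -name '*[[:cntrl:]]*' -print | cat -vt
}

# Example call (the directory is only an illustration):
#   find_control_char_names "$HOME/Media"
```

An empty result means that no file or directory name below the given directory contains a control character.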

After renaming the file, purging the database, and reindexing, the problem is gone, at least for now.
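
The rename step could be scripted roughly as follows. This is a sketch, not a tested recovery procedure: `strip_control_chars` and `rename_clean` are hypothetical helper names, and the closing comment assumes that `balooctl purge` is the command used to drop and rebuild the index.

```shell
#!/bin/sh
# Remove ASCII control characters (bytes 0-31 and 127) from a string.
strip_control_chars() {
    printf '%s' "$1" | LC_ALL=C tr -d '[:cntrl:]'
}

# Rename a file so that its name no longer contains control characters.
# mv -n refuses to overwrite an existing file that already has the cleaned name.
rename_clean() {
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    clean=$(strip_control_chars "$base")
    if [ -n "$clean" ] && [ "$base" != "$clean" ]; then
        mv -n -- "$1" "$dir/$clean"
    fi
}

# Afterwards the index can be rebuilt from scratch, for example:
#   balooctl purge
```

Stripping the character rather than replacing it keeps the visible name unchanged, which matches what a user sees in a file manager.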

balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
1e8b31a211816 438376470 125107 /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv
        Mtime: 1748907580 2025-06-03T01:39:40
        Ctime: 1748907580 2025-06-03T01:39:40
        Cached properties:
                Bitrate: 2194924
                Duration: 6858
                Width: 1280
                Height: 538
                Aspect Ratio: 2.376237623762376
                Frame Rate: 23.976023976023978

Internal Info
File Name Terms: F2020 F2067 Fdie Fkampf Fmkv Fum Fzukunft
XAttr Terms:
Plain Text Terms: 
Property Terms: Mmatroska Mvideo Mx T3 X1-2194924 X26-1280 X27-538 X28-2.376237623762376 X29-23.976023976023978 X3-6858 
height: 538
width: 1280
frameRate: 23.976023976023978
aspectRatio: 2.376237623762376
bitRate: 2194924
duration: 6858