SUMMARY
Files can be added to the index, but querying their data returns wrong information. The whole database seems to be scrambled.

STEPS TO REPRODUCE
1. Set up the configuration:
   exclude folders[$e]=$HOME/
   folders[$e]=
   index contents=false
   only basic indexing=false
2. Add single files, for example:
   balooctl index "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
   or whole folders:
   find $DIR_NAME -type f -regex '.*\.\(mp4\|mkv\|avi\|mp3\|jpg\|jpeg\|png\|MP4\|MKV\|AVI\|MP3\|JPG\|JPEG\|PNG\)' -exec balooctl index {} +
3. Look up the result, for example:
   balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"

OBSERVED RESULT
The output comes from a German locale. The first line of each balooshow result translates to "The document IDs in the Baloo database and in the file system differ".

balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
Die Dokument-Kennung in der Baloo-Datenbank und im Dateisystem sind verschieden:
  Url: /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv
  ID: 878226761127958 (DB) <-> 537330911877142 (FS)
  Inode: 204478 (DB) <-> 125107 (FS)
  DeviceID: 438376470 (DB) == 438376470 (FS)
1e8b31a211816 438376470 125107 /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv [/home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3]
  Mtime: 1516553273 2018-01-21T17:47:53
  Ctime: 1516553273 2018-01-21T17:47:53
  Cached properties:
    Bitrate: 160000
    Kanäle: 2
    Dauer: 295
    Genre: Classical
    Abtastrate: 44100
    Nummer des Stücks: 4
    Jahr der Veröffentlichung: 1998
    Kommentar: D107
    Interpret: Czech Philharmonic
    Album: Rusalka
    Interpret des Albums: Vaclav Neumann
    Komponist: Dvořák
    Titel: 04 He comes here frequently
    CD-Nummer: 1
    ReplayGain Album Peak: 0.999969
    ReplayGain Album Gain: 5.38
    ReplayGain Track Peak: 0.652557
    ReplayGain Track Gain: 4.71
  Interne Information
    Dateinamen-Begriffe: F04 F1 Fcomes Fczech Ffrequently Fhe Fhere Fmp3 Fphilharmonic
    XAttr Begriffe:
    Plain Text Terms: 04 classical comes czech dvorak frequently he here neumann philharmonic rusalka vaclav
    Property Terms: Maudio Mmpeg T2 X1-160000 X10-rusalka X11-neumann X11-vaclav X12-dvorak X15-04 X15-comes X15-frequently X15-he X15-here X2-2 X3-295 X4-classical X5-44100 X6-4 X62-1 X7-1998 X74-0.999969 X75-5.38 X76-0.652557 X77-4.71 X8-d107 X9-czech X9-philharmonic
    replayGainAlbumPeak: 0.999969
    replayGainAlbumGain: 5.38
    channels: 2
    duration: 295
    bitRate: 160000
    trackNumber: 4
    replayGainTrackPeak: 0.652557
    releaseYear: 1998
    replayGainTrackGain: 4.71
    genre: classical
    sampleRate: 44100
    album: rusalka
    albumArtist: neumann vaclav
    comment: d107
    artist: czech philharmonic
    title: 04 comes frequently he here
    composer: dvorak
    discNumber: 1

The query for the .mkv file returns the data of an unrelated .mp3 file. Querying the path that this call returned gives yet another file:

balooshow -x "/home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3"
Die Dokument-Kennung in der Baloo-Datenbank und im Dateisystem sind verschieden:
  Url: /home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3
  ID: 537330911877142 (DB) <-> 378198078593046 (FS)
  Inode: 125107 (DB) <-> 88056 (FS)
  DeviceID: 438376470 (DB) == 438376470 (FS)
157f81a211816 438376470 88056 /home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3 [/home/sd/Media/Musik/M/Meg Myers/Take Me to the Disco/10 Little Black Death (Meg Myers).mp3]
  Mtime: 1538768000 2018-10-05T21:33:20
  Ctime: 1538768000 2018-10-05T21:33:20
  Cached properties:
    Bitrate: 320000
    Kanäle: 2
    Dauer: 242
    Genre: Rock
    Abtastrate: 44100
    Nummer des Stücks: 10
    Jahr der Veröffentlichung: 2018
    Interpret: Meg Myers
    Album: Take Me to the Disco
    Titel: Little Black Death
    Copyright: 2018 300 Entertainment
    Herausgeber: 300 Entertainment
    Beschriftung: 300 Entertainment
    ReplayGain Album Peak: 1.097838
    ReplayGain Album Gain: -11.63
    ReplayGain Track Peak: 1.079415
    ReplayGain Track Gain: -11.13
  Interne Information
    Dateinamen-Begriffe: F10 Fblack Fdeath Flittle Fmeg Fmp3 Fmyers
    XAttr Begriffe:
    Plain Text Terms: 300 black death disco entertainment little me meg myers rock take the to
    Property Terms: Maudio Mmpeg T2 X1-320000 X10-disco X10-me X10-take X10-the X10-to X15-black X15-death X15-little X2-2 X22-2018 X22-300 X22-entertainment X23-300 X23-entertainment X3-242 X4-rock X5-44100 X6-10 X69-300 X69-entertainment X7-2018 X74-1.097838 X75-11.63 X76-1.079415 X77-11.13 X9-meg X9-myers
    publisher: 300 entertainment
    trackNumber: 10
    replayGainAlbumPeak: 1.097838
    releaseYear: 2018
    replayGainAlbumGain: 11.63
    genre: rock
    sampleRate: 44100
    album: disco me take the to
    replayGainTrackPeak: 1.079415
    replayGainTrackGain: 11.13
    artist: meg myers
    title: black death little
    channels: 2
    duration: 242
    bitRate: 320000
    copyright: 2018 300 entertainment
    label: 300 entertainment

According to balooctl status, the indexer is running and idle, with 211,871 files indexed, none pending, none failed, and an index size of 338.82 MiB:

balooctl status
Die Baloo-Dateiindizierung läuft
Indizierungsstatus: Inaktiv
Gesamtzahl der indizierten Dateien: 211.871
Dateien, die noch indiziert werden: 0
Dateien, deren Indizierung fehlgeschlagen ist: 0
Der aktuelle Index hat eine Größe von 338,82 MiB

EXPECTED RESULT
The command balooshow -x "file" should return the information for that specific file, not for any other.

SOFTWARE/OS VERSIONS
Operating System: Ubuntu Studio 24.04
KDE Plasma Version: 5.27.12
KDE Frameworks Version: 5.115.0
Qt Version: 5.15.13
Kernel Version: 6.14.0-37-generic (64-bit)
Graphics Platform: X11
Processors: 16 × 13th Gen Intel® Core™ i7-13700K
Memory: 62.5 GiB of RAM
Graphics Processor: NVIDIA GeForce RTX 4060 Ti/PCIe/SSE2
Manufacturer: ASUS

ADDITIONAL INFORMATION
I would expect the data for every file to be correct. If there is a problem reading a file, the indexer should simply not create an index entry for it. It looks as if, when a file has bad or missing data, the indexer keeps the entry open and fills in the data of the next file. It may be something else, but either way the result is unusable.
I need the index information to get the values for duration, width and height so that I can compare media files in Dolphin. In the current state of the data, that is impossible. This is a minus point for Linux, because I have no problem getting this data in Windows Explorer. I am currently trying to switch from Windows to Linux, but faced with problems like this, I can't really do it.
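The ID values in the mismatch messages above can be checked by hand. Judging only from the numbers in this report (an inference, not something taken from Baloo's documentation), the document ID appears to hold the inode in the upper 32 bits and the device ID in the lower 32 bits. A minimal sketch that splits an ID accordingly; the helper name decode_baloo_id is hypothetical:

```shell
# Hypothetical helper, not part of Baloo.
# Assumption inferred from this report: ID = (inode << 32) | device ID.
decode_baloo_id() {
    id="$1"
    # Upper 32 bits: inode. Lower 32 bits: device ID.
    printf 'inode=%d device=%d\n' "$((id >> 32))" "$((id & 0xFFFFFFFF))"
}

# The (FS) ID reported for the .mkv file
decode_baloo_id 537330911877142
```

For 537330911877142 this gives inode 125107 and device 438376470, the same (FS) values that balooshow printed for the .mkv file. The inode on disk can be compared with the output of stat -c %i "file".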
I have now found one file (out of over 200,000) that had a '\009' in its name. The special character is invisible, and when the file is downloaded or copied with a file manager, nothing marks the name as containing a problematic character. I guess this is what made the database struggle. It really should not break the whole database, but it did. After renaming the file, purging the database and reindexing, the problem is gone, at least for the moment.

balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
1e8b31a211816 438376470 125107 /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv
  Mtime: 1748907580 2025-06-03T01:39:40
  Ctime: 1748907580 2025-06-03T01:39:40
  Cached properties:
    Bitrate: 2194924
    Dauer: 6858
    Breite: 1280
    Höhe: 538
    Seitenverhältnis: 2.376237623762376
    Bildwiederholrate: 23.976023976023978
  Interne Information
    Dateinamen-Begriffe: F2020 F2067 Fdie Fkampf Fmkv Fum Fzukunft
    XAttr Begriffe:
    Plain Text Terms:
    Property Terms: Mmatroska Mvideo Mx T3 X1-2194924 X26-1280 X27-538 X28-2.376237623762376 X29-23.976023976023978 X3-6858
    height: 538
    width: 1280
    frameRate: 23.976023976023978
    aspectRatio: 2.376237623762376
    bitRate: 2194924
    duration: 6858
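Because such a character is invisible in a file manager, it can help to search for affected names before reindexing. The following is a rough sketch, assuming bash and GNU find; the function name find_control_char_names is made up for illustration, and whether control characters are the actual trigger in Baloo remains a guess from this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Baloo: list files whose names
# contain an ASCII control character (for example a tab).
find_control_char_names() {
    local dir="$1"
    # LC_ALL=C makes find match raw bytes, so [[:cntrl:]] covers only
    # ASCII control characters and leaves UTF-8 names alone.
    LC_ALL=C find "$dir" -name '*[[:cntrl:]]*' -print0 |
        while IFS= read -r -d '' path; do
            # %q prints the path with control characters escaped,
            # so the otherwise invisible character becomes visible.
            printf '%q\n' "$path"
        done
}

# Example call (the directory is only an illustration):
# find_control_char_names "$HOME/Media"
```

Each reported path is printed in shell-quoted form, so a tab shows up as \t. After renaming the files that are found, the purge and reindex described above can be repeated.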