| Summary: | Baloo is not able to create a reliable database | | |
|---|---|---|---|
| Product: | [Frameworks and Libraries] frameworks-baloo | Reporter: | Daniel <daniel.schoeni> |
| Component: | Baloo File Daemon | Assignee: | baloo-bugs-null |
| Status: | REPORTED --- | | |
| Severity: | normal | CC: | nicolas.fella |
| Priority: | NOR | | |
| Version First Reported In: | 5.115.0 | | |
| Target Milestone: | --- | | |
| Platform: | Ubuntu | | |
| OS: | Linux | | |
| Latest Commit: | | Version Fixed/Implemented In: | |
| Sentry Crash Report: | | | |
I have now found one file (out of more than 200,000) that had a '\009' character in its name. The character is invisible, and when the file is downloaded or copied with a file manager, nothing marks the name as containing a problematic character. I suspect this is what made the database struggle. A single file name should not break the whole database, but it did.
After renaming the file, purging the database and reindexing, the problem is gone, at least for the moment.
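A minimal sketch of how such file names could be located before indexing, assuming the '\009' above denotes a control character such as a tab, and assuming GNU find and GNU ls are available. The helper name find_control_names is hypothetical and not part of Baloo.

```shell
# List directory entries whose base name contains a control character
# (tab, newline, ...). Assumes GNU find and GNU ls.
# find_control_names is a hypothetical helper name, not a Baloo tool.
find_control_names() {
    # -name matches the base name only; [[:cntrl:]] is the POSIX class
    # for control characters. ls -b prints them as visible C-style escapes,
    # and -d lists matching directories themselves, not their contents.
    find "$1" -name '*[[:cntrl:]]*' -exec ls -db {} +
}

# Example call, using the media directory from this report:
# find_control_names /home/sd/Media
```

Each reported name can then be renamed by hand before the index is rebuilt.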
balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
1e8b31a211816 438376470 125107 /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv
Mtime: 1748907580 2025-06-03T01:39:40
Ctime: 1748907580 2025-06-03T01:39:40
Cached properties:
Bitrate: 2194924
Dauer: 6858
Breite: 1280
Höhe: 538
Seitenverhältnis: 2.376237623762376
Bildwiederholrate: 23.976023976023978
Interne Information
Dateinamen-Begriffe: F2020 F2067 Fdie Fkampf Fmkv Fum Fzukunft
XAttr Begriffe:
Plain Text Terms:
Property Terms: Mmatroska Mvideo Mx T3 X1-2194924 X26-1280 X27-538 X28-2.376237623762376 X29-23.976023976023978 X3-6858
height: 538
width: 1280
frameRate: 23.976023976023978
aspectRatio: 2.376237623762376
bitRate: 2194924
duration: 6858
Today I realized that the database is corrupted again. Yesterday it was working. I shut the system down, went to sleep, and after starting it today, all entries show wrong data: pictures have a "duration", and a movie that is in fact 1280x720 is listed with a size of 75x75 and no duration. Another movie shows a duration of 0:03:18, which must come from an MP3, not from this movie. So the corruption does not seem to be caused by the file name with the control character, but by something else that I can neither find nor identify. The movie I used for the comparison above is now corrupted as well:
balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
Die Dokument-Kennung in der Baloo-Datenbank und im Dateisystem sind verschieden:
Url: /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv
ID: 94927656982 (DB) <-> 110802004678678 (FS)
Inode: 22 (DB) <-> 25798 (FS)
DeviceID: 438376470 (DB) == 438376470 (FS)
64c61a211816 438376470 25798 /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv [/home/sd/Media/Musik/A/A Taste of Honey/A Taste of Honey + Twice as Sweet [Disc 2]/2-06-A Taste of Honey-Don't You Lead Me On.mp3]
Mtime: 1453754438 2016-01-25T21:40:38
Ctime: 1453754438 2016-01-25T21:40:38
Cached properties:
Bitrate: 192000
Kanäle: 2
Dauer: 198
Genre: Funk
Abtastrate: 44100
Nummer des Stücks: 6
Jahr der Veröffentlichung: 2000
Kommentar: Track 2
Interpret: A Taste of Honey
Album: A Taste of Honey + Twice as Sweet [Disc 2]
Interpret des Albums: A Taste of Honey
Titel: Don't You Lead Me On
CD-Nummer: 2
ReplayGain Album Peak: 0.456482
ReplayGain Album Gain: 1.25
ReplayGain Track Peak: 0.393982
ReplayGain Track Gain: 1.37
Interne Information
Dateinamen-Begriffe: F06 F2 Fa Fdon't Fhoney Flead Fme Fmp3 Fof Fon Ftaste Fyou
XAttr Begriffe:
Plain Text Terms: + 2 a as disc don't funk honey lead me of on sweet taste twice you
Property Terms: Maudio Mmpeg T2 X1-192000 X10-+ X10-2 X10-a X10-as X10-disc X10-honey X10-of X10-sweet X10-taste X10-twice X11-a X11-honey X11-of X11-taste X15-don't X15-lead X15-me X15-on X15-you X2-2 X3-198 X4-funk X5-44100 X6-6 X62-2 X7-2000 X74-0.456482 X75-1.25 X76-0.393982 X77-1.37 X8-2 X8-track X9-a X9-honey X9-of X9-taste
title: don't lead me on you
replayGainAlbumGain: 1.25
comment: 2 track
artist: a honey of taste
album: + 2 a as disc honey of sweet taste twice
albumArtist: a honey of taste
genre: funk
sampleRate: 44100
trackNumber: 6
releaseYear: 2000
bitRate: 192000
replayGainTrackPeak: 0.393982
channels: 2
replayGainTrackGain: 1.37
duration: 198
discNumber: 2
replayGainAlbumPeak: 0.456482
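The ID, Inode and DeviceID lines in the output above appear to be related: the numbers are consistent with ID = inode * 2^32 + DeviceID. This is inferred only from the values printed in this report, not from the Baloo sources, so treat it as an assumption. A small sketch that splits an ID accordingly (decode_baloo_id is a hypothetical helper name):

```shell
# Split a Baloo document ID into inode and device ID.
# Assumption (inferred from the numbers printed in this report only):
#   ID = inode * 2^32 + deviceID
# decode_baloo_id is a hypothetical helper name, not a Baloo tool.
decode_baloo_id() {
    id=$1
    printf 'inode=%d device=%d\n' "$(( id >> 32 ))" "$(( id & 0xFFFFFFFF ))"
}

decode_baloo_id 110802004678678   # the (FS) ID from the output above
decode_baloo_id 94927656982       # the (DB) ID from the output above
```

Under that assumption the first call yields inode 25798 and the second inode 22, both with device 438376470, which matches the Inode and DeviceID lines. The database entry for this path therefore points at a different inode on the same device.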
SUMMARY
Files can be added to the index, but viewing their data returns wrong information. The whole database seems to be scrambled.

STEPS TO REPRODUCE
1. Setup
exclude folders[$e]=$HOME/
folders[$e]=
index contents=false
only basic indexing=false
2. Add files like:
balooctl index "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
or folders:
find "$DIR_NAME" -type f -regex '.*\.\(mp4\|mkv\|avi\|mp3\|jpg\|jpeg\|png\|MP4\|MKV\|AVI\|MP3\|JPG\|JPEG\|PNG\)' -exec balooctl index {} +
3. Look for results, like:
balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"

OBSERVED RESULT
balooshow -x "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
Die Dokument-Kennung in der Baloo-Datenbank und im Dateisystem sind verschieden:
Url: /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv
ID: 878226761127958 (DB) <-> 537330911877142 (FS)
Inode: 204478 (DB) <-> 125107 (FS)
DeviceID: 438376470 (DB) == 438376470 (FS)
1e8b31a211816 438376470 125107 /home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv [/home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3]
Mtime: 1516553273 2018-01-21T17:47:53
Ctime: 1516553273 2018-01-21T17:47:53
Cached properties:
Bitrate: 160000
Kanäle: 2
Dauer: 295
Genre: Classical
Abtastrate: 44100
Nummer des Stücks: 4
Jahr der Veröffentlichung: 1998
Kommentar: D107
Interpret: Czech Philharmonic
Album: Rusalka
Interpret des Albums: Vaclav Neumann
Komponist: Dvořák
Titel: 04 He comes here frequently
CD-Nummer: 1
ReplayGain Album Peak: 0.999969
ReplayGain Album Gain: 5.38
ReplayGain Track Peak: 0.652557
ReplayGain Track Gain: 4.71
Interne Information
Dateinamen-Begriffe: F04 F1 Fcomes Fczech Ffrequently Fhe Fhere Fmp3 Fphilharmonic
XAttr Begriffe:
Plain Text Terms: 04 classical comes czech dvorak frequently he here neumann philharmonic rusalka vaclav
Property Terms: Maudio Mmpeg T2 X1-160000 X10-rusalka X11-neumann X11-vaclav X12-dvorak X15-04 X15-comes X15-frequently X15-he X15-here X2-2 X3-295 X4-classical X5-44100 X6-4 X62-1 X7-1998 X74-0.999969 X75-5.38 X76-0.652557 X77-4.71 X8-d107 X9-czech X9-philharmonic
replayGainAlbumPeak: 0.999969
replayGainAlbumGain: 5.38
channels: 2
duration: 295
bitRate: 160000
trackNumber: 4
replayGainTrackPeak: 0.652557
releaseYear: 1998
replayGainTrackGain: 4.71
genre: classical
sampleRate: 44100
album: rusalka
albumArtist: neumann vaclav
comment: d107
artist: czech philharmonic
title: 04 comes frequently he here
composer: dvorak
discNumber: 1

Let's try with the answer from the last call:
balooshow -x "/home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3"
Die Dokument-Kennung in der Baloo-Datenbank und im Dateisystem sind verschieden:
Url: /home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3
ID: 537330911877142 (DB) <-> 378198078593046 (FS)
Inode: 125107 (DB) <-> 88056 (FS)
DeviceID: 438376470 (DB) == 438376470 (FS)
157f81a211816 438376470 88056 /home/sd/Media/Musik/_Classic/Vaclav Neumann/Rusalka/1-04-Czech Philharmonic-04 He comes here frequently.mp3 [/home/sd/Media/Musik/M/Meg Myers/Take Me to the Disco/10 Little Black Death (Meg Myers).mp3]
Mtime: 1538768000 2018-10-05T21:33:20
Ctime: 1538768000 2018-10-05T21:33:20
Cached properties:
Bitrate: 320000
Kanäle: 2
Dauer: 242
Genre: Rock
Abtastrate: 44100
Nummer des Stücks: 10
Jahr der Veröffentlichung: 2018
Interpret: Meg Myers
Album: Take Me to the Disco
Titel: Little Black Death
Copyright: 2018 300 Entertainment
Herausgeber: 300 Entertainment
Beschriftung: 300 Entertainment
ReplayGain Album Peak: 1.097838
ReplayGain Album Gain: -11.63
ReplayGain Track Peak: 1.079415
ReplayGain Track Gain: -11.13
Interne Information
Dateinamen-Begriffe: F10 Fblack Fdeath Flittle Fmeg Fmp3 Fmyers
XAttr Begriffe:
Plain Text Terms: 300 black death disco entertainment little me meg myers rock take the to
Property Terms: Maudio Mmpeg T2 X1-320000 X10-disco X10-me X10-take X10-the X10-to X15-black X15-death X15-little X2-2 X22-2018 X22-300 X22-entertainment X23-300 X23-entertainment X3-242 X4-rock X5-44100 X6-10 X69-300 X69-entertainment X7-2018 X74-1.097838 X75-11.63 X76-1.079415 X77-11.13 X9-meg X9-myers
publisher: 300 entertainment
trackNumber: 10
replayGainAlbumPeak: 1.097838
releaseYear: 2018
replayGainAlbumGain: 11.63
genre: rock
sampleRate: 44100
album: disco me take the to
replayGainTrackPeak: 1.079415
replayGainTrackGain: 11.13
artist: meg myers
title: black death little
channels: 2
duration: 242
bitRate: 320000
copyright: 2018 300 entertainment
label: 300 entertainment

balooctl status
Die Baloo-Dateiindizierung läuft
Indizierungsstatus: Inaktiv
Gesamtzahl der indizierten Dateien: 211.871
Dateien, die noch indiziert werden: 0
Dateien, deren Indizierung fehlgeschlagen ist: 0
Der aktuelle Index hat eine Größe von 338,82 MiB

EXPECTED RESULT
The command balooshow -x "file" should give the information for this specific file, not for any other.

SOFTWARE/OS VERSIONS
Operating System: Ubuntu Studio 24.04
KDE Plasma Version: 5.27.12
KDE Frameworks Version: 5.115.0
Qt Version: 5.15.13
Kernel Version: 6.14.0-37-generic (64-bit)
Graphics Platform: X11
Processors: 16 × 13th Gen Intel® Core™ i7-13700K
Memory: 62.5 GiB of RAM
Graphics Processor: NVIDIA GeForce RTX 4060 Ti/PCIe/SSE2
Manufacturer: ASUS

ADDITIONAL INFORMATION
I would expect the data for any file to be correct. If there is a problem reading a file, the indexer should simply not create an index entry. It looks as if, when a file has bad or missing data, the indexer keeps the entry open and fills in the data of the next file. It may also be something else, but either way the result is unusable.
I need the index information to get values for duration, width and height so that I can compare media files in Dolphin. In its current state, the data makes that impossible. This is a point against Linux, because I have no problem getting this data in Windows Explorer. I am currently trying to switch from Windows to Linux, but faced with problems like this, I can't really do it.
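To gauge how many entries are affected, the mismatch line that balooshow -x prints in the outputs of this report could be searched for, file by file. The sketch below assumes that the line always has the form "ID: <n> (DB) <-> <m> (FS)", as seen here. The helper names has_id_mismatch and check_file are hypothetical.

```shell
# Flag files whose document ID in the Baloo database differs from the
# filesystem ID. Assumption: balooshow -x prints a line of the form
#   ID: <n> (DB) <-> <m> (FS)
# for such files, as in the outputs shown in this report.
# has_id_mismatch and check_file are hypothetical helper names.
has_id_mismatch() {
    # Reads balooshow -x output on stdin; succeeds if the mismatch line exists.
    grep -Eq '^[[:space:]]*ID: [0-9]+ \(DB\) <-> [0-9]+ \(FS\)'
}

check_file() {
    if balooshow -x "$1" 2>/dev/null | has_id_mismatch; then
        printf 'MISMATCH %s\n' "$1"
    fi
}

# Example call with the file from this report:
# check_file "/home/sd/Media/Filme/_mkv/2067 Kampf um die Zukunft (2020)/2067 Kampf um die Zukunft (2020).mkv"
```

Combined with find, this would give a rough count of mismatched entries. It would not detect entries whose IDs match but whose cached properties are wrong.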