Bug 468337

Summary: Baloo ignores corrupted DB
Product: [Frameworks and Libraries] frameworks-baloo
Reporter: Elias Probst <mail>
Component: Engine
Assignee: baloo-bugs-null
Status: RESOLVED WORKSFORME
Severity: normal
CC: tagwerk19
Priority: NOR
Version First Reported In: 5.104.0
Target Milestone: ---
Platform: NixOS
OS: Linux
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:

Description Elias Probst 2023-04-09 20:42:47 UTC
SUMMARY
For reasons unknown, my Baloo DB got corrupted, but I had no clue until I wondered why I got no results from Baloo and started to investigate using Baloo's CLI tools.
At no point did Baloo's KCM or any of the Plasma tooling let me know that Baloo might have a problem with its DB.


STEPS TO REPRODUCE
1. Corrupt your DB (procedure unknown)
2. Index a file
3. Execute a query which should return the file

OBSERVED RESULT
- On the surface, Baloo acts as if nothing happened
- Baloo only hints at a problem when "balooctl index $PathToFile" is executed manually

The following steps demonstrate the problem:
- Create a testfile:
  ```
  echo "This is my testfile. Unique term is: joo0theg8Vai2agh8wah" > ~/Documents/my-testfile.txt
  ```
- Query Baloo for the unique term in it (no results returned):
  ```
  baloosearch joo0theg8Vai2agh8wah
  Elapsed: 1.47841 msecs
  ```
- Ask Baloo explicitly to index the file (it sends mixed signals: it claims to have indexed the file, yet it also prints a huge number of errors):
  ```
  balooctl index ~/Documents/my-testfile.txt
  kf.baloo.engine: IdTreeDB::put MDB_CORRUPTED: Located page was wrong type
  kf.baloo.engine: IdFilenameDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: IdTreeDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: IdFilenameDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: IdTreeDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: IdFilenameDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: IdTreeDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: IdFilenameDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: DocumentDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: DocumentDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: DocumentTimeDB::put 25335805920738816 MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: MTimeDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: DocumentDataDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PostingDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: PositionDB::put MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  kf.baloo.engine: Transaction::commit MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
  Indexing /home/eliasp/Documents/my-testfile.txt
  File(s) indexed
  ```
- Query Baloo again (still no results):
  ```
  baloosearch joo0theg8Vai2agh8wah
  Elapsed: 0.185144 msecs
  ```
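
The mixed signals in the `balooctl index` output above can be checked for mechanically. The following is a minimal Python sketch, not part of Baloo: it assumes the combined stdout/stderr of the command has been captured as text, and that error lines keep the `kf.baloo.engine: <Class>::<method> ... MDB_<CODE>:` shape seen in the log. The names `summarize_engine_errors` and `indexing_really_succeeded` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   kf.baloo.engine: IdTreeDB::put MDB_CORRUPTED: Located page was wrong type
#   kf.baloo.engine: DocumentTimeDB::put 25335805920738816 MDB_BAD_TXN: ...
# Assumption: the line shape is taken from the log in this report only.
ENGINE_ERROR = re.compile(
    r"^\s*kf\.baloo\.engine:\s+(?P<where>\w+::\w+)\b.*?\b(?P<code>MDB_[A-Z0-9_]+):"
)


def summarize_engine_errors(output: str) -> Counter:
    """Count (location, LMDB error code) pairs in captured balooctl output."""
    counts = Counter()
    for line in output.splitlines():
        match = ENGINE_ERROR.match(line)
        if match:
            counts[(match["where"], match["code"])] += 1
    return counts


def indexing_really_succeeded(output: str) -> bool:
    """Report success only if "File(s) indexed" was printed AND no engine
    error was logged along the way."""
    return "File(s) indexed" in output and not summarize_engine_errors(output)


if __name__ == "__main__":
    captured = (
        "kf.baloo.engine: IdTreeDB::put MDB_CORRUPTED: Located page was wrong type\n"
        "kf.baloo.engine: Transaction::commit MDB_BAD_TXN: Transaction must abort, has a child, or is invalid\n"
        "Indexing /home/eliasp/Documents/my-testfile.txt\n"
        "File(s) indexed\n"
    )
    for (where, code), count in summarize_engine_errors(captured).items():
        print(f"{where}: {code} x{count}")
    print("indexed cleanly:", indexing_really_succeeded(captured))
```

A single `MDB_CORRUPTED` line followed by a run of `MDB_BAD_TXN` lines, as in the log above, would be enough for such a check to flag the run as failed despite the final "File(s) indexed" message.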

EXPECTED RESULT
- Perfect: don't corrupt the DB in the first place
- Nice: fix/remove corrupted data in the DB automatically
- OK: let the user know (preferably via the KCM _and_ a Plasma notification) that the DB got corrupted, and ask whether to remove/recreate it

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: NixOS/nixpkgs unstable 19cf008bb1
KDE Plasma Version: 5.27.3
KDE Frameworks Version: 5.104.0
Qt Version: 5.15.8
LMDB 0.9.30
Comment 1 tagwerk19 2023-04-21 10:26:26 UTC
(In reply to Elias Probst from comment #0)
> 1. Corrupt your DB (procedure unknown)
Chuckle :-)

The most recent known way of doing this involved non-printable characters (things like \000 and \001) in a file. See Bug 464226 and

    https://invent.kde.org/frameworks/baloo/-/merge_requests/87

This, at least, should be sorted in Frameworks 5.105.

> ... let the user know ... the DB got corrupted ...
There's a database consistency check script (by Igor Poboiko) here

    https://invent.kde.org/frameworks/baloo/uploads/bdc9f5f17fc96490b7bd4a22ac664843/baloo-checkdb.py

that he refers to in:

    https://invent.kde.org/frameworks/baloo/-/merge_requests/87#note_535270

That might be worth a try. Be aware that it needs to load the whole database into memory.

If you've got a corruption, probably your only option is to purge and reindex.
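
As a rough sketch of that purge-and-reindex step in Python: it assumes `balooctl` is on the PATH and that `balooctl purge` removes the index so that Baloo starts over, which may behave differently between versions. The helper name `purge_and_reindex` is hypothetical, and nothing is executed unless `dry_run=False` is passed.

```python
import subprocess

# Assumption: both subcommands exist in the installed balooctl.
RECOVERY_COMMANDS = [
    ["balooctl", "purge"],   # drop the (corrupted) index database
    ["balooctl", "status"],  # show what the indexer is doing afterwards
]


def purge_and_reindex(dry_run: bool = True) -> list:
    """Return the recovery commands; execute them only when dry_run is False."""
    if not dry_run:
        for command in RECOVERY_COMMANDS:
            # check=False: keep going and let the user read balooctl's own output.
            subprocess.run(command, check=False)
    return [list(command) for command in RECOVERY_COMMANDS]


if __name__ == "__main__":
    # Dry run: only print what would be executed.
    for command in purge_and_reindex(dry_run=True):
        print(" ".join(command))
```

The dry-run default is deliberate, since purging throws away the whole index and a full reindex can take a while.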

> SOFTWARE/OS VERSIONS
> Linux/KDE Plasma: NixOS/nixpkgs unstable 19cf008bb1
That may be an interesting journey... I got there with:

    environment.systemPackages = with pkgs; [
       (python3.withPackages(ps: with ps; [ lmdb ]))
    ];

and then

    $ python3 baloo-checkdb.py

But whether this works in a general case...
Comment 2 tagwerk19 2023-04-25 22:02:23 UTC
Did you get anywhere?
Comment 3 Bug Janitor Service 2023-05-10 03:46:04 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 4 Bug Janitor Service 2023-05-25 03:45:49 UTC
This bug has been in NEEDSINFO status with no change for at least
30 days. The bug is now closed as RESOLVED > WORKSFORME
due to lack of needed information.
