Bug 392878

Summary: baloo crashes when reading corrupted data from the Terms db
Product: [Frameworks and Libraries] frameworks-baloo Reporter: Stefan Brüns <stefan.bruens>
Component: Engine Assignee: baloo-bugs-null
Status: RESOLVED FIXED    
Severity: normal CC: nate
Priority: NOR    
Version: 5.44.0   
Target Milestone: ---   
Platform: Other   
OS: Linux   
See Also: https://bugs.kde.org/show_bug.cgi?id=392877
Latest Commit: Version Fixed In:

Description Stefan Brüns 2018-04-08 14:13:13 UTC
In case the database contains bad data, the docterms codec may crash when decoding the data.

The data is scanned for a terminating '\x00' or '\x01', denoting a complete term or a term suffix, respectively. A suffix is concatenated with the previous term, which crashes if there is no previous term (bad access with QVector<>::last()).
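
To illustrate the failure mode, here is a minimal sketch of such a decoder with a guard for the missing previous term. It is not the baloo code: the name decodeTerms, the use of std::string and std::vector in place of the Qt containers, and the choice to stop decoding at the first orphaned suffix are all assumptions made for the example.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the docterms decoder; not the baloo implementation.
// A chunk ending in '\x00' is a complete term; a chunk ending in '\x01' is a
// suffix that is appended to the previously decoded term.
std::vector<std::string> decodeTerms(const std::string& data)
{
    std::vector<std::string> terms;
    std::size_t start = 0;

    for (std::size_t pos = 0; pos < data.size(); ++pos) {
        const char c = data[pos];
        if (c != '\x00' && c != '\x01') {
            continue; // still inside a term or suffix
        }

        const std::string chunk = data.substr(start, pos - start);
        if (c == '\x00') {
            terms.push_back(chunk);
        } else if (terms.empty()) {
            // Suffix without a previous term: the data is corrupt.
            // Calling terms.back() on an empty vector would be undefined
            // behaviour, so report the problem and stop (assumed policy).
            std::cerr << "corrupt term data: suffix without previous term\n";
            return terms;
        } else {
            terms.push_back(terms.back() + chunk);
        }
        start = pos + 1;
    }
    return terms;
}
```

With well-formed input such as std::string("abc\0def\1", 8) the sketch yields the terms "abc" and "abcdef". With input whose first terminator is '\x01', it prints a message and returns an empty list instead of touching a nonexistent previous term.
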
Comment 1 Stefan Brüns 2018-05-29 23:47:47 UTC
Git commit e1d1b7e87ff1e8ce6a7e03ecdf2902322cb8624a by Stefan Brüns.
Committed on 29/05/2018 at 23:47.
Pushed by bruns into branch 'master'.

Avoid crash when reading corrupt data from document terms db

Summary:
The terms db contains terms, where each term is stored independently
(terminated with 0), or as a suffix to the previous term (terminated with
1).
In case of corrupted data, the first terminator seen may be a 1, which
leads to a crash when trying to access the previous term with
QVector<>::last().
Show a debug message, to give a hint about the bad data, which can be
fixed by reindexing the relevant file.
Related: bug 392877

Test Plan:
Corrupt the database
Run balooshow -x <affected file(s)>

Reviewers: #baloo, michaelh, ngraham, #frameworks, dhaumann

Reviewed By: dhaumann

Subscribers: dhaumann, kde-frameworks-devel, #frameworks

Tags: #frameworks, #baloo

Differential Revision: https://phabricator.kde.org/D12047

M  +5    -0    src/codecs/doctermscodec.cpp
M  +5    -1    src/engine/documentdb.cpp

https://commits.kde.org/baloo/e1d1b7e87ff1e8ce6a7e03ecdf2902322cb8624a