If the database contains bad data, the docterms codec may crash while decoding it. The data is scanned for a terminating '\x00' or '\x01', which marks a complete term or a term suffix, respectively. A suffix is concatenated with the previous term, which crashes if there is no previous term (invalid access via QVector<>::last()).
Git commit e1d1b7e87ff1e8ce6a7e03ecdf2902322cb8624a by Stefan Brüns.
Committed on 29/05/2018 at 23:47.
Pushed by bruns into branch 'master'.

Avoid crash when reading corrupt data from document terms db

Summary:
The terms db contains terms, where each term is stored either independently
(terminated with 0) or as a suffix to the previous term (terminated with 1).

In case of corrupted data, the first terminator seen may be a 1, which
leads to a crash when trying to access the previous term with
QVector<>::last().

Show a debug message to give a hint about the bad data, which can be fixed
by reindexing the relevant file.

Related: bug 392877

Test Plan:
Corrupt the database
Run balooshow -x <affected file(s)>

Reviewers: #baloo, michaelh, ngraham, #frameworks, dhaumann

Reviewed By: dhaumann

Subscribers: dhaumann, kde-frameworks-devel, #frameworks

Tags: #frameworks, #baloo

Differential Revision: https://phabricator.kde.org/D12047

M  +5    -0    src/codecs/doctermscodec.cpp
M  +5    -1    src/engine/documentdb.cpp

https://commits.kde.org/baloo/e1d1b7e87ff1e8ce6a7e03ecdf2902322cb8624a
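
For illustration only, here is a minimal sketch of the kind of guard described above. It is not the actual Baloo code: it uses std::string and std::vector instead of QByteArray and QVector, and the names decodeTerms and DecodeResult are hypothetical. It assumes the simplified format stated in the report, where a chunk ending in '\x00' is a full term and a chunk ending in '\x01' is a suffix appended to the previously decoded term.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type (not part of Baloo): the decoded terms plus a
// flag that is set when the input turns out to be corrupt.
struct DecodeResult {
    std::vector<std::string> terms;
    bool corrupt = false;
};

// Hypothetical decoder for the simplified format described in the report.
DecodeResult decodeTerms(const std::string& data)
{
    DecodeResult result;
    std::size_t start = 0; // first byte of the chunk currently being read

    for (std::size_t i = 0; i < data.size(); ++i) {
        const char c = data[i];
        if (c != '\x00' && c != '\x01') {
            continue; // ordinary byte, keep scanning for a terminator
        }

        std::string chunk = data.substr(start, i - start);
        if (c == '\x01') {
            // A suffix needs a previous term. Without this check, calling
            // back() on an empty vector is undefined behaviour, which is the
            // analogue of the bad QVector<>::last() access in the report.
            if (result.terms.empty()) {
                result.corrupt = true;
                return result;
            }
            chunk = result.terms.back() + chunk;
        }

        result.terms.push_back(chunk);
        start = i + 1;
    }
    return result;
}
```

A caller could check the corrupt flag and log a hint that the affected file needs reindexing, rather than letting the decoder crash.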