Summary: | Baloo_file_extractor Crashes in KFileMetaData::TagLibExtractor::extract() on XML files with the .spx extension | ||
---|---|---|---|
Product: | [Frameworks and Libraries] frameworks-kfilemetadata | Reporter: | Laura David Hurka <laura.stern> |
Component: | general | Assignee: | Pinak Ahuja <pinak.ahuja> |
Status: | RESOLVED FIXED | ||
Severity: | crash | CC: | a.stippich, armandogarciasf, asturm, bruno, lagerimsi, laura.stern, nate, stefan.bruens, stream009 |
Priority: | VHI | Keywords: | drkonqi |
Version: | 5.54.0 | ||
Target Milestone: | --- | ||
Platform: | Ubuntu | ||
OS: | Linux | ||
Latest Commit: | https://commits.kde.org/kfilemetadata/61b1916c3e87c3b8f4fc3d1f1d19bf427b9247da | Version Fixed In: | 5.57 |
Sentry Crash Report: | |||
Attachments: | baloo_file_extractor-20190429-184351.kcrash.txt | ||
Description
Laura David Hurka
2019-02-03 22:06:50 UTC
I can reproduce the crash in the KFileMetaData extractor. I think I tracked it down to a taglib bug, https://github.com/taglib/taglib/issues/836. Luckily, I am planning to port away from the buggy function anyway, which should eventually stop the crash from happening.

*** Bug 403710 has been marked as a duplicate of this bug. ***

Got another report in Bug 403710. Looks like XML files that have the .spx extension are a reproducible cause of this crash.

I am now ready to confirm that the .spx file causes the crash. Meanwhile I got other .spx files, and since I ran the following command, baloo does not crash anymore.

    balooctl config add excludeFilters *.spx

*** Bug 404095 has been marked as a duplicate of this bug. ***

Git commit 7415aa60d9f65c2eae10094fd3fff8327f6f11ce by Alexander Stippich.
Committed on 09/02/2019 at 14:48.
Pushed by astippich into branch 'master'.

Use content to determine mime type

Summary: Determine the mime type for the extractors based on the content, not on the file extension. This avoids feeding files with a wrong or the same file extension into the wrong extractor.

Reviewers: ngraham, bruns
Reviewed By: ngraham
Subscribers: kde-frameworks-devel, #baloo
Tags: #frameworks, #baloo
Differential Revision: https://phabricator.kde.org/D18819

M +1 -1 src/file/extractor/app.cpp

https://commits.kde.org/baloo/7415aa60d9f65c2eae10094fd3fff8327f6f11ce

The above commit should make Baloo stop crashing because it removes these files from indexing consideration. The bug is still open because this doesn't actually fix the crash itself, it just avoids it. But from a user perspective, there shouldn't be any more crashes on .spx files starting in KDE Frameworks 5.56.

*** Bug 404077 has been marked as a duplicate of this bug. ***

I have created 26^1 + 26^2 + 26^3 + 26^4 = 475254 files with the same content as the problematic .spx file. The file indexer crashes only at .spx (although *.spx is in the excludeFilter).
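As a cross-check of the file count quoted above, the following Python sketch enumerates every extension of one to four letters. It assumes the test used lowercase a-z only, which the comment does not state; the names `extensions`, `all_extensions` and `total` are illustrative and do not come from the report.

```python
from itertools import product
from string import ascii_lowercase


def extensions(max_len: int = 4):
    """Yield every lowercase extension of 1..max_len letters."""
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


all_extensions = list(extensions())
# 26 + 26**2 + 26**3 + 26**4 candidate extensions in total
total = len(all_extensions)
```

Under that assumption the total matches the 475254 files mentioned in the comment, and `spx` is one of the generated extensions.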
    $ balooctl index Test/*
    [...]
    [...].spx
    Segmentation fault

This was not really useful, but now I can say that files other than .spx with fewer than five letters in the extension do not cause crashes.

Qt Version: 5.11.2
Frameworks Version: 5.54.0
Operating System: Linux 4.15.0-45-generic x86_64
Distribution: KDE neon User Edition 5.14

*** Bug 404420 has been marked as a duplicate of this bug. ***

Git commit 649555ee31820af01869c7bfe8c1e96e5a9abb37 by Alexander Stippich.
Committed on 10/03/2019 at 15:02.
Pushed by astippich into branch 'master'.

Rewrite the taglib extractor to use the generic PropertyMap interface

Summary: Rewrite the taglib extractor to use taglib's PropertyMap. Since this largely unifies the handling of the different tag formats, but not quite, a lot of code is removed. The resulting code is also faster. Additionally, this avoids the usage of a FileRef object, which fixes a potential crash due to a known bug in taglib.

Test Plan: all tests pass

Reviewers: ngraham, bruns, mgallien
Reviewed By: bruns
Subscribers: smithjd, kde-frameworks-devel, #baloo
Tags: #frameworks, #baloo
Differential Revision: https://phabricator.kde.org/D18826

M +0 -2 autotests/taglibextractortest.cpp
M +273 -880 src/extractors/taglibextractor.cpp
M +6 -40 src/extractors/taglibextractor.h

https://commits.kde.org/kfilemetadata/649555ee31820af01869c7bfe8c1e96e5a9abb37

It seems the fix for this problem has caused a regression. Based on the file extension, Matroska containers (mkv, mka, etc.) are recognized as video/x-matroska, audio/x-matroska, and so on. But based on the content signature, they are recognized as the super category application/x-matroska. When the mime type is determined from content, Matroska containers are all recognized as application/x-matroska, so BasicIndexingJob can't determine the correct file type (typesForMimeType() in file/basicindexingjob.cpp). As a consequence, it doesn't index them at all.

Please report as a new bug. :)

Git commit 50a91ff610379c471cea7a8f2aa4d2ea42fa5494 by Stefan Brüns.
Committed on 19/03/2019 at 00:03.
Pushed by bruns into branch 'master'.

[ffmpegextractor] Add Matroska Video test case

Summary: The test file was generated by converting the webm video file, using:

    $> ffmpeg -i test.webm -acodec copy -vcodec copy test.mkv

Depends on D19845

Test Plan: ctest

Reviewers: #baloo, #frameworks, astippich, mgallien, ngraham
Reviewed By: #baloo, ngraham
Subscribers: kde-frameworks-devel
Tags: #frameworks, #baloo
Differential Revision: https://phabricator.kde.org/D19846

M +5 -1 autotests/ffmpegextractortest.cpp
A +- -- autotests/samplefiles/test.mkv

https://commits.kde.org/kfilemetadata/50a91ff610379c471cea7a8f2aa4d2ea42fa5494

Git commit 69c25514cf6a08ceaaacbc4092cc02ff40853228 by Stefan Brüns.
Committed on 27/03/2019 at 01:48.
Pushed by bruns into branch 'master'.

Add helper function to determine mime type based on content and extension

Summary: The QMimeDatabase::MatchDefault only falls back to content matching if the extension is not known. This fails for e.g. Matroska files, where the content allows to distinguish between audio and video files.

Reviewers: #baloo, #frameworks, astippich, ngraham, poboiko
Reviewed By: #baloo, astippich, ngraham
Subscribers: kde-frameworks-devel
Tags: #frameworks, #baloo
Differential Revision: https://phabricator.kde.org/D20045

M +1 -0 src/CMakeLists.txt
A +50 -0 src/mimeutils.cpp [License: LGPL (v2.1+)]
A +55 -0 src/mimeutils.h [License: LGPL (v2.1+)]

https://commits.kde.org/kfilemetadata/69c25514cf6a08ceaaacbc4092cc02ff40853228

Git commit a256687a1d1150341b82cfa17218b12a944cda50 by Alexander Stippich.
Committed on 30/03/2019 at 08:28.
Pushed by astippich into branch 'master'.
Be more precise with mimetype detection

Summary: Use the new mime type helper from KFileMetaData

Reviewers: #baloo, bruns
Reviewed By: #baloo, bruns
Subscribers: kde-frameworks-devel
Tags: #frameworks, #baloo
Differential Revision: https://phabricator.kde.org/D20011

M +2 -1 src/file/extractor/app.cpp

https://commits.kde.org/baloo/a256687a1d1150341b82cfa17218b12a944cda50

Unfortunately not fixed for me with KF 5.57.0.

(In reply to andreas.sturmlechner from comment #18)
> Unfortunately not fixed for me with KF 5.57.0.

*What* is not fixed for you?

If you're using 5.57 and still see crashes, please include a new backtrace and attach the guilty .spx file.

Created attachment 119733 [details]
baloo_file_extractor-20190429-184351.kcrash.txt
Crash happens with any of the .spx files from here: https://github.com/qgis/QGIS/tree/master/tests/testdata/test_gdb.gdb

Please file a new bug report - this is a binary file, not an XML file.

Interesting, a new way for .spx files to crash Baloo. :p

Git commit 61b1916c3e87c3b8f4fc3d1f1d19bf427b9247da by Stefan Brüns.
Committed on 30/04/2019 at 17:24.
Pushed by bruns into branch 'master'.

[TagLibExtractor] Fix crash on invalid Speex files

Summary: TagLib::Ogg::Speex::File::isValid() returns true even for invalid files, but tag() only returns a valid XiphComment when the file is valid. Other TagLib::Ogg::* classes properly clear the valid flag when encountering invalid files.

See https://github.com/taglib/taglib/issues/902

Reviewers: #baloo, #frameworks, ngraham, astippich
Reviewed By: #baloo, ngraham, astippich
Subscribers: kde-frameworks-devel
Tags: #frameworks, #baloo
Differential Revision: https://phabricator.kde.org/D20913

M +3 -1 src/extractors/taglibextractor.cpp

https://commits.kde.org/kfilemetadata/61b1916c3e87c3b8f4fc3d1f1d19bf427b9247da
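The failure mode described in the last commit summary (a file object that reports itself as valid while its tag accessor yields nothing usable) can be illustrated with a small Python sketch. This is not the C++ change from the commit and does not use the TagLib API: `StubTag`, `StubSpeexFile` and `extract_properties` are hypothetical stand-ins invented for this example, and the guard shown is only one plausible shape for such a check.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StubTag:
    """Hypothetical stand-in for a parsed comment block."""
    properties: dict = field(default_factory=dict)


@dataclass
class StubSpeexFile:
    """Hypothetical stand-in for a Speex file object (not the TagLib API)."""
    valid: bool
    parsed_tag: Optional[StubTag]

    def is_valid(self) -> bool:
        return self.valid

    def tag(self) -> Optional[StubTag]:
        return self.parsed_tag


def extract_properties(speex_file: StubSpeexFile) -> dict:
    """Return the tag properties, or an empty dict when no usable tag exists."""
    tag = speex_file.tag()
    # Check the tag itself instead of trusting is_valid() alone.
    if not speex_file.is_valid() or tag is None:
        return {}
    return dict(tag.properties)


# A non-Speex file with an .spx name: reported as valid, but nothing was parsed.
bogus = StubSpeexFile(valid=True, parsed_tag=None)
genuine = StubSpeexFile(valid=True, parsed_tag=StubTag({"TITLE": ["Example"]}))
```

With these stand-ins, `extract_properties(bogus)` returns an empty dict instead of dereferencing a missing tag, while `extract_properties(genuine)` returns the stored properties.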