Bug 475730 - Baloo crashes in KFileMetaData::MobiExtractor::extract
Status: RESOLVED DUPLICATE of bug 475975
Alias: None
Product: frameworks-kfilemetadata
Classification: Frameworks and Libraries
Component: general
Version: 5.110.0
Platform: Fedora RPMs Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Pinak Ahuja
URL:
Keywords: drkonqi
Depends on:
Blocks:
 
Reported: 2023-10-17 07:12 UTC by harveyrasp
Modified: 2023-11-10 14:18 UTC
CC List: 3 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Description harveyrasp 2023-10-17 07:12:00 UTC
Application: baloo_file_extractor (5.110.0)

Qt Version: 5.15.10
Frameworks Version: 5.110.0
Operating System: Linux 6.5.6-200.fc38.x86_64 x86_64
Windowing System: X11
Distribution: Fedora Linux 38 (KDE Plasma)
DrKonqi: 5.27.8 [KCrashBackend]

-- Information about the crash:
It automatically crashed when I logged in.

The crash can be reproduced every time.

-- Backtrace:
Application: Baloo File Extractor (baloo_file_extractor), signal: Segmentation fault

[KCrash Handler]
#4  0x00007fcd86a81cfa in QVector<QTextHtmlParserNode>::realloc(int, QFlags<QArrayData::AllocationOption>) () from /lib64/libQt5Gui.so.5
#5  0x00007fcd86a82199 in QVector<QTextHtmlParserNode>::resize(int) () from /lib64/libQt5Gui.so.5
#6  0x00007fcd86a7aa8c in QTextHtmlParser::newNode(int) () from /lib64/libQt5Gui.so.5
#7  0x00007fcd86a803fe in QTextHtmlParser::parseTag() () from /lib64/libQt5Gui.so.5
#8  0x00007fcd86a809a8 in QTextHtmlParser::parse() () from /lib64/libQt5Gui.so.5
#9  0x00007fcd86aa5b02 in QTextHtmlImporter::QTextHtmlImporter(QTextDocument*, QString const&, QTextHtmlImporter::ImportMode, QTextDocument const*) () from /lib64/libQt5Gui.so.5
#10 0x00007fcd86a62de8 in QTextDocument::setHtml(QString const&) () from /lib64/libQt5Gui.so.5
#11 0x00007fcd743215b9 in KFileMetaData::MobiExtractor::extract(KFileMetaData::ExtractionResult*) () from /usr/lib64/qt5/plugins/kf5/kfilemetadata/kfilemetadata_mobiextractor.so
#12 0x000055d3c43ad1b8 in Baloo::App::index(Baloo::Transaction*, QString const&, unsigned long long) ()
#13 0x000055d3c43aef84 in Baloo::App::processNextFile() ()
#14 0x00007fcd864eba9a in QSingleShotTimer::timerEvent(QTimerEvent*) () from /lib64/libQt5Core.so.5
#15 0x00007fcd864decab in QObject::event(QEvent*) () from /lib64/libQt5Core.so.5
#16 0x00007fcd864b41a8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /lib64/libQt5Core.so.5
#17 0x00007fcd86505a9b in QTimerInfoList::activateTimers() () from /lib64/libQt5Core.so.5
#18 0x00007fcd865063d1 in idleTimerSourceDispatch(_GSource*, int (*)(void*), void*) () from /lib64/libQt5Core.so.5
#19 0x00007fcd84f134fc in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#20 0x00007fcd84f716b8 in g_main_context_iterate.isra () from /lib64/libglib-2.0.so.0
#21 0x00007fcd84f10b83 in g_main_context_iteration () from /lib64/libglib-2.0.so.0
#22 0x00007fcd86506749 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5
#23 0x00007fcd864b2b6b in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5
#24 0x00007fcd864badfb in QCoreApplication::exec() () from /lib64/libQt5Core.so.5
#25 0x000055d3c43a505b in main ()
[Inferior 1 (process 2286) detached]

Reported using DrKonqi
Comment 1 tagwerk19 2023-10-17 07:24:59 UTC
(In reply to harveyrasp from comment #0)
> #11 0x00007fcd743215b9 in KFileMetaData::MobiExtractor::extract(KFileMetaData::ExtractionResult*) () from /usr/lib64/qt5/plugins/kf5/kfilemetadata/kfilemetadata_mobiextractor.so
Looks as if baloo is stumbling over an e-book while trying to extract the text from a .mobi file...

You might get a hint if you run "balooctl monitor"; it should list the files as baloo_file_extractor indexes them. If one file keeps reappearing, move it out of the way and see what happens. (I tend to gzip files to do this, as it's easy to gunzip them afterwards and you don't have to remember what you moved out of the way or where you have to move it back to 8-)
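
The gzip trick can be sketched as a small Python script. This is only an illustration under assumptions: the helper names find_mobi_files and gzip_aside are made up here, and the script does not talk to baloo at all. It just lists .mobi candidates below a directory and compresses a suspect file in place, so that a plain gunzip brings it back later.

```python
import gzip
import shutil
from pathlib import Path


def find_mobi_files(root):
    """Return all .mobi files below root, sorted (hypothetical helper)."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".mobi"
    )


def gzip_aside(path):
    """Compress path to <name>.gz in the same directory and remove the original.

    Roughly what running `gzip <file>` does; `gunzip` restores the file.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    src.unlink()  # the original disappears, only the .gz remains
    return dst
```

A possible workflow would be to compare the output of find_mobi_files() with whatever "balooctl monitor" keeps listing, call gzip_aside() on the suspect, and check whether the crash goes away.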
Comment 2 Nate Graham 2023-10-17 17:57:26 UTC
Similar to Bug 473065.
Comment 3 tagwerk19 2023-10-18 21:56:31 UTC
There's a validator for .epub
    https://www.w3.org/publishing/epubcheck/
but I think not one for .mobi
Comment 4 tagwerk19 2023-10-26 06:43:58 UTC
Have a look at Bug 475975

If you can identify the .mobi that is causing the crash, try converting it with Calibre's ebook-convert. Try converting to .epub or reading and writing it back out again as a .mobi.
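
For anyone who wants to script the conversion, here is a minimal Python sketch. It assumes Calibre's ebook-convert is on the PATH and uses only its basic "ebook-convert INPUT OUTPUT" form. The function names and the "-converted" suffix are choices made for this example (the suffix keeps a .mobi to .mobi round trip from overwriting the input).

```python
import shutil
import subprocess
from pathlib import Path


def convert_command(source, target_suffix=".epub"):
    """Build the argument list for `ebook-convert INPUT OUTPUT`."""
    src = Path(source)
    # Write next to the input under a new name so the original is untouched.
    dst = src.with_name(src.stem + "-converted" + target_suffix)
    return ["ebook-convert", str(src), str(dst)]


def convert(source, target_suffix=".epub"):
    """Run the conversion; return the exit code, or None if Calibre is missing."""
    cmd = convert_command(source, target_suffix)
    if shutil.which(cmd[0]) is None:
        return None  # ebook-convert not found on PATH
    return subprocess.run(cmd, check=False).returncode
```

Calling convert("book.mobi") would attempt the .epub conversion, and convert("book.mobi", ".mobi") the read-and-write-back variant.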
Comment 5 Stefan Brüns 2023-11-10 14:18:25 UTC
Without a reproducer, this is more or less impossible to fix.

This is *likely* a bug in the underlying QMobipocket library, which is essentially unmaintained. The produced output is garbled and no longer a well-formed XML document, which apparently causes the crash in core Qt code later.
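
As a rough illustration of what "well-formed" means here, the following Python sketch runs a strict XML parse over a string. It does not reproduce what QMobipocket or QTextDocument::setHtml do, and the function name is invented for this example. Note also that a strict XML check is only a proxy, since HTML can be valid without being well-formed XML.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True
```

Markup with properly nested and closed tags passes this check, while truncated or mis-nested output (for example a "<p>" that is never closed) does not.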

*** This bug has been marked as a duplicate of bug 475975 ***