Bug 232814 - Nepomuk doesn't log and handle strigi crashes sensibly
Status: RESOLVED DUPLICATE of bug 232398
Alias: None
Product: nepomuk
Classification: Miscellaneous
Component: general
Version: unspecified
Platform: Debian testing Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Sebastian Trueg
URL:
Keywords:
Duplicates: 232395 232402
Depends on:
Blocks:
 
Reported: 2010-03-31 11:50 UTC by Michael Schuerig
Modified: 2011-01-06 20:13 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Description Michael Schuerig 2010-03-31 11:50:58 UTC
Version:            (using KDE 4.4.1)
OS:                Linux
Installed from:    Debian testing/unstable Packages

I had a PDF file from which pdftotext apparently extracted malformed UTF-8 text, which causes an assertion failure in strigi (LineEventAnalyzer::handleUtf8Data). Strigi, in turn, aborts with a SIGABRT.
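To illustrate the kind of check that avoids an abort on bad input, here is a minimal Python sketch. It is not Strigi's code (Strigi is written in C++), and the function names `first_invalid_utf8` and `to_text_lossy` are made up for this example. It only shows that malformed UTF-8 can be detected and reported, or decoded lossily, instead of tripping an assertion.

```python
from typing import Optional


def first_invalid_utf8(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError as err:
        return err.start  # offset of the first byte that could not be decoded
    return None


def to_text_lossy(data: bytes) -> str:
    """Decode anyway, substituting U+FFFD for undecodable bytes."""
    return data.decode("utf-8", errors="replace")
```

A caller could log the offset returned by `first_invalid_utf8` together with the file name and then either skip the file or index the lossy text.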

Nepomuk neither logs an error nor indicates in any other way that something went wrong, which makes it hard to notice the failure in the first place and even harder to track down the file causing it.

Currently, Nepomuk just restarts strigi and starts over from the beginning. Instead, it would be much better to:

- Notify the user that a file is causing problems.
- Exclude this file from indexing as long as it remains unchanged.
- Resume indexing after the offending file.
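The three points above can be sketched as a small supervisor loop. This is a hypothetical Python illustration, not Nepomuk's actual design: it assumes a per-file indexer command exists (supplied by the caller through `make_argv`, an invented parameter), and it uses modification time plus size as a stand-in for "the file is unchanged".

```python
import logging
import os
import subprocess

log = logging.getLogger("indexer-supervisor")  # hypothetical logger name


def fingerprint(path):
    """Identify a file version by modification time and size."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def index_all(paths, make_argv, skip):
    """Index each path in its own child process and survive indexer crashes.

    make_argv(path) returns the command line of an assumed per-file indexer.
    skip maps a path to the fingerprint of the version that made the
    indexer fail; the caller may persist it between runs.
    Returns the paths that failed during this run.
    """
    failed = []
    for path in paths:
        fp = fingerprint(path)
        if skip.get(path) == fp:
            continue  # unchanged since it last broke the indexer: exclude it
        result = subprocess.run(make_argv(path))
        if result.returncode != 0:
            # On POSIX a negative code means the child was killed by a
            # signal, e.g. -6 for SIGABRT.
            log.error("indexer failed on %s (exit status %d)",
                      path, result.returncode)
            skip[path] = fp       # remember the offending version
            failed.append(path)   # lets the caller notify the user
        else:
            skip.pop(path, None)  # indexed fine, clear any old entry
        # In either case the loop continues with the next file.
    return failed
```

Running each file in a separate process keeps a crash local to that file, so the loop resumes with the next path instead of starting over. A changed file gets a new fingerprint and is retried automatically.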

See also
https://sourceforge.net/tracker/?func=detail&atid=856302&aid=2979889&group_id=171000
for a ticket asking strigi to be more helpful when encountering malformed UTF-8.
Comment 1 Martin Steigerwald 2010-03-31 20:42:38 UTC
This is similar to bug #232395, which I reported some days ago, except that in my case the Strigi Nepomuk service complains loudly about too many crashes in ~/.xsession-errors. Did you grep your ~/.xsession-errors for the word "crash"?

But you put emphasis on how Nepomuk handles those crashes. I already did the same in bug #232398.

I think your bug report contains two bug reports. I suggest you add all the information on the UTF-8 related crashes you encounter here, as I am not yet sure whether you are seeing a duplicate of bug #232395; your description sounds different. Then add your suggestions on how Nepomuk should handle those crashes to bug #232398, or report a new wish if your suggestions differ.
Comment 2 Michael Schuerig 2010-03-31 20:50:35 UTC
Martin, I don't agree with your assessment.

Your bug report concerns behavior that occurs in the storage backend Nepomuk uses. In contrast, this bug report is about a problem in the frontend used for extracting data from files.
Comment 3 Martin Steigerwald 2010-03-31 20:59:19 UTC
OK, it seems you have a clearer understanding. Point taken. The bugs are linked; may Sebastian Trueg or some other strigi / nepomuk developer make the final decision on similarity.
Comment 4 Sebastian Trueg 2010-04-07 16:24:08 UTC
I agree that this needs to be done since there are too many cases in which strigi crashes. Are there any takers for this bug?
Comment 5 Sebastian Trueg 2010-07-23 10:56:11 UTC
*** Bug 232402 has been marked as a duplicate of this bug. ***
Comment 6 Sebastian Trueg 2010-07-23 10:57:39 UTC
*** Bug 232395 has been marked as a duplicate of this bug. ***
Comment 7 Sebastian Trueg 2011-01-06 20:13:09 UTC

*** This bug has been marked as a duplicate of bug 232398 ***