Bug 323811

Summary: nepomukindexer crashes with the attached epub.
Product: nepomuk Reporter: Thiago Jung Bauermann <thiago.bauermann>
Component: fileindexer Assignee: Nepomuk Bugs Coordination <nepomuk-bugs>
Status: RESOLVED FIXED    
Severity: crash CC: nepomuk-bugs
Priority: NOR    
Version: 4.11.0   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
Latest Commit: Version Fixed In: 4.11.1
Attachments: Problematic file.

Description Thiago Jung Bauermann 2013-08-21 00:26:10 UTC
Created attachment 81828
Problematic file.

I have more than a thousand crashes of nepomukindexer registered in my dmesg output:

hactar% dmesg G nepomukindexer | tail
[29197.174797] nepomukindexer[28161]: segfault at 0 ip 42e7af81 sp bf816d4c error 4 in libc-2.15.so[42dfd000+1a3000]
[29200.463272] nepomukindexer[28190]: segfault at 0 ip 42e7af81 sp bfd808ac error 4 in libc-2.15.so[42dfd000+1a3000]
[29203.755407] nepomukindexer[28217]: segfault at 0 ip 42e7af81 sp bf924b5c error 4 in libc-2.15.so[42dfd000+1a3000]
[29207.057211] nepomukindexer[28224]: segfault at 0 ip 42e7af81 sp bfcb067c error 4 in libc-2.15.so[42dfd000+1a3000]
[29210.359251] nepomukindexer[28247]: segfault at 0 ip 42e7af81 sp bfb2419c error 4 in libc-2.15.so[42dfd000+1a3000]
[29213.661917] nepomukindexer[28274]: segfault at 0 ip 42e7af81 sp bfb127ac error 4 in libc-2.15.so[42dfd000+1a3000]
[29216.966522] nepomukindexer[28281]: segfault at 0 ip 42e7af81 sp bfc4224c error 4 in libc-2.15.so[42dfd000+1a3000]
[29220.489354] nepomukindexer[28304]: segfault at 0 ip 42e7af81 sp bfd745fc error 4 in libc-2.15.so[42dfd000+1a3000]
[29223.788106] nepomukindexer[28333]: segfault at 0 ip 42e7af81 sp bfbb917c error 4 in libc-2.15.so[42dfd000+1a3000]
[29227.080128] nepomukindexer[28341]: segfault at 0 ip 42e7af81 sp bfd5bf5c error 4 in libc-2.15.so[42dfd000+1a3000]
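The fields in these kernel lines can be decoded mechanically. The sketch below assumes the x86 page-fault error-code layout (bit 0: page present, bit 1: write access, bit 2: user mode); the struct and function names are made up for illustration. For the lines above it reports a user-mode read of address 0, that is, a null pointer dereference, at offset 0x7df81 inside libc-2.15.so.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Fields of one kernel "segfault at ..." line (x86 format, as in the log above).
// All names here are illustrative, not part of any kernel or KDE API.
struct SegfaultInfo {
    std::uint64_t faultAddr;  // address whose access faulted
    std::uint64_t ip;         // instruction pointer at the time of the fault
    unsigned error;           // page-fault error code (printed in hex)
    std::uint64_t mapBase;    // start of the mapping that contains ip
    std::uint64_t mapSize;    // size of that mapping
};

// Parses a dmesg segfault line; returns false if the line does not match.
bool parseSegfault(const std::string& line, SegfaultInfo& out) {
    const std::size_t pos = line.find("segfault at ");
    if (pos == std::string::npos) return false;

    unsigned long long addr = 0, ip = 0, sp = 0, base = 0, size = 0;
    unsigned err = 0;
    // "%*[^[]" skips the object name (e.g. "libc-2.15.so") up to the '['.
    const int n = std::sscanf(line.c_str() + pos,
        "segfault at %llx ip %llx sp %llx error %x in %*[^[][%llx+%llx]",
        &addr, &ip, &sp, &err, &base, &size);
    if (n != 6) return false;

    out = SegfaultInfo{addr, ip, err, base, size};
    return true;
}

// Offset of the faulting instruction inside the mapped object.
std::uint64_t offsetInObject(const SegfaultInfo& s) { return s.ip - s.mapBase; }

// Assumed x86 error-code bits: 1 = page present, 2 = write, 4 = user mode.
bool isUserMode(const SegfaultInfo& s)    { return (s.error & 4u) != 0; }
bool isWrite(const SegfaultInfo& s)       { return (s.error & 2u) != 0; }
bool pagePresent(const SegfaultInfo& s)   { return (s.error & 1u) != 0; }
bool isNullDeref(const SegfaultInfo& s)   { return s.faultAddr == 0; }
```

The offset returned by offsetInObject can then be handed to a symbolizer such as addr2line, together with the matching libc debug symbols, to find the libc function that was running.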

In ~/.xsession-errors I have a lot of entries like this:

nepomukindexer(28543)/nepomuk (library) Nepomuk2::ResourceManagerPrivate::_k_storageServiceInitialized: Nepomuk Storage service up and initialized.
nepomukindexer(28543)/nepomuk (strigi service) Nepomuk2::Indexer::indexFile:  QUrl( "nepomuk:/res/f507b5d9-f08e-47ea-b8a8-e0970bafdc22" )  "application/epub+zip"
TOC:1: parser error : Entity 'nbsp' not defined
          <text>&nbsp;</text>
                      ^
libepub (EE):   failed to parse toc
nepomukindexer(28543)/nepomuk (strigi service) Nepomuk2::ExtractorPlugin::dateTimeFromString: Could not determine correct datetime format from: "creation: 2011-02-22;publication: 1999" 

Where nepomuk:/res/f507b5d9-f08e-47ea-b8a8-e0970bafdc22 is the file I am attaching to this bug report. Running nepomukindexer manually gives:

hactar% nepomukindexer The-Last-Ring-bearer-Kirill-Yeskov.epub
nepomukindexer(25560)/nepomuk (library) Nepomuk2::ResourceManagerPrivate::_k_storageServiceInitialized: Nepomuk Storage service up and initialized.
nepomukindexer(25560)/nepomuk (strigi service) Nepomuk2::Indexer::indexFile:  QUrl( "nepomuk:/res/f507b5d9-f08e-47ea-b8a8-e0970bafdc22" )  "application/epub+zip"
TOC:1: parser error : Entity 'nbsp' not defined
          <text>&nbsp;</text>
                      ^
libepub (EE):   failed to parse toc
nepomukindexer(25560)/nepomuk (strigi service) Nepomuk2::ExtractorPlugin::dateTimeFromString: Could not determine correct datetime format from: "creation: 2011-02-22;publication: 1999" 
zsh: segmentation fault (core dumped)  nepomukindexer The-Last-Ring-bearer-Kirill-Yeskov.epub

I am using KDE 4.11.0 on Kubuntu 12.04.2 LTS.

Comment 1 Simeon Bird 2013-08-27 22:37:09 UTC
Git commit 40ebf74c005c9b7c602d1e2ab1f8af96a9415a29 by Simeon Bird.
Committed on 27/08/2013 at 22:29.
Pushed by sbird into branch 'KDE/4.11'.

epubextractor: Fix a potential crash.

While I can't reproduce this crash, there is the possibility for a
pointer to be null in about the right place. Hopefully this fixes the
crash.
FIXED-IN: 4.11.1

M  +4    -1    services/fileindexer/indexer/epubextractor.cpp

http://commits.kde.org/nepomuk-core/40ebf74c005c9b7c602d1e2ab1f8af96a9415a29
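For readers unfamiliar with this class of bug, the following is a minimal sketch of the kind of null guard the commit message describes. It is not the code from the commit: TocIterator, openToc and collectTocText are hypothetical stand-ins, and the assumption is only that a parser call can hand back a null pointer when the table of contents fails to parse (as in the "failed to parse toc" log above).

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a table-of-contents iterator; not libepub's real type.
struct TocIterator {
    std::vector<std::string> labels;
};

// Hypothetical stand-in for a parser call that returns a null pointer
// when the TOC cannot be parsed (for example, because of an undefined entity).
std::unique_ptr<TocIterator> openToc(bool tocIsWellFormed) {
    if (!tocIsWellFormed) {
        return nullptr;
    }
    auto toc = std::make_unique<TocIterator>();
    toc->labels = {"Chapter 1", "Chapter 2"};
    return toc;
}

// Collects the TOC labels as plain text for indexing.
// Without the null check, a malformed TOC would be dereferenced and crash.
std::string collectTocText(bool tocIsWellFormed) {
    std::string text;
    const std::unique_ptr<TocIterator> toc = openToc(tocIsWellFormed);
    if (!toc) {
        return text;  // nothing to index; skip the TOC instead of crashing
    }
    for (const std::string& label : toc->labels) {
        text += label;
        text += '\n';
    }
    return text;
}
```

With this guard, a file whose TOC cannot be parsed simply contributes no TOC text, and the rest of its metadata can still be indexed.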