Version: (using Devel)
Compiler: gcc 4.5
OS: Linux
Installed from: Compiled sources

Hi,

I think that since strigi 0.7.2 I have had a problem indexing PDF files. Whenever the indexer hits a PDF, it hogs one CPU core and simply hangs. When I do a manual run with strigicmd, I get something like the following log output:

'' is not a UTF8 or latin1 string
Error in parsing: Keyword obj not found.

That's when the process starts hanging.
I think I have the same problem here in KDE SC 4.5 compiled from trunk on 2010-06-05. When I check the process list I see "/home/kde-devel/kde/bin/nepomukservicestub nepomukstrigiservice" using all the CPU. I attached to it with GDB and got this backtrace:

(gdb) thread apply all where

Thread 2 (Thread 0x7f75cabe6710 (LWP 18485)):
#0  0x00000034492c44cd in read () from /lib/libc.so.6
#1  0x000000344926e42f in ?? () from /lib/libc.so.6
#2  0x0000003449263e09 in fread () from /lib/libc.so.6
#3  0x00007f75d0c180d5 in Strigi::SkippingFileInputStream::read(char const*&, int, int) () from /home/kde-devel/kde/lib/libstreams.so.0
#4  0x00007f75d0c01044 in Strigi::DataEventInputStream::read(char const*&, int, int) () from /home/kde-devel/kde/lib/libstreams.so.0
#5  0x00007f75d0eef6bd in PdfParser::read(int, int) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#6  0x00007f75d0eef7f0 in PdfParser::checkForData(int) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#7  0x00007f75d0eefa7d in PdfParser::skipNotFromString(char const*, int) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#8  0x00007f75d0ef0331 in PdfParser::parseName() () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#9  0x00007f75d0ef05aa in PdfParser::parseDictionaryOrStream() () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#10 0x00007f75d0ef0fcb in PdfParser::parseObjectStreamObject(int) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#11 0x00007f75d0ef1689 in PdfParser::parseObjectStreamObjectDef() () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#12 0x00007f75d0ef17eb in PdfParser::parse(Strigi::StreamBase<char>*) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#13 0x00007f75d0f289c1 in PdfEndAnalyzer::analyze(Strigi::AnalysisResult&, Strigi::StreamBase<char>*) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#14 0x00007f75d0efdfdc in Strigi::StreamAnalyzerPrivate::analyze(Strigi::AnalysisResult&, Strigi::StreamBase<char>*) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#15 0x00007f75d0efdac4 in Strigi::StreamAnalyzer::analyze(Strigi::AnalysisResult&, Strigi::StreamBase<char>*) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#16 0x00007f75d0ebfdf0 in Strigi::AnalysisResult::index(Strigi::StreamBase<char>*) () from /home/kde-devel/kde/lib/libstreamanalyzer.so.0.7
#17 0x00007f75cbced915 in Nepomuk::IndexScheduler::analyzeFile (this=0x1ecb4b0, file=..., analyzer=0x7f75cabe5dd0) at /home/kde-devel/kde/src/kdebase/runtime/nepomuk/services/strigi/indexscheduler.cpp:429
#18 0x00007f75cbced3b6 in Nepomuk::IndexScheduler::updateDir (this=0x1ecb4b0, dir=..., analyzer=0x7f75cabe5dd0, flags=...) at /home/kde-devel/kde/src/kdebase/runtime/nepomuk/services/strigi/indexscheduler.cpp:395
#19 0x00007f75cbcec9e7 in Nepomuk::IndexScheduler::run (this=0x1ecb4b0) at /home/kde-devel/kde/src/kdebase/runtime/nepomuk/services/strigi/indexscheduler.cpp:296
#20 0x00007f75d52d2570 in QThreadPrivate::start (arg=0x1ecb4b0) at thread/qthread_unix.cpp:266
#21 0x0000003449e068e4 in start_thread () from /lib/libpthread.so.0
#22 0x00000034492d129d in clone () from /lib/libc.so.6

Thread 1 (Thread 0x7f75d1b5b760 (LWP 18409)):
#0  0x00000034492c8573 in poll () from /lib/libc.so.6
#1  0x000000344d23e6bc in ?? () from /usr/lib/libglib-2.0.so.0
#2  0x000000344d23ea00 in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#3  0x00007f75d544414f in QEventDispatcherGlib::processEvents (this=0x1d6d120, flags=...) at kernel/qeventdispatcher_glib.cpp:412
#4  0x00007f75d2d4f588 in QGuiEventDispatcherGlib::processEvents (this=0x1d6d120, flags=...) at kernel/qguieventdispatcher_glib.cpp:204
#5  0x00007f75d5401e8c in QEventLoop::processEvents (this=0x7fffe0dfa7f0, flags=...) at kernel/qeventloop.cpp:149
#6  0x00007f75d5401fe2 in QEventLoop::exec (this=0x7fffe0dfa7f0, flags=...) at kernel/qeventloop.cpp:201
#7  0x00007f75d5405590 in QCoreApplication::exec () at kernel/qcoreapplication.cpp:1009
#8  0x00007f75d2c60ab0 in QApplication::exec () at kernel/qapplication.cpp:3665
#9  0x0000000000404102 in main (argc=2, argv=0x7fffe0dfad58) at /home/kde-devel/kde/src/kdebase/runtime/nepomuk/servicestub/main.cpp:152
Can't fix this unless I have a copy of the file in question :(
(In reply to comment #2)
> Can't fix this unless I have a copy of the file in question :(

Actually, it was almost any PDF. And indeed it _was_, because since the 4.5 release the problem has simply vanished.
Closing as fixed, based on the last comment about 4.5.