Application: nepomukindexer (0.1.0)
KDE Platform Version: 4.9.95 (Compiled from sources)
Qt Version: 4.8.4
Operating System: Linux 3.6.6-1-CHAKRA x86_64
Distribution: "Chakra Linux"

-- Information about the crash:
While trying to index a specific PDF file (compendioDeLaRiquezadeLasNaciones.pdf), I get this crash and the system tries over and over to index the file.

KDE 4.10 RC1-2 (Chakra RC1 with soprano, kdepim-runtime and nepomuk-core compiled from KDE/4.10 branches)

The crash can be reproduced every time.

-- Backtrace:
Application: NepomukIndexer (nepomukindexer), signal: Segmentation fault
[KCrash Handler]
#5 0x00007f6a1976a841 in Poppler::Page::text(QRectF const&, Poppler::Page::TextLayout) const () from /usr/lib/libpoppler-qt4.so.4
#6 0x00007f6a1976a9bb in Poppler::Page::text(QRectF const&) const () from /usr/lib/libpoppler-qt4.so.4
#7 0x00007f6a199a2fea in Nepomuk2::PopplerExtractor::extract (this=<optimized out>, resUri=<optimized out>, fileUrl=<optimized out>, mimeType=<optimized out>) at /root/nepomuk-core/services/fileindexer/indexer/popplerextractor.cpp:98
#8 0x000000000040a612 in Nepomuk2::Indexer::fileIndex (this=0x7fff1b18b310, uri=..., url=..., mimeType=...) at /root/nepomuk-core/services/fileindexer/indexer/indexer.cpp:146
#9 0x000000000040b170 in Nepomuk2::Indexer::indexFile (this=0x7fff1b18b310, url=...) at /root/nepomuk-core/services/fileindexer/indexer/indexer.cpp:101
#10 0x000000000040860e in main (argc=2, argv=0x7fff1b18b478) at /root/nepomuk-core/services/fileindexer/indexer/main.cpp:113

Reported using DrKonqi
*** Bug 312701 has been marked as a duplicate of this bug. ***
Further duplicates are probably: https://bugs.kde.org/show_bug.cgi?id=312633 https://bugs.kde.org/show_bug.cgi?id=312673
*** Bug 312633 has been marked as a duplicate of this bug. ***
*** Bug 312673 has been marked as a duplicate of this bug. ***
Related to this: the old Nepomuk code skipped every file that led to crashes, but the new code doesn't. Keep that in mind. Thanks for the quick fix, Jörg.
Thanks for looking into it and fixing it.. but: where and how? http://techbase.kde.org/Projects/Nepomuk/Repositories <-- dead links in here..
(In reply to comment #6)
> Thanks for looking into it and fixing it.. but: where and how?
> http://techbase.kde.org/Projects/Nepomuk/Repositories <-- dead links in
> here..

It has been fixed in the nepomuk-core repository.
*** Bug 312818 has been marked as a duplicate of this bug. ***
*** Bug 312864 has been marked as a duplicate of this bug. ***
*** Bug 312922 has been marked as a duplicate of this bug. ***
(In reply to comment #5)
> Related to this: the old Nepomuk code skipped every file that led to
> crashes, but the new code doesn't. Keep that in mind.

That explains why I've never been able to find anything in my PDF files.
*** Bug 312937 has been marked as a duplicate of this bug. ***
*** Bug 313042 has been marked as a duplicate of this bug. ***
(In reply to comment #7)
> It has been fixed in the nepomuk-core repository.

Has it been fixed so that the files are ignored as previously, or so that they can now be indexed?
(In reply to comment #14)
> Has it been fixed so that the files are ignored as previously, or so that
> they can now be indexed?

Yes, the "broken" PDF files will be indexed normally, except that the plainTextContent will not be available and the new title extraction method does not work for them.
(In reply to comment #15)

I checked out the code changes, and it seems that the analysis of the page in question is basically cancelled.

I'm wondering why the crash happens at all. I have a lot of PDFs for the Pathfinder Roleplaying Game that I'd like to be able to run a full-text search on. Text extraction in Okular with the Poppler backend works fine on these files, and Spotlight on Mac OS X indexes them without problems.

Is it possible to determine from the crash data which file actually causes the problems?
(In reply to comment #16)
> [...] Text extraction in Okular with Poppler
> backend works fine in these files, and Spotlight on Mac OS X indexes them
> without problems.

The test file I got showed just a black page when opened with Okular, so I assume your Pathfinder PDFs will be fine.

(In reply to comment #16)
> Is it possible to determine from the crash data which file
> actually causes the problems?

Now it is: I have added kWarning output at the places where the extraction is now skipped. So you can simply run nepomukfileindexer <folder/file> and check the output.
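
For anyone who wants a picture of the "skip the page and warn" behaviour discussed in the last few comments, here is a minimal, self-contained sketch. It is not the code from nepomuk-core and does not use the Poppler API: FakePage, ExtractionResult and extractText are hypothetical stand-ins, and a plain message on std::cerr stands in for kWarning. It only illustrates the idea that an unusable page is skipped with a warning naming the file, while the text of the remaining pages is still collected.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for a PDF page; this is NOT the Poppler API.
struct FakePage {
    bool readable;     // false models a page whose text cannot be extracted
    std::string text;  // text content of a readable page
};

// Hypothetical result type: the collected text plus the pages that were skipped.
struct ExtractionResult {
    std::string plainText;                  // text of all readable pages, one per line
    std::vector<std::size_t> skippedPages;  // zero-based indices of skipped pages
};

// Walk over the pages; skip unusable ones with a warning instead of aborting.
ExtractionResult extractText(const std::string& fileName,
                             const std::vector<FakePage>& pages) {
    ExtractionResult result;
    for (std::size_t i = 0; i < pages.size(); ++i) {
        if (!pages[i].readable) {
            // Stand-in for the kWarning output mentioned in the thread.
            std::cerr << "warning: skipping page " << i
                      << " of " << fileName << '\n';
            result.skippedPages.push_back(i);
            continue;
        }
        result.plainText += pages[i].text;
        result.plainText += '\n';
    }
    return result;
}
```

Because the warning names the file, the output of an indexing run is enough to tell which documents were only partially indexed.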