Application: nepomukindexer (0.1.0)
KDE Platform Version: 4.10.00
Qt Version: 4.8.3
Operating System: Linux 3.5.0-23-generic x86_64
Distribution: Ubuntu 12.10
-- Information about the crash:
NepomukIndexer has been segfaulting periodically since I updated from 4.9.97 to 4.10 earlier this morning. I've made no changes to the configuration since the upgrade.
The crash can be reproduced some of the time.
Application: NepomukIndexer (nepomukindexer), signal: Segmentation fault
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
#5 beginWord (y0=<optimized out>, x0=<optimized out>, state=0x168cdf0, this=0x1665130) at TextOutputDev.cc:2273
#6 TextPage::beginWord (this=0x1665130, state=0x168cdf0, x0=<optimized out>, y0=<optimized out>) at TextOutputDev.cc:2237
#7 0x00007fba1560d675 in TextPage::addChar (this=0x1665130, state=0x168cdf0, x=<optimized out>, y=<optimized out>, dx=<optimized out>, dy=<optimized out>, c=0, nBytes=1, u=0x171d940, uLen=1) at TextOutputDev.cc:2373
#8 0x00007fba156144cc in ActualText::end (this=0x1697a50, state=0x168cdf0) at TextOutputDev.cc:5276
#9 0x00007fba1559d1d1 in Gfx::opEndMarkedContent (this=0x171c0d0, args=<optimized out>, numArgs=<optimized out>) at Gfx.cc:5088
#10 0x00007fba1559e9a4 in Gfx::go (this=this@entry=0x171c0d0, topLevel=topLevel@entry=true) at Gfx.cc:716
#11 0x00007fba1559ee10 in Gfx::display (this=0x171c0d0, obj=<optimized out>, topLevel=<optimized out>) at Gfx.cc:682
#12 0x00007fba155df1d4 in Page::displaySlice (this=0x16cda20, out=0x16169b0, hDPI=<optimized out>, vDPI=<optimized out>, rotate=0, useMediaBox=208, crop=<optimized out>, sliceX=-1, sliceY=-1, sliceW=-1, sliceH=-1, printing=false, abortCheckCbk=0x0, abortCheckCbkData=0x0, annotDisplayDecideCbk=0x0, annotDisplayDecideCbkData=0x0) at Page.cc:519
#13 0x00007fba1591fa6b in Poppler::Page::text (this=0x15b4700, r=..., textLayout=textLayout@entry=Poppler::Page::PhysicalLayout) at poppler-page.cc:354
#14 0x00007fba1591fb6b in Poppler::Page::text (this=<optimized out>, r=...) at poppler-page.cc:374
#15 0x00007fba15b8042e in Nepomuk2::PopplerExtractor::extract (this=<optimized out>, resUri=..., fileUrl=..., mimeType=...) at ../../../../services/fileindexer/indexer/popplerextractor.cpp:104
#16 0x000000000040a412 in Nepomuk2::Indexer::fileIndex (this=this@entry=0x7fff085ee890, uri=..., url=..., mimeType=...) at ../../../../services/fileindexer/indexer/indexer.cpp:146
#17 0x000000000040af40 in Nepomuk2::Indexer::indexFile (this=0x7fff085ee890, url=...) at ../../../../services/fileindexer/indexer/indexer.cpp:101
#18 0x000000000040840e in main (argc=2, argv=0x7fff085ee9f8) at ../../../../services/fileindexer/indexer/main.cpp:113
Reported using DrKonqi
I discovered this was happening while attempting to index one specific .pdf on an NFS share. When I moved the file to a local directory, the segfault stopped and the file was indexed. I then moved the file back to the original NFS directory and it was indexed there without segfaulting.
Possibly close this as a fluke?
Definitely not a fluke. Do you think you could possibly upload that pdf file? The poppler indexer seems to be crashing.
Sorry Vishesh, It's a .pdf of a copyrighted book that was scanned. Not sure I can do that. Is it trying to go into the .pdf to index the contents? It being a scanned book, I'd assume the pages are jpeg or png files.
Would it be possible for you to maybe split the pdf into a number of different pages? Maybe it is just one of the pages. ( http://stackoverflow.com/questions/10228592/splitting-a-pdf-with-ghostscript )
You can run $ nepomukindexer <fileName> on each of those pages to see whether it produces a crash.
Also, does this file open in Okular? Okular also uses QtPoppler to render the file.
(In reply to comment #4)
> Would it be possible for you to maybe split the pdf into a number of
> different pages? Maybe it is just one of the pages. ( http://stackoverflow.com/questions/10228592/splitting-a-pdf-with-ghostscript )
> You can run the $ nepomukindexer <fileName> on each of those pages to see if
> it produces a crash.
I split each page into an individual .pdf and wrote a script to index each with nepomukindexer. One page gives an "Error (197): Command token too long" but Nepomuk doesn't crash. It always crashes on the original file when I run nepomukindexer against it.
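For anyone wanting to repeat this per-page test, here is a minimal sketch. It is not the script used above (that one was not posted). The function names split_pages and check_pages are made up for illustration, and the gs call assumes Ghostscript's pdfwrite device writes one file per page when the output name contains a %d pattern.

```shell
#!/bin/sh
# Hypothetical helpers; nothing runs until you call them.

# split_pages INPUT.pdf OUTDIR
# Assumption: gs with -sDEVICE=pdfwrite and a %d pattern in the -o name
# produces one PDF per page.
split_pages() {
    mkdir -p "$2" || return 1
    gs -q -sDEVICE=pdfwrite -o "$2/page-%04d.pdf" "$1"
}

# check_pages DIR [INDEXER]
# Runs INDEXER (default: nepomukindexer) on every .pdf in DIR and prints the
# files for which it exits with a non-zero status.
# A status of 128+N usually means the process was killed by signal N
# (139 for SIGSEGV), though a crash handler may report something else.
check_pages() {
    dir=$1
    indexer=${2:-nepomukindexer}
    for page in "$dir"/*.pdf; do
        [ -e "$page" ] || continue
        status=0
        "$indexer" "$page" >/dev/null 2>&1 || status=$?
        if [ "$status" -ne 0 ]; then
            echo "exit $status: $page"
        fi
    done
}
```

Typical use would be split_pages book.pdf pages followed by check_pages pages, where book.pdf and pages are placeholder names. The indexer's own messages are discarded here, so an error such as the one above would only show up if it also changes the exit status.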
I rebuilt the file omitting the offending page, and nepomukindexer indexes it without crashing and without the error. I then put the page back in and, again, nepomukindexer indexes it without crashing and without the error. Just to verify ghostscript didn't modify anything, I split the page back out of the new file. Running nepomukindexer on the newly extracted page again results in no crash, but the error mentioned above still appears.
> Also, does this file open in okular? Cause Okular also uses QtPoppler to render the file.
Okular displays the file just fine with no errors or warnings. However, ghostscript gives the following warnings, on the original file only:
**** Warning: File has an invalid xref entry: 13533. Rebuilding xref table.
**** Warning: There are objects with matching object and generation
**** numbers. The accuracy of the resulting image is unknown.
For me, Nepomuk crashes with exactly the same backtrace, with the difference that the crash does not seem to be related to indexing. In my case the indexing has already finished, and every time my display goes dark because I am away from the keyboard, I get the Nepomuk crash.
If you need further info or if I should file a new bug report, please contact me.
I looked into the issue again. It seems that Nepomuk tries to index a PDF when the PC is idle, and this PDF (>500 pages) causes the crash.
Interestingly, this PDF is rendered fine by poppler and was indexed fine by Nepomuk before 4.10.
(In reply to comment #7)
> Relooked into the issue. It seems that Nepomuk try to index a pdf, when the
> PC is in idle and this PDF (>500 page) document causes the crasher.
> Interestingly, this pdf is rendered fine by poppler and this pdf was indexed
> fine by Nepomuk, before 4.10.
Please provide the backtrace of running the nepomukindexer on that file. Also, it would be awesome if you could provide me the file either privately, or upload it on bugzilla.
The entire Nepomuk indexing architecture has changed considerably over the course of 4.10.
Created attachment 77400
Here is the backtrace of the crash. I will send you the file via email.
Hmm. So both of you have the same backtrace, and the file indexes fine for me. Could you tell me which versions of poppler and poppler-qt you have installed?
Mine is -
extra/poppler 0.22.1-1 [installed]
PDF rendering library based on xpdf 3.0
extra/poppler-qt 0.22.1-1 [installed]
Poppler Qt bindings
Seems like it's a poppler issue. I get a crash with Okular on page 366 (document page 397) using poppler(-qt) 0.20.4.
Could you please try updating poppler?
I have the same version of poppler and poppler-qt as TheGhost - 0.20.4. Unfortunately, this seems to be the latest available for (K)Ubuntu 12.10.
Unlike TheGhost, I can't say whether this file indexed fine prior to my 4.10 update, because I wasn't using Nepomuk indexing then. I turned it on after the update to see if the memory/system resource problems of the past had been ironed out. Everything seems fine other than this one file, so I may continue using it from this point on.
Cool. Marking this as RESOLVED -> UPSTREAM.
@Rohan: Please update the poppler packages for kubuntu.
Acknowledged. I'll update this tomorrow.
Updated to poppler 0.22 and nepomukindexer no longer crashes indexing that file.
For anyone on Ubuntu who wishes to upgrade, I used the build files that Matthieu Baerts made for Raring for poppler 0.22. I grabbed the poppler_0.22.0-0ubuntu0~matttbe2.debian.tar.gz, poppler_0.22.0-0ubuntu0~matttbe2.dsc and poppler_0.22.0.orig.tar.gz files from https://launchpad.net/~matttbe/+archive/ppa/+sourcepub/2903898/+listing-archive-extra.
For poppler-data, I used the Raring packages by Hideki Yamane. I grabbed all four files from https://launchpad.net/ubuntu/+source/poppler-data/0.4.6-2.
To build, just uncompress the source files, then uncompress the .debian.tar.gz into the source directory and place the .dsc file there as well. Run 'dpkg-buildpackage -rfakeroot -uc -b' from within the source directory and it will build the source and place a .deb file one directory up.
I'm not sure what the ai0 files are, but they're included in the original poppler 0.20 for Ubuntu, so I made sure to uncompress that into the poppler-data source directory as well.
To make sure you have the required dependencies to build poppler, use 'sudo apt-get build-dep poppler' to install them. Poppler-data has no build dependencies.
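The build steps above could be scripted roughly like this. It is only a sketch under assumptions: the helper names step and build_deb are invented, the tarballs are assumed to sit in the current directory, and the source directory name has to be whatever the orig tarball actually unpacks to (check with tar tzf). Copying the .dsc is left out because dpkg-buildpackage -b works from the unpacked debian/ directory. Install the build dependencies first with sudo apt-get build-dep poppler.

```shell
#!/bin/sh
# Sketch of the manual build; nothing runs until build_deb is called.
# Set DRY_RUN=1 to print the commands instead of executing them.

# step DIR CMD...: run CMD inside DIR, or only print it when DRY_RUN is set.
step() {
    step_dir=$1
    shift
    if [ -n "${DRY_RUN:-}" ]; then
        echo "[$step_dir] $*"
    else
        (cd "$step_dir" && "$@")
    fi
}

# build_deb ORIG_TARBALL DEBIAN_TARBALL SRCDIR
# Both tarballs are file names in the current directory. SRCDIR is the
# directory ORIG_TARBALL unpacks to, directly below the current directory
# (an assumption, not taken from the packages themselves).
build_deb() {
    orig=$1
    debian=$2
    srcdir=$3
    # Unpack the upstream source.
    step . tar xzf "$orig" || return 1
    # Unpack the debian/ packaging directory into the source tree.
    step "$srcdir" tar xzf "../$debian" || return 1
    # Build unsigned binary packages; the .deb files land one directory up.
    step "$srcdir" dpkg-buildpackage -rfakeroot -uc -b
}
```

For example, DRY_RUN=1 build_deb poppler_0.22.0.orig.tar.gz 'poppler_0.22.0-0ubuntu0~matttbe2.debian.tar.gz' poppler-0.22.0 prints the three commands without running them; poppler-0.22.0 is a guess at the unpacked directory name.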
Thank you Vishesh.
*** Bug 315732 has been marked as a duplicate of this bug. ***