Application: kfilemetadatareader ()
KDE Platform Version: 4.7.00 (4.7.0) (Compiled from sources)
Qt Version: 4.7.2
Operating System: Linux 3.0.3 x86_64
Distribution: "Fedora release 14 (Laughlin)"

-- Information about the crash:
- What I was doing when the application crashed:
Opening a directory in dolphin. The filenames contain a lot of Swedish UTF-8 characters, and the crash in parseName might be because of this...

The crash can be reproduced every time.

-- Backtrace:
Application: (kfilemetadatareader), signal: Aborted
[KCrash Handler]
#5 0x00007f02e5adc9a5 in raise () from /lib64/libc.so.6
#6 0x00007f02e5ade185 in abort () from /lib64/libc.so.6
#7 0x00007f02e637e08d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#8 0x00007f02e637c2a6 in ?? () from /usr/lib64/libstdc++.so.6
#9 0x00007f02e637c2d3 in std::terminate() () from /usr/lib64/libstdc++.so.6
#10 0x00007f02e637c3de in __cxa_throw () from /usr/lib64/libstdc++.so.6
#11 0x00007f02e6326190 in std::__throw_length_error(char const*) () from /usr/lib64/libstdc++.so.6
#12 0x00007f02e635f4da in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(char const*, unsigned long) () from /usr/lib64/libstdc++.so.6
#13 0x00007f02e420f897 in PdfParser::parseName() () from /usr/lib64/libstreamanalyzer.so.0
#14 0x00007f02e4210546 in PdfParser::parseContentStreamObject() () from /usr/lib64/libstreamanalyzer.so.0
#15 0x00007f02e421078a in PdfParser::parseContentStream(Strigi::StreamBase<char>*) () from /usr/lib64/libstreamanalyzer.so.0
#16 0x00007f02e4211092 in PdfParser::handleSubStream(Strigi::StreamBase<char>*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int) () from /usr/lib64/libstreamanalyzer.so.0
#17 0x00007f02e4210f1d in PdfParser::handleSubStream(Strigi::StreamBase<char>*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int, bool, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib64/libstreamanalyzer.so.0
#18 0x00007f02e4210088 in PdfParser::parseDictionaryOrStream() () from /usr/lib64/libstreamanalyzer.so.0
#19 0x00007f02e4210439 in PdfParser::parseObjectStreamObject(int) () from /usr/lib64/libstreamanalyzer.so.0
#20 0x00007f02e4210af7 in PdfParser::parseObjectStreamObjectDef() () from /usr/lib64/libstreamanalyzer.so.0
#21 0x00007f02e4210c46 in PdfParser::parse(Strigi::StreamBase<char>*) () from /usr/lib64/libstreamanalyzer.so.0
#22 0x00007f02e423e327 in PdfEndAnalyzer::analyze(Strigi::AnalysisResult&, Strigi::StreamBase<char>*) () from /usr/lib64/libstreamanalyzer.so.0
#23 0x00007f02e4218aca in Strigi::StreamAnalyzerPrivate::analyze(Strigi::AnalysisResult&, Strigi::StreamBase<char>*) () from /usr/lib64/libstreamanalyzer.so.0
#24 0x00007f02e42185b2 in Strigi::StreamAnalyzer::analyze(Strigi::AnalysisResult&, Strigi::StreamBase<char>*) () from /usr/lib64/libstreamanalyzer.so.0
#25 0x00007f02e9a1dd8a in KFileMetaInfoPrivate::init (this=0xbc2fb0, stream=..., url=..., mtime=1314006747, w=...) at /usr/local/src/kde/4.7.0/kdelibs-4.7.0/kio/kio/kfilemetainfo.cpp:259
#26 0x00007f02e9a1e068 in KFileMetaInfo::KFileMetaInfo (this=0x7fffd62ab230, path=..., w=...) at /usr/local/src/kde/4.7.0/kdelibs-4.7.0/kio/kio/kfilemetainfo.cpp:288
#27 0x0000000000402f45 in readFileMetaData (urls=...) at /usr/local/src/kde/4.7.0/kdelibs-4.7.0/kio/kfile/kfilemetadatareaderprocess.cpp:85
#28 0x00000000004036b4 in readFileAndContextMetaData (urls=...) at /usr/local/src/kde/4.7.0/kdelibs-4.7.0/kio/kfile/kfilemetadatareaderprocess.cpp:146
#29 0x0000000000403fdc in main (argc=2, argv=0x7fffd62ab8f8) at /usr/local/src/kde/4.7.0/kdelibs-4.7.0/kio/kfile/kfilemetadatareaderprocess.cpp:195

Reported using DrKonqi
It crashes when parsing a PDF file. If you can find the PDF file that causes the crash, please attach it.
Created attachment 63140
File that crashed it.

This is the only PDF file in that directory. Publicly available from the government here.
Created attachment 64235
New crash information added by DrKonqi

kfilemetadatareader () on KDE Platform 4.7.1 (4.7.1) using Qt 4.7.2

- What I was doing when the application crashed:
Hovered over a PDF file; no special characters in the filename as far as I can see.

-- Backtrace (Reduced):
#14 0x00007f78c16c130d in PdfParser::parseName() () from /usr/lib64/libstreamanalyzer.so.0
#15 0x00007f78c16c1a73 in PdfParser::parseDictionaryOrStream() () from /usr/lib64/libstreamanalyzer.so.0
#16 0x00007f78c16c224a in PdfParser::parseObjectStreamObject(int) () from /usr/lib64/libstreamanalyzer.so.0
#17 0x00007f78c16c2450 in PdfParser::parseObjectStreamObjectDef() () from /usr/lib64/libstreamanalyzer.so.0
#18 0x00007f78c16c2518 in PdfParser::parse(Strigi::StreamBase<char>*) () from /usr/lib64/libstreamanalyzer.so.0
I have the same problem with a completely different PDF file (which is >80 MB, so I won't attach it). Some minor tests using the provided PDF file showed:

1. The file name doesn't matter. If I rename the file to "a.pdf", nothing changes.
2. Your file results in the same crash for me.
3. The problem seems to be related to the file itself: if you use "pdftk Lantmateriforrattning.pdf output a.pdf" to read and write back the file, the newly created file doesn't provoke the crash. (pdftk's man page calls this "Repair a PDF's corrupted XREF table and stream lengths, if possible".)
4. The strangest thing happens when you apply the same command to the previously created file: the result provokes the crash again. Any further iteration still provokes the crash. And every time the file changes... (which could be a problem in pdftk)
5. The first iteration increases the file size from 17.3 KiB to 18.6 KiB, the next iteration decreases the file size by 6 bytes, and any further iteration doesn't change the file size. Nonetheless, the SHA sums keep changing.
6. The metadata doesn't contain fancy Unicode characters (in fact, "pdftk Lantmateriforrattning.pdf dump_data_utf8 | enca -L none" gives you "7bit ASCII characters").
7. The dumped data (pdftk ... dump_data ...) produces the same output for each iteration.
8. A binary diff shows that the difference between the files is quite large. It actually seems like the complete PDF is rebuilt.
9. If you run xmlindexer manually, it doesn't crash. But it reports "Error in parsing: Keyword obj not found". This is reported even for the non-crashing version and for other PDF files that don't produce any kind of problem.
10. If you run kfilemetadatareader manually, it does crash.

The error comes from a std::string::assign() call, which throws a std::length_error exception. This might happen if the size parameter is negative: the parameter is an unsigned size_t, so a negative difference wraps around to a huge value that exceeds max_size().
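To illustrate the last point, here is a minimal sketch of how a negative length reaches std::string::assign() and ends in std::length_error. The helper name assignThrowsLengthError is hypothetical and is not part of Strigi; it only mimics the "assign(s, pos-s)" pattern with an explicit signed length.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not Strigi code): tries to assign `n` characters
// starting at `s` and reports whether std::length_error was thrown.
// `n` is signed on purpose, to model a pointer difference such as pos - s.
bool assignThrowsLengthError(const char* s, std::ptrdiff_t n) {
    std::string out;
    try {
        // A negative n converts to a very large std::size_t, which is
        // larger than out.max_size(), so assign() throws before copying.
        out.assign(s, static_cast<std::size_t>(n));
    } catch (const std::length_error&) {
        return true;
    }
    return false;
}
```

Calling it with ("abc", 3) assigns normally, while ("abc", -1) should take the exception path. In kfilemetadatareader nothing catches the exception, which would match the std::terminate() and abort() frames in the backtrace.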
The relevant source code seems to be located in file "lib/pdf/pdfparser.cpp", function "PdfParser::parseName()", line 264. Especially the line "lastName.assign(s, pos-s);" looks suspicious to me.

First, I guess the lines "skipNotFromString("()<>[]{}/%\t\n\f\r ", 16)" should use 15, shouldn't they? Or even better, the functions wouldn't need manual length parameters.

Second, isn't it possible that the StreamStatus r is "Eof"?

Third, parseName() uses skipNotFromString(), which uses checkForData(), which uses read(), which uses stream->read(start, min, max). The documentation of StreamBase::read() states: "@param start pointer passed by reference that will be set to point to the retrieved array of items. If the end of the stream is encountered or an error occurs, the value of @p start is undefined". Are you sure that start is still valid? For instance, a start > pos would result in a negative size for the assign() call. I think a quick check for "pos-s > 0" would be great, if there is not a better understanding of the problem itself...

Fourth, I don't fully understand what the code is doing :(

I saw that in revision 244e3949c8d1ef2c99119ca3ce6f18aa32199d3e Vishesh Handa started writing a poppler based pdf parser. First, does it fully replace the part that causes these troubles? And second, are you going to fix the old one anyway?
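A rough sketch of the two suggestions above, under stated assumptions: it is not the actual Strigi code, and the names kNameDelimiters, kNameDelimiterCount and assignRange are made up for illustration. The first part lets the compiler compute the delimiter count instead of passing it by hand; the second part refuses to call assign() when the end pointer lies before the begin pointer (both pointers are assumed to point into the same buffer).

```cpp
#include <cstddef>
#include <string>

// The delimiter set quoted from parseName(). sizeof() counts the
// terminating NUL, so the number of visible delimiters is sizeof - 1.
constexpr char kNameDelimiters[] = "()<>[]{}/%\t\n\f\r ";
constexpr std::size_t kNameDelimiterCount = sizeof(kNameDelimiters) - 1;
static_assert(kNameDelimiterCount == 15, "15 delimiters, 16 with the NUL");

// Hypothetical guard around "out.assign(begin, end - begin)".
// Returns false and clears `out` instead of throwing when the range
// is missing or reversed.
bool assignRange(std::string& out, const char* begin, const char* end) {
    if (begin == nullptr || end == nullptr || end < begin) {
        out.clear();
        return false;
    }
    out.assign(begin, static_cast<std::size_t>(end - begin));
    return true;
}
```

Whether passing 16 (which includes the NUL byte) was intentional in the original code is not clear from this report. The guard only hides the symptom; if the start pointer is undefined after Eof or an error, the stream status would still need to be checked before the pointers are used.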
Created attachment 64504
New crash information added by DrKonqi

kfilemetadatareader () on KDE Platform 4.7.2 (4.7.2) using Qt 4.7.4

- What I was doing when the application crashed:
Double click on a PDF file in dolphin. Adobe reader opened as expected, but DrKonqi opened as well.

I do not think it matters, but here are the properties of the PDF I opened:
Author: VMware, Inc.
Application: XSL Formatter V4.3 MR5 for Windows
Creator: Antenna House PDF Output Library 2.6.0 (Windows)
PDF-Version: 1.4 (Acrobat 5.x)
5 BitMap fonts and 6 TTF-Fonts are embedded.

-- Backtrace (Reduced):
#14 0x00007f9c6103223d in PdfParser::parseName (this=0x14aba60) at /var/tmp/portage/app-misc/strigi-0.7.6/work/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:274
#15 0x00007f9c610329b3 in PdfParser::parseDictionaryOrStream (this=0x14aba60) at /var/tmp/portage/app-misc/strigi-0.7.6/work/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:316
#16 0x00007f9c61033241 in PdfParser::parseObjectStreamObject (this=0x14aba60, nestDepth=0) at /var/tmp/portage/app-misc/strigi-0.7.6/work/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:434
#17 0x00007f9c61032a50 in PdfParser::parseDictionaryOrStream (this=0x14aba60) at /var/tmp/portage/app-misc/strigi-0.7.6/work/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:331
#18 0x00007f9c61033241 in PdfParser::parseObjectStreamObject (this=0x14aba60, nestDepth=0) at /var/tmp/portage/app-misc/strigi-0.7.6/work/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:434
Created attachment 64583
New crash information added by DrKonqi

kfilemetadatareader () on KDE Platform 4.7.2 (4.7.2) using Qt 4.7.4

I have the same crash with a PDF which you can download here: http://s3.amazonaws.com/dbclass-resources/docs/pdfs/CourseIntro.pdf

Strigi: 0.7.6

-- Backtrace (Reduced):
#14 0x00007fb99052964f in PdfParser::parseName (this=0x22d9570) at /var/abs/local/strigi/src/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:274
#15 0x00007fb99052a3e4 in parseObjectStreamObject (nestDepth=0, this=0x22d9570) at /var/abs/local/strigi/src/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:431
#16 PdfParser::parseObjectStreamObject (this=0x22d9570, nestDepth=0) at /var/abs/local/strigi/src/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:417
#17 0x00007fb990529d8c in PdfParser::parseDictionaryOrStream (this=0x22d9570) at /var/abs/local/strigi/src/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:331
#18 0x00007fb99052a3f8 in parseObjectStreamObject (nestDepth=0, this=0x22d9570) at /var/abs/local/strigi/src/strigi-0.7.6/libstreamanalyzer/lib/pdf/pdfparser.cpp:434
As this bug is unassigned here, I added a bug report in the strigi bug tracker: https://sourceforge.net/tracker/?func=detail&aid=3424381&group_id=171000&atid=856302
Can't reproduce the crash using strigi 0.7.7