Bug 325008 - strange font in information panel in pdf files
Summary: strange font in information panel in pdf files
Alias: None
Product: nepomuk
Classification: Miscellaneous
Component: fileindexer
Version: 4.11.1
Platform: Archlinux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Nepomuk Bugs Coordination
URL: http://s21.postimg.org/pfv5ga6on/scre...
Duplicates: 324706
Depends on:
Reported: 2013-09-17 16:33 UTC by Simon Solinas
Modified: 2013-10-12 19:47 UTC
5 users

See Also:
Latest Commit:
Version Fixed In: 4.11.3

screenshoot (132.37 KB, image/png)
2013-09-17 21:40 UTC, Simon Solinas
problematic file (8.26 KB, application/pdf)
2013-09-17 21:48 UTC, Simon Solinas

Description Simon Solinas 2013-09-17 16:33:26 UTC
When I look at the title of a PDF file in the Information Panel, it consists of the real title followed by a series of strange characters (perhaps Chinese, Japanese, Korean, or unknown).

Reproducible: Sometimes

Steps to Reproduce:
1. Open or create a simple PDF file with Calligra Words or LibreOffice Writer.
2. Select the PDF file.
3. Look at the Information Panel.
Comment 1 Frank Reininghaus 2013-09-17 20:59:25 UTC
Thanks for the bug report. Please always include a screenshot when you see something strange in the application. I'm not quite sure if you refer to the preview image (in that case, it would be a problem with the thumbnailer) or to the title of the PDF file (in which case it might be a Nepomuk problem).

It would also be good if you could attach a problematic file, because I could not reproduce any problems with a few test files yet. Thanks for your help!
Comment 2 Simon Solinas 2013-09-17 21:39:50 UTC
The screenshot is linked in the URL field above. Nepomuk is disabled. I don't know how I could create a problematic file in this case.
Comment 3 Simon Solinas 2013-09-17 21:40:29 UTC
Created attachment 82380 [details]

Comment 4 Frank Reininghaus 2013-09-17 21:45:48 UTC
Thanks for the quick reply.

(In reply to comment #2)
> Screenshot is present in the URL section above.

Oops, I must have missed that! Sorry about that.

If I'm not mistaken, this information inside the Information Panel is provided by Nepomuk even if the indexer is disabled, so I'll reassign.

> I don't know how I could create a problematic file in this case.

Well, if "thisisatest.pdf" does not contain anything private, you could attach it here.
Comment 5 Simon Solinas 2013-09-17 21:48:34 UTC
Created attachment 82381 [details]
problematic file
Comment 6 Christoph Feck 2013-09-17 23:02:26 UTC
Another test file can be fetched from http://www.mabb.de/files/content/document/Foerderung/mabb_Broschuere_OER_in_der_Praxis.pdf

It displays "Title: Offene " followed by many garbage characters (they look like binary data); the actual title should be "Offene Bildungsresourcen (OER) in der Praxis".
Comment 7 Christoph Feck 2013-09-18 01:21:48 UTC
Interesting detail: if I hover over the PDF from comment #6 back and forth multiple times, the "Title: Offene" part stays constant while the garbage that follows it changes randomly, so it looks like the parser references random pointers.
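
For illustration only: a stable prefix followed by changing garbage is the usual pattern when a string is converted without honouring its known length. The sketch below simulates that pattern safely in standard C++. It is not the code from popplerextractor.cpp and makes no claim about what the actual fix changed; the helper names (titleUntilTerminator, titleWithLength, makeBuffer) and the buffer contents are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: copies UTF-16 code units until a 0 terminator is
// found and ignores the known title length. Whatever follows the title in
// memory ends up in the result.
std::u16string titleUntilTerminator(const char16_t *data)
{
    std::u16string result;
    while (*data != 0) {
        result.push_back(*data++);
    }
    return result;
}

// Hypothetical helper: copies exactly `length` code units, so data stored
// after the title is never read.
std::u16string titleWithLength(const char16_t *data, std::size_t length)
{
    return std::u16string(data, length);
}

// Builds a simulated buffer: the title, then stale data, then a terminator.
// The terminator keeps the simulation well defined; in a real bug the scan
// would run into unrelated memory instead.
std::vector<char16_t> makeBuffer(const std::u16string &title,
                                 const std::u16string &stale)
{
    std::vector<char16_t> buffer(title.begin(), title.end());
    buffer.insert(buffer.end(), stale.begin(), stale.end());
    buffer.push_back(0);
    return buffer;
}
```

With a buffer built by makeBuffer(u"Offene", stale), titleWithLength(buffer.data(), 6) yields only the title, while titleUntilTerminator(buffer.data()) yields the title plus whatever stale data follows it. Different stale data on each call would match the randomly changing garbage observed here.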
Comment 8 Christoph Feck 2013-10-06 23:07:53 UTC
Comment 9 Christoph Feck 2013-10-06 23:41:51 UTC
Git commit 4a719dc3a0a8ee8e896e56544c2dfa642fd0f037 by Christoph Feck.
Committed on 06/10/2013 at 23:39.
Pushed by cfeck into branch 'KDE/4.11'.

Fix trailing garbage in extracted PDF title
FIXED-IN: 4.11.3
REVIEW: 113138

M  +2    -3    services/fileindexer/indexer/popplerextractor.cpp

Comment 10 Christoph Feck 2013-10-12 19:47:26 UTC
*** Bug 324706 has been marked as a duplicate of this bug. ***