Bug 194299 - correctly rendered text is not copied to clipboard correctly when it contains diacritics
Status: RESOLVED NOT A BUG
Alias: None
Product: okular
Classification: Applications
Component: general
Version: 0.8.3
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Okular developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-27 13:12 UTC by Radu Benea
Modified: 2009-05-29 16:19 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
document demonstrating the problem (37.20 KB, application/octet-stream)
2009-05-27 13:41 UTC, Radu Benea

Description Radu Benea 2009-05-27 13:12:55 UTC
Version:           0.8.3 (using 4.2.3 (KDE 4.2.3), Gentoo)
Compiler:          x86_64-pc-linux-gnu-gcc
OS:                Linux (x86_64) release 2.6.28-gentoo-r5

Text containing diacritics, such as the Romanian characters ă, â, î, ș, ț and the French characters é, à, ù, û, ç, is rendered correctly in PDF files, but the text selection tool does not copy it correctly: only the hats, accents or cedillas are copied.
Comment 1 Pino Toscano 2009-05-27 13:18:29 UTC
Can you please attach a sample document showing the issue?
Comment 2 Radu Benea 2009-05-27 13:41:57 UTC
Created attachment 34044
document demonstrating the problem

Attached a file demonstrating the problem. I don't know how to add French to it, but I included Romanian and Hungarian text, which should be enough to prove my point.
Comment 3 Pino Toscano 2009-05-27 14:03:36 UTC
Actually, I get the very same problems with the following PDF viewers:
- Okular + Poppler 0.10.6
- Okular + Poppler HEAD
- Evince + Poppler 0.10.6
- Acrobat Reader 9.1.1
- XPDF 3.02

It looks to me like the TeX system you're using (or how you are using it) generates wrongly encoded PDF documents.
Comment 4 Radu Benea 2009-05-27 15:41:46 UTC
You seem to be right. I have found a few well-written documents myself. Thanks for the information.

Apparently it's not just me: plenty of PDF documents on the internet have the same problem, for example http://tel.archives-ouvertes.fr/docs/00/25/01/37/PDF/these_final.pdf, which was not written by me. Strangely, I have no problem copying Chinese text.

The TeX document started with
\documentclass[a4paper,10pt]{report}

\usepackage[romanian]{babel}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}

and the rest was written plainly in UTF-8 and saved with UTF-8 encoding. pdflatex didn't give any warning, except that the Romanian language support in babel was missing hyphenation patterns.
Comment 5 Pino Toscano 2009-05-29 16:19:06 UTC
OK, closing this, as it is a problem (= wrong encoding) in the generated documents.
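
For reference, a minimal sketch of a preamble that is commonly suggested for this kind of copy-and-paste problem with pdflatex. Nobody in this thread tested or confirmed it, so treat it as an assumption: with the default OT1 font encoding, accented letters are typically built from a base glyph plus a separate accent glyph, which would match the symptom reported above. Loading the T1 font encoding and the cmap package is one common way to get precomposed glyphs and a Unicode mapping into the PDF. The cmap, fontenc and lmodern lines are additions that were not in the reporter's document.

```latex
\documentclass[a4paper,10pt]{report}

% Assumption: not part of the original report; added for illustration.
\usepackage{cmap}            % embeds ToUnicode maps; load before fontenc
\usepackage[T1]{fontenc}     % precomposed accented glyphs instead of base + accent
\usepackage{lmodern}         % vector (Type 1) fonts available in T1 encoding

% Unchanged from the reporter's preamble.
\usepackage[romanian]{babel}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}

\begin{document}
% Characters that have their own slot in T1; letters such as ș and ț
% (comma below) do not, and may still be composed from two glyphs.
ă â î é à ù û ç
\end{document}
```

Whether this resolves the attached document is untested here. Copying text from the regenerated PDF in any of the viewers listed in comment 3 would be the way to check.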