Version: 0.8.3 (using 4.2.3 (KDE 4.2.3), Gentoo)
Compiler: x86_64-pc-linux-gnu-gcc
OS: Linux (x86_64) release 2.6.28-gentoo-r5

Text containing diacritics, such as the Romanian characters ă, â, î, ș, ț and the French characters é, à, ù, û, ç, is rendered correctly in PDF files, but the text selection tool does not copy it correctly: only the hats, accents or cedillas are copied.
Can you please attach a sample document showing the issue?
Created attachment 34044: document demonstrating the problem

I have attached a file demonstrating the problem. I don't know how to add French to it, but I included Romanian and Hungarian text, which should be enough to prove my point.
Actually, I get the very same problems with the following PDF viewers:
- Okular + Poppler 0.10.6
- Okular + Poppler HEAD
- Evince + Poppler 0.10.6
- Acrobat Reader 9.1.1
- XPDF 3.02

It looks to me like the TeX system you're using (or how you are using it) generates wrongly encoded PDF documents.
You seem to be right; I have found a few well-written documents myself. Thanks for the information.

Apparently it's not just me: plenty of PDF documents on the internet have the same problem. One example, which was not written by me, is http://tel.archives-ouvertes.fr/docs/00/25/01/37/PDF/these_final.pdf. Strangely, I have no problem copying Chinese text.

The TeX document started with

    \documentclass[a4paper,10pt]{report}
    \usepackage[romanian]{babel}
    \usepackage{ucs}
    \usepackage[utf8x]{inputenc}

and the rest was written plainly in UTF-8 and saved with UTF-8 encoding. pdflatex didn't give any warning, except that the Romanian language support in babel was missing hyphenation patterns.
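A possible explanation for the preamble above: it selects no font encoding, so LaTeX uses its default OT1 encoding, which builds accented letters from a base letter plus a separate accent glyph. That would match the symptom of only the accents being copied. The sketch below is a commonly suggested variant, not a confirmed fix for this report. It assumes the `cmap` package is installed, and it leaves out ș and ț because the T1 encoding has no dedicated slots for the comma-below forms, so those letters may still be composed from two glyphs.

```latex
\documentclass[a4paper,10pt]{report}
% Assumption: cmap is available; it embeds character maps so that
% viewers can map glyphs back to Unicode. Load it before fontenc.
\usepackage{cmap}
% T1 provides precomposed accented glyphs instead of letter + accent.
\usepackage[T1]{fontenc}
\usepackage[romanian]{babel}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}
\begin{document}
% Test line: copy this from the resulting PDF to check text selection.
ă â î é à ù û ç
\end{document}
```

After compiling with pdflatex, selecting the test line in the viewer and pasting it into a text editor shows whether the characters now survive copying.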
Ok, closing this, as it is a problem (= wrong encoding) in generated documents.