Bug 165804

Summary: hebrew text in pdf file is copied in reverse order
Product: [Applications] okular
Reporter: Nadav Kavalerchik <nadavkav>
Component: general
Assignee: Okular developers <okular-devel>
Status: RESOLVED DUPLICATE    
Severity: normal    
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
Latest Commit:
Version Fixed In:
Sentry Crash Report:
Attachments: hebrew rtl text pdf file

Description Nadav Kavalerchik 2008-07-05 18:24:19 UTC
Version:           0.6.80 (using 4.00.84 (KDE 4.0.84 (KDE 4.1 >= 20080625), Debian packages)
Compiler:          cc
OS:                Linux (i686) release 2.6.25-trunk-686

I open an RTL Hebrew PDF file in Okular and select some RTL Hebrew text with the selection tool.

I copy it to the clipboard and paste it into a different application
(a Firefox 3 text area, for example) and the text is pasted in reverse character order ("hello" >> "olleh", for example).

See this similar issue:
http://bugs.kde.org/show_bug.cgi?id=156380

It is similar, but not the same!
Comment 1 Pino Toscano 2008-07-05 18:39:53 UTC
Could you please provide an example document that shows the problem?
Comment 2 Nadav Kavalerchik 2008-07-05 20:23:27 UTC
Created attachment 25860
hebrew rtl text pdf file

This is a sample single-page PDF file with RTL Hebrew text.
You can try to select some text and paste it
into KWrite (for example) and see that it is in reverse character order.
Comment 3 Diego Iastrubni 2008-08-13 08:52:17 UTC
Pino, IMHO this is a dup of 156380 http://bugs.kde.org/show_bug.cgi?id=156380.

Nadav, Hebrew in PDF is stored in visual order (not logical). This means Okular needs to support a visual->logical conversion, which will be broken in many ways: logical->visual is deterministic but visual->logical is not, because the paragraph direction and the hidden bidi control chars are missing.
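
To illustrate why visual->logical is ambiguous, here is a minimal Python sketch. It is not Okular's or Poppler's code; the function names are hypothetical, and it uses only two simplistic heuristics (one assuming an RTL paragraph, one assuming an LTR paragraph) rather than a real bidi implementation. The same visual string gives two different, equally plausible logical strings depending on the assumed paragraph direction.

```python
import unicodedata
from itertools import groupby


def is_rtl(ch):
    # Hebrew letters have bidi class "R", Arabic letters "AL".
    return unicodedata.bidirectional(ch) in ("R", "AL")


def is_ltr(ch):
    # Latin letters are "L", European digits are "EN".
    return unicodedata.bidirectional(ch) in ("L", "EN")


def _reverse_runs(text, predicate):
    # Reverse every maximal run of characters matching `predicate`,
    # leaving all other characters in place.
    out = []
    for matches, run in groupby(text, key=predicate):
        chars = "".join(run)
        out.append(chars[::-1] if matches else chars)
    return "".join(out)


def assume_rtl_paragraph(visual):
    # Heuristic (an assumption, not a full bidi algorithm): reverse the
    # whole line, then restore the order inside LTR runs.
    return _reverse_runs(visual[::-1], is_ltr)


def assume_ltr_paragraph(visual):
    # Heuristic: keep the line as-is and only reverse the RTL runs.
    return _reverse_runs(visual, is_rtl)


# "shalom" in logical order, written with escapes to avoid editor reordering.
shalom = "\u05e9\u05dc\u05d5\u05dd"

# What a PDF would store for a line showing "abc" on the left and the
# Hebrew word on the right: glyphs in left-to-right visual order.
visual = "abc " + shalom[::-1]

print(ascii(assume_rtl_paragraph(visual)))  # Hebrew word first, then " abc"
print(ascii(assume_ltr_paragraph(visual)))  # "abc " first, then Hebrew word
```

Both results render identically on screen, so the visual string alone does not say which one the author typed.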

Comment 4 Diego Iastrubni 2008-08-13 09:45:12 UTC

*** This bug has been marked as a duplicate of 128609 ***
Comment 5 Nadav Kavalerchik 2008-08-13 18:01:59 UTC
Diego: do you know whether the PDF spec supports the bidi and Unicode standards?
And is it just a "simple" (joking!) matter of implementing bidi support in Okular, or is it a more serious matter of asking for a new revision of the PDF standard to support bidi?