Summary: | enable full text search for non-ASCII characters | ||
---|---|---|---|
Product: | [Unmaintained] kdvi | Reporter: | Oliver Grimm <logistikka> |
Component: | general | Assignee: | Unassigned bugs mailing-list <unassigned-bugs> |
Status: | RESOLVED UNMAINTAINED | ||
Severity: | wishlist | CC: | adaptee |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Debian testing | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: |
Description
Oliver Grimm
2005-04-24 22:54:53 UTC
Isn't the problem the fact that those characters simply aren't there in the .dvi file? I have just checked a .dvi file of mine in khexedit, and I can find all of the ASCII text in it, but not the non-ASCII text. The same is true of PDFs, see bug #103621.

Yes, it is obviously the same problem with PDF files. Non-ASCII chars are expressed as a multi-byte code and not as a single symbol from a codepage. It seems to be the same problem for ligatures, umlauts and other accented characters. Unfortunately I don't know enough about DVI coding or PDF coding to inquire further here.

The problem is not multibyte coding of one character. The problem is that the glyph you see is composed of more than one character, superimposed. A similar effect would be obtained with the following HTML excerpt:

<tt>
<p style="position: absolute; top: 1em; left: 1em">Jos´</p>
<p style="position: absolute; top: 3em">Yadda yadda yadda</p>
<p style="position: absolute; top: 1em; left: 1em"> e</p>
</tt>

You'll see "é", even though there is no such character in the file. This is exactly what LaTeX does when the source .tex contains "\'e", and I bet the .dvi contains something similar.

kdvi has not been maintained since KDE SC 4, and its functionality has been replaced by okular. If the issue in this report still exists in, or applies to, okular in KDE SC 4.10.5 or higher, please reassign the report to the okular product or create a new report against okular.
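The superimposed-glyph explanation in the comments above can be illustrated with a small Python sketch. It is not taken from kdvi or okular: the function `compose_accents`, the table `SPACING_TO_COMBINING`, and the assumption that extracted text arrives as "accent glyph, then base letter" (as in the HTML excerpt) are all hypothetical. It only shows why a plain substring search misses "é" and how folding the pair into one precomposed character could make the search succeed.

```python
import unicodedata

# Hypothetical table: spacing accent glyph -> Unicode combining mark.
# A real viewer would need a mapping derived from the font encoding.
SPACING_TO_COMBINING = {
    "\u00b4": "\u0301",  # acute accent
    "`": "\u0300",       # grave accent
    "\u00a8": "\u0308",  # diaeresis
}


def compose_accents(glyphs: str) -> str:
    """Fold 'accent glyph, base letter' pairs into precomposed characters."""
    out = []
    i = 0
    while i < len(glyphs):
        ch = glyphs[i]
        combining = SPACING_TO_COMBINING.get(ch)
        next_is_letter = i + 1 < len(glyphs) and glyphs[i + 1].isalpha()
        if combining and next_is_letter:
            # Unicode puts the combining mark after the base letter;
            # NFC then merges the pair into one code point where one exists.
            out.append(unicodedata.normalize("NFC", glyphs[i + 1] + combining))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Assumed extraction result: the accent and the letter are separate glyphs.
raw = "Jos\u00b4e"
query = "Jos\u00e9"  # "José" with a precomposed e-acute

found_raw = query in raw
found_composed = query in compose_accents(raw)
```

With these inputs `found_raw` is false and `found_composed` is true. Ligatures would need a similar, separate table mapping a single glyph back to several letters.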