Bug 135580 - encoding problem in hungarian_expert lecture
Summary: encoding problem in hungarian_expert lecture
Alias: None
Product: ktouch
Classification: Applications
Component: general
Version: unspecified
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Haavard Froeiland
Depends on:
Reported: 2006-10-13 13:10 UTC by Egmont Koblinger
Modified: 2006-11-23 09:54 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:


Description Egmont Koblinger 2006-10-13 13:10:34 UTC
Version:            (using KDE 3.5.5)
Installed from:    Compiled From Sources
OS:                Linux

hungarian_expert.ktouch.xml (the "Hungarian (auto-generated)" lecture) is in the wrong encoding.

It contains plenty of otilde (õ) and ucircumflex (û) characters, which are not used in Hungarian, where it should contain odoubleacute (ő) and udoubleacute (ű) instead.

As a result, these characters only resemble the correct ones (so they look ugly), and they cannot be typed on a keyboard using the Hungarian layout.

The phenomenon is quite typical: someone probably converted an 8-bit text file to UTF-8 assuming ISO-8859-1 as the source charset. That assumption was wrong: the legacy 8-bit charset for Hungarian is ISO-8859-2.
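
The suspected mistake can be reproduced in a few lines. This is only a sketch of the likely cause, not the actual procedure used to create the lecture file; the sample string and variable names are made up for illustration. It encodes the four affected letters as ISO-8859-2 and then decodes the bytes as ISO-8859-1, which yields exactly the otilde and ucircumflex characters seen in the file. The other accented Hungarian letters (á, é, í, ó, ö, ú, ü) have the same byte values in both charsets, so only ő and ű are damaged.

```python
# Hypothetical sample: the two Hungarian letters (lower and upper case)
# whose byte values mean something else in ISO-8859-1.
original = "ő ű Ő Ű"

# Step 1: the text as it would be stored in a legacy 8-bit ISO-8859-2 file.
raw_bytes = original.encode("iso-8859-2")

# Step 2: the faulty conversion, which reads those bytes as ISO-8859-1.
garbled = raw_bytes.decode("iso-8859-1")

# The remaining accented Hungarian letters are encoded identically in both charsets.
shared = "áéíóöúü"
same_bytes = shared.encode("iso-8859-2") == shared.encode("iso-8859-1")

print(garbled)     # õ û Õ Û
print(same_bytes)  # True
```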

To fix the file, convert it from UTF-8 to ISO-8859-1 and then from ISO-8859-2 to UTF-8. Example:
iconv -f UTF-8 -t ISO-8859-1 hungarian_expert.ktouch.xml | iconv -f ISO-8859-2 -t UTF-8 > new.xml
mv new.xml hungarian_expert.ktouch.xml

Another way to fix the file is to do a search-replace for the characters mentioned above, as well as for their uppercase counterparts.
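
A minimal sketch of the search-and-replace approach, assuming the file is valid UTF-8 and that only these four characters are affected. The names FIXES and fix_text are hypothetical and chosen for illustration only.

```python
# Map each wrongly decoded character to the Hungarian letter it should be.
FIXES = str.maketrans({
    "õ": "ő",  # otilde      -> odoubleacute
    "û": "ű",  # ucircumflex -> udoubleacute
    "Õ": "Ő",  # uppercase counterparts
    "Û": "Ű",
})


def fix_text(text: str) -> str:
    """Return text with the four mis-encoded characters replaced."""
    return text.translate(FIXES)


print(fix_text("árvíztûrõ tükörfúrógép"))  # árvíztűrő tükörfúrógép
```

To apply it to the lecture, read the file with encoding="utf-8", pass the contents through fix_text, and write the result back with the same encoding. Unlike the charset round trip, this leaves every other character untouched.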

The title of the lecture says it's an auto-generated file. I don't know how it was generated, since the words in this file are not found anywhere else in ktouch's source. If the generating script is still available somewhere and there is any chance that it will be used again, then the script needs to be fixed too.
Comment 1 Anne-Marie Mahfouf 2006-11-23 09:49:49 UTC
SVN commit 607116 by annma:

Fix the encoding of this file in the 3.5.5 branch, following the instructions given by Egmont Koblinger in the bug report.
Thanks a lot, Egmont, for investigating and providing the fix.

 M  +138 -138  hungarian_expert.ktouch.xml  
Comment 2 Anne-Marie Mahfouf 2006-11-23 09:51:53 UTC
The file is installed as is, so even though it is labelled "auto-generated" I don't think that is the case.
Comment 3 Anne-Marie Mahfouf 2006-11-23 09:54:20 UTC
SVN commit 607117 by annma:

forward port of 607116

 M  +138 -138  hungarian_expert.ktouch.xml