Bug 173519 - Kate silently corrupts iso-8859-1 files in utf-8 locale
Status: RESOLVED FIXED
Alias: None
Product: kate
Classification: Applications
Component: encoding
Version: unspecified
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWrite Developers
Reported: 2008-10-25 16:47 UTC by usrrgt
Modified: 2010-03-07 11:35 UTC

Description usrrgt 2008-10-25 16:47:15 UTC
Version:            (using KDE 4.1.0)
OS:                Linux
Installed from:    Ubuntu Packages

If I open a binary file in kate, it gives a warning that saving the file will corrupt it. Fine.

However, if I work in a UTF-8 locale, open an ISO-8859-1 encoded file in Kate, and save it, the file is corrupted. Each non-ASCII character (unless its bytes happen to form a valid UTF-8 sequence) is changed into an "invalid character" marker. Simply opening and saving the file is enough to lose all non-ASCII characters, and no warning is given.
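
The loss described above can be reproduced outside Kate. The following is a minimal sketch, assuming the editor decodes leniently as UTF-8 and writes the result back as UTF-8; `resave_as_utf8` is a hypothetical helper for illustration, not Kate's actual code.

```python
def resave_as_utf8(data: bytes) -> bytes:
    """Mimic 'open and save' in an editor that assumes UTF-8 (hypothetical helper)."""
    # Lenient decoding: every invalid byte sequence becomes U+FFFD.
    text = data.decode("utf-8", errors="replace")
    # Saving writes the replacement characters, not the original bytes.
    return text.encode("utf-8")


# "café" stored as ISO-8859-1: the é is the single byte 0xE9.
latin1_bytes = "café".encode("iso-8859-1")

# 0xE9 alone is not valid UTF-8, so it is replaced and the original byte is gone.
corrupted = resave_as_utf8(latin1_bytes)

# The exception mentioned in the report: the ISO-8859-1 bytes C3 A9 ("Ã©")
# happen to be valid UTF-8, so they pass through unchanged.
survives = resave_as_utf8("Ã©".encode("iso-8859-1"))
```

After the round trip, `corrupted` no longer contains the byte 0xE9, so the original character cannot be recovered from the saved file.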

Switching to the right encoding after opening the file naturally helps, if only one remembers to do it. If the files are in English, there may be only a few non-ASCII characters in the files, making them easy to miss until it is too late.

More info: https://bugs.launchpad.net/bugs/60670
Comment 1 Claudio 2009-04-07 22:47:53 UTC
Are there any plans to fix this bug any time soon?

I am using Kate 3.2.2 (KDE 4.2.2) and it is unbelievable that I still have to manually convert a file from ISO-8859-1 to UTF-8 if I want to edit it with Kate.

UTF-8 files are opened properly, but ISO-8859 files show strange symbols instead of local characters. Besides, the file is opened in read-only mode, and if I manually change the encoding to ISO-8859-1, accented characters cannot be typed (for example, é is typed as ´e).
Comment 2 Winfried 2009-07-21 10:17:13 UTC
The behaviour when opening files is inconsistent, too: if you open a file with ISO-8859-1 or CP850 encoding through the 'open' dialog of Kate, you will get a warning about incorrect characters and a hint that the file will be opened 'read only'.
When opening it by single/double click in Dolphin, it is also opened 'read only', but without any warning or hint.

It would be a great improvement if Kate could recognize the most frequent encodings (ISO 8859-1, Windows-1252, CP850, CP437) automatically (without headers) when opening a file.


Using Okteta is even worse: you have no chance to open a file with the right encoding from the beginning, because the 'encoding' field is missing in the 'open' dialog of Okteta. It is possible to change the view of the characters by using 'view'->'characters', but the status is still 'read only', although the 'read only' flag in the 'file' menu isn't set. Regardless of how you opened the file, there is never a warning or hint.

Summary for Okteta: files containing non-UTF-8 characters are always opened 'read-only' in Okteta, there is no hint or warning about that, and you can change the view but you cannot switch off the 'read only' status. So there is no chance to edit those files!

Please tell me if I'm wrong!
Comment 3 Winfried 2009-07-21 10:33:34 UTC
P.S.: I'm using Kate 3.22 and Okteta 0.21 under KDE 4.22 (Kubuntu jaunty)
Comment 4 Christoph Cullmann 2010-03-07 11:35:18 UTC
Fixed in KDE 4.5. Sorry, I am unable to backport this, as I have rewritten all the parts affecting it.
Now you will get a warning that the encoding is wrong (or, if you have not altered the fallback encoding, Kate will just load the file as ISO-8859-15, which won't kill anything).

The new basic idea is:

1. Try the standard encoding.
2. If that does not work out, try to detect the encoding by BOM, or use the fallback encoding (default is ISO-8859-15).
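
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Kate's implementation: `load_text`, its parameters, and the returned flag are hypothetical names, the standard encoding is assumed to be UTF-8, and the fallback is set to ISO-8859-15 as an assumption about the default named in the comment.

```python
import codecs

# BOMs to look for. UTF-32 comes before UTF-16 because the UTF-32-LE BOM
# starts with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def load_text(data: bytes, standard: str = "utf-8", fallback: str = "iso-8859-15"):
    """Return (text, encoding_used, used_standard). Hypothetical helper."""
    # Step 1: try the standard encoding strictly, so bad bytes are an error
    # instead of being silently replaced.
    try:
        return data.decode(standard), standard, True
    except UnicodeDecodeError:
        pass

    # Step 2a: try to detect the encoding from a byte order mark.
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            try:
                return data.decode(encoding), encoding, False
            except UnicodeDecodeError:
                break  # BOM present but content invalid: use the fallback

    # Step 2b: the fallback maps every byte to a character, so nothing is lost.
    return data.decode(fallback), fallback, False


text, used, ok = load_text("café".encode("iso-8859-1"))
```

A caller could show a warning whenever the third value is false. Because the fallback is a single-byte encoding that maps every byte value, saving with the encoding that was used writes back the original bytes.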