Version: 2.5.2 (using KDE 3.5.2)
Installed from: Gentoo Packages
Compiler: gcc (GCC) 4.1.1 (Gentoo 4.1.1)
OS: Linux

Steps to reproduce the problem with kdevelop 3.3.5 using kate as editor component (simpler than with kate itself, see below):

1. Create a unicode file starting with:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*- kate: encoding utf-8;
    # Zürich

2. Save the file as utf8.
3. Start kdevelop with the kate editor component. Open the file with the File Selector.
   Expected behaviour: "# Zürich" is displayed
   Observed behaviour: "# ZÃ¼rich" is displayed, kate: encoding is ignored
4. Reload the file, e.g. with F5. "# Zürich" is correctly displayed.

The same happens when opening the file from the bookmarks list, but not when opening the file via the File->Open dialog.

Kate itself behaves in an even trickier way: the problem happens only the first time a unicode file is opened in kate with the file selector.

Steps to reproduce:

1.-4. as with kdevelop
5. Close the file displaying "# Zürich".
6. Open the file again with the file selector. "# Zürich" is displayed, the bug seems to have vanished.
7. Copy the correctly displayed file to a new one using the console.
8. Open the new file. "# ZÃ¼rich" is displayed, the bug is there again.

Kate itself somehow seems to remember the encoding somewhere, while kdevelop does not.

Expected behaviour would be to honour "kate: encoding utf-8;" in any case, especially because saving the file as utf-8 works reliably: "ZÃ¼rich" is saved literally as utf8, therefore effectively corrupting the file.
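The display and corruption symptoms can be reproduced outside of Kate. The following is a minimal Python sketch, assuming the default encoding is ISO 8859-1 as in this report; the variable names are illustrative only, and the last step is a hypothetical repair for a file that went through exactly one wrong open-and-save cycle.

```python
# Sketch: what happens when a UTF-8 file is read as ISO 8859-1 and saved as UTF-8.
original = "# Zürich"

# The file on disk is UTF-8; "ü" is stored as the two bytes 0xC3 0xBC.
on_disk = original.encode("utf-8")

# Opening with the default encoding (ISO 8859-1) instead of UTF-8
# turns those two bytes into two separate characters.
displayed = on_disk.decode("iso-8859-1")

# Saving the misread text as UTF-8 encodes both characters again,
# so the file now holds four bytes where "ü" used to be.
resaved = displayed.encode("utf-8")

# Hypothetical repair: undo exactly one layer of the wrong decoding.
repaired = resaved.decode("utf-8").encode("iso-8859-1").decode("utf-8")

print(displayed)
print(resaved)
print(repaired)
```

The repair only works if the text was misread once and contains nothing but characters that ISO 8859-1 can represent; it is not a general fix.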
Here is a little addendum: the default encoding needs to be non-utf8 for the bug to be reproducible (I have iso 8859-1).

KWrite (the group this bug was assigned to) seems to recognize the "kate: encoding utf-8;" keyword too late when opening *the first* utf-8 file with the File -> Open dialog, but has recognized utf-8 by the time "save as" is tried.

Steps to reproduce (I have KWrite 4.5.2 for KDE 3.5.2):

1. Open KWrite.
2. In the settings for file open/close, set the default encoding to iso 8859-1.
3. Close KWrite (this is important if there is an open utf-8 file).
4. Open KWrite; the default encoding must still be iso 8859-1.
5. Open the example file with File -> Open. Ensure that the default encoding in the open dialog is still iso 8859-1.
6. Verify that the file content is rendered as iso 8859-1 (instead of utf-8).
7. Open the File -> Save As dialog. It correctly shows utf8, so the encoding was recognized *after* opening and rendering the file.
8. Open the File -> Open dialog. It now shows utf8, overriding the default encoding with the encoding of the file already open.

Step 8 seems to make the bug hide itself.
I believe this is a known problem. The file dialog needs a <default> setting for encoding, so that an embedded variable in the file can take precedence over the dialog's setting. If you open a file on the command line, the variable does in fact take precedence, unless overridden on the command line.
The problem seems in fact to be known, but I didn't find a corresponding bug:
http://lists.kde.org/?l=kde-devel&m=111005445527489&w=2

I can confirm that the command line option --encoding=utf-8 takes precedence over the embedded variable (which seems logical to me), but on my system the Encoding=ISO 8859-1 setting in the [Kate Document Defaults] section in katerc actually takes precedence over the embedded variable, which is counterintuitive to me.

Emacs handles it the opposite way. If called with

    LANG=de_CH.latin-1 emacs uni.py

emacs describes itself like this:

    Priority order for recognizing coding systems when reading files:
      1. iso-latin-1 (alias: iso-8859-1 latin-1)
      2. iso-2022-jp (alias: junet)

But the -*- coding: utf-8 -*- definition overrides this priority order. If that definition is missing, emacs opens the file as iso-latin-1 according to the priority order.

The problem bites me because I'm using kdevelop for different projects that usually use latin-1 as encoding, but one of my own projects needs utf-8, and I sometimes forget to press F5 each time I open a file within the IDE. After saving my changes, the file is corrupted, and I have to merge the utf-8 parts again manually.
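The Emacs-style behaviour, where an embedded declaration overrides the locale's priority order, can be sketched roughly as follows. This is an illustration in Python, not the parser Emacs or Kate actually uses: the regular expressions, the function names `embedded_encoding` and `read_text`, and the `priority` parameter are all assumptions made for this example. Looking only at the first two lines follows Python's own rule for its coding declaration.

```python
import re

# Simplified patterns for illustration; the real editors parse more variants.
EMACS_COOKIE = re.compile(r"-\*-.*?coding:\s*([-\w.]+).*?-\*-")
KATE_VARIABLE = re.compile(r"kate:.*?\bencoding\s+([-\w.]+)\s*;")


def embedded_encoding(head_lines):
    """Return the encoding declared in the given lines, or None."""
    for line in head_lines:
        for pattern in (EMACS_COOKIE, KATE_VARIABLE):
            match = pattern.search(line)
            if match:
                return match.group(1)
    return None


def read_text(raw, priority=("iso-8859-1",)):
    """Decode raw bytes; an embedded declaration overrides the priority list."""
    # ISO 8859-1 maps every byte to a character, so it is safe for scanning
    # the ASCII-only declaration before the real encoding is known.
    head = raw.decode("iso-8859-1").splitlines()[:2]
    encoding = embedded_encoding(head) or priority[0]
    return raw.decode(encoding)
```

With the example file from this report, `read_text` decodes as utf-8 because of the declaration on the second line; a file without a declaration falls back to the first entry of `priority`.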
When the encoding selection in katepart was designed, we wanted to let the command line or file dialog setting take precedence over an embedded variable. The embedded variable should be used if neither of those is present. The problem, AFAICS, is that the file dialog *always* contains a setting, which should not be the case.
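The intended precedence can be written down as a small sketch. This is Python for illustration only, not katepart code; the function `choose_encoding` and its parameters are hypothetical, and `None` stands for "the user did not specify anything".

```python
def choose_encoding(command_line=None, file_dialog=None, embedded=None,
                    default="iso-8859-1"):
    """Pick the encoding to load a file with; None means "not specified"."""
    # An explicit user choice wins, then the variable embedded in the file,
    # and only then the configured default.
    for candidate in (command_line, file_dialog, embedded):
        if candidate is not None:
            return candidate
    return default


# The dialog always reports a value, so the embedded variable never wins:
print(choose_encoding(file_dialog="iso-8859-1", embedded="utf-8"))

# With a dialog that can report "no choice", the embedded variable is used:
print(choose_encoding(file_dialog=None, embedded="utf-8"))
```

The first call models the behaviour reported here; the second models a dialog with a <default> entry.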
*** Bug 142725 has been marked as a duplicate of this bug. ***
Confirming in KDE 3.5.8 and KDE 4.
SVN commit 794503 by alund:

set provided encoding (from command line or file dialog) after loading meta data.

BUG: 160353
CCBUG: 135581 (does this fix 135581 as well?)

 M  +2 -0      katedocmanager.cpp
 M  +1 -1      kateviewdocumentproxymodel.cpp

WebSVN link: http://websvn.kde.org/?view=rev&revision=794503
Duplicate of bug #187033, as that one is shorter and still valid for KDE4.

*** This bug has been marked as a duplicate of bug 187033 ***