Bug 135581 - kate: encoding utf-8; file opened as ASCII via file selector
Status: RESOLVED DUPLICATE of bug 187033
Alias: None
Product: kate
Classification: Applications
Component: general
Version: unspecified
Platform: Gentoo Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWrite Developers
URL:
Keywords:
Duplicates: 142725
Depends on:
Blocks:
 
Reported: 2006-10-13 13:22 UTC by Toni Arnold
Modified: 2009-06-04 23:33 UTC
CC List: 2 users

Description Toni Arnold 2006-10-13 13:22:56 UTC
Version:           2.5.2 (using KDE 3.5.2)
Installed from:    Gentoo Packages
Compiler:          gcc (GCC) 4.1.1 (Gentoo 4.1.1) 
OS:                Linux

Steps to reproduce the problem with kdevelop 3.3.5 using kate as editor component (simpler than with kate, see below): 

1. Create a Unicode file starting with:
#!/usr/bin/env python
# -*- coding: utf-8 -*-  kate: encoding utf-8;
# Zürich

2. Save the file as utf8.

3. Start kdevelop with the kate editor component. Open the file with the File Selector.
Expected behaviour: "# Zürich" is displayed
Observed behaviour: "# ZÃ¼rich" is displayed; the "kate: encoding" variable is ignored

4. Reload the file, e.g. with F5.
"# Zürich" is correctly displayed

The same happens when opening the file from the bookmarks list, but not when opening the file via the File->Open dialog.
Kate itself behaves in an even trickier way: the problem happens only the first time a Unicode file is opened in Kate with the file selector. Steps to reproduce:

1.-4. as with kdevelop

5. Close the file displaying "# Zürich".

6. Open the file again with the file selector.
"# Zürich" is displayed, the bug seems to have vanished.

7. Copy the correctly displayed file to a new one using the console.

8. Open the file
"# Zürich" is displayed, bug is there again.

Kate itself seems to remember the encoding somewhere, while kdevelop does not. Expected behaviour would be to honour "kate: encoding utf-8;" in any case, especially because saving the file as utf-8 works reliably: the misdecoded text "ZÃ¼rich" is written out literally as utf8, which effectively corrupts the file.
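
The corruption described above can be reproduced outside the editor. The following is a minimal Python sketch, assuming the editor decodes the UTF-8 bytes as ISO 8859-1 and later writes the buffer back as UTF-8. It uses no Kate code, and the variable names are purely illustrative.

```python
# Illustration only: mimic opening a UTF-8 file as ISO 8859-1
# and then saving the buffer as UTF-8.
original = "# Zürich"
on_disk = original.encode("utf-8")         # bytes written in step 2

displayed = on_disk.decode("iso-8859-1")   # what the editor shows in step 3
saved = displayed.encode("utf-8")          # bytes written when the buffer is saved as UTF-8

print(displayed)                           # the mojibake seen in the editor
print(saved == on_disk)                    # whether the file content survived the round trip

# As long as nothing else was edited, the damage can be undone
# by reversing the wrong decoding step:
repaired = saved.decode("utf-8").encode("iso-8859-1").decode("utf-8")
print(repaired)
```

Reversing the wrong decode only works while the damaged text is untouched; once corrupted and correct passages are mixed in one file, the manual merge mentioned later in this report becomes necessary.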
Comment 1 Toni Arnold 2006-10-22 15:14:45 UTC
Here is a little addendum: the default encoding needs to be non-utf8 for the bug to be reproducible (I have iso 8859-1). KWrite (the group this bug was assigned to) seems to recognize the "kate: encoding utf-8;" keyword too late when opening *the first* utf-8 file with the File -> Open dialog, but has recognized utf-8 by the time "save as" is tried.

Steps to reproduce (I have KWrite 4.5.2 for KDE 3.5.2):

1. Open KWrite
2. In the settings for file open/close set default encoding to iso 8859-1
3. Close KWrite (this is important if there is an open utf-8 file)
4. Open KWrite, default encoding must be still iso 8859-1
5. Open the example file with File -> Open. Ensure that the default encoding in the open dialog is still iso 8859-1.
6. Verify that the file content is rendered as ASCII (instead of utf-8)
7. Open the File -> Save As dialog. It correctly shows utf8, so the encoding was recognized only *after* the file had been opened and rendered.
8. Open the File -> Open dialog. It now shows utf8, overriding the default encoding with the encoding of the file already open.

Step 8 seems to make the bug hide itself.
Comment 2 Anders Lund 2006-10-24 20:13:19 UTC
I believe this is a known problem.

The file dialog needs a <default> setting for encoding, allowing the 
setting in the file to take precedence over an embedded variable. If you open a 
file on the commandline, the variable does in fact take precedence, unless 
overridden on the commandline.
Comment 3 Toni Arnold 2006-10-25 11:47:42 UTC
The problem seems in fact to be known, but I didn't find a corresponding bug:
http://lists.kde.org/?l=kde-devel&m=111005445527489&w=2

I can confirm that the commandline --encoding=utf-8 takes precedence over the embedded variable (which seems logical to me), but on my system the Encoding=ISO 8859-1 setting in the [Kate Document Defaults] section in katerc actually takes precedence over the embedded variable, which is counterintuitive to me.

Emacs handles it the opposite way. If called with

LANG=de_CH.latin-1 emacs uni.py

emacs describes itself like this:

Priority order for recognizing coding systems when reading files:
  1. iso-latin-1 (alias: iso-8859-1 latin-1)
  2. iso-2022-jp (alias: junet)

But the -*- coding: utf-8 -*- definition overrides this priority order. If that definition is missing, emacs opens the file as iso-latin-1 according to the priority order.
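
For comparison, Python itself honours the same "-*- coding: ... -*-" declaration when reading source files. The sketch below uses the standard library's tokenize.detect_encoding on the example file from this report, supplied as in-memory bytes; it only illustrates in-file declarations overriding a default and says nothing about how Emacs or Kate implement it.

```python
import io
import tokenize

# The example file from the description, as raw UTF-8 bytes.
source = (
    b"#!/usr/bin/env python\n"
    b"# -*- coding: utf-8 -*-  kate: encoding utf-8;\n"
    b"# Z\xc3\xbcrich\n"
)

# detect_encoding reads at most the first two lines and looks for a
# coding declaration (or a UTF-8 BOM) before falling back to a default.
encoding, first_lines = tokenize.detect_encoding(io.BytesIO(source).readline)

text = source.decode(encoding)
print(encoding)
print(text.splitlines()[2])
```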

The problem bites me because I'm using kdevelop for different projects that usually use latin-1 as encoding, but one of my own projects needs utf-8, and I sometimes forget to press F5 after opening a file within the IDE. After I save my changes, the file is corrupted, and I have to merge the utf-8 parts again manually.
Comment 4 Anders Lund 2006-10-25 20:04:10 UTC
When the encoding selection in katepart was designed, we wanted to let the 
command line or file dialog setting take precedence over an embedded 
variable.

The embedded variable should be used if none of those are present.

The problem, AFAICS, is that the file dialog *always* contains a setting, which 
should not be the case.
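
The intended precedence described in this comment can be sketched as follows. This is a hypothetical Python illustration, not katepart's actual code: the function names, the regular expression and the use of None for "no explicit choice" are all assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical pattern for a "kate: ... encoding NAME;" variable line.
MODELINE = re.compile(r"kate:.*?\bencoding\s+([\w.-]+)\s*;")


def embedded_encoding(head: str) -> Optional[str]:
    """Return the encoding named by an embedded kate variable, if any."""
    match = MODELINE.search(head)
    return match.group(1) if match else None


def resolve_encoding(explicit: Optional[str], head: str, default: str) -> str:
    """Pick an encoding: explicit choice, then embedded variable, then default.

    `explicit` stands for a command line or file dialog setting;
    None means the user made no explicit choice.
    """
    if explicit is not None:
        return explicit
    return embedded_encoding(head) or default


head = "#!/usr/bin/env python\n# -*- coding: utf-8 -*-  kate: encoding utf-8;\n"

# No explicit choice: the embedded variable wins over the default.
print(resolve_encoding(None, head, "iso-8859-1"))

# A dialog that always passes its preselected value hides the variable,
# which is the behaviour reported in this bug.
print(resolve_encoding("iso-8859-1", head, "iso-8859-1"))
```

In this sketch the reported behaviour corresponds to the file dialog never passing None, so the embedded variable is never consulted.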
Comment 5 Thomas Friedrichsmeier 2007-12-13 21:30:00 UTC
*** Bug 142725 has been marked as a duplicate of this bug. ***
Comment 6 Thomas Friedrichsmeier 2007-12-13 21:31:42 UTC
Confirming in KDE 3.5.8 and KDE 4.
Comment 7 Anders Lund 2008-04-07 20:27:07 UTC
SVN commit 794503 by alund:

set provided encoding (from command line or file dialog) after loading meta data.
BUG: 160353
CCBUG: 135581
(does this fix 135581 as well?)


 M  +2 -0      katedocmanager.cpp  
 M  +1 -1      kateviewdocumentproxymodel.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=794503
Comment 8 Dominik Haumann 2009-06-04 23:33:34 UTC
duplicate of bug #187033, as that one is shorter and still valid for KDE4.

*** This bug has been marked as a duplicate of bug 187033 ***