Bug 245334 - Kate does not allow to change auto-detected encoding.
Summary: Kate does not allow to change auto-detected encoding.
Alias: None
Product: kate
Classification: Applications
Component: encoding
Version: SVN
Platform: Archlinux Linux
Importance: NOR normal with 35 votes
Target Milestone: ---
Assignee: KWrite Developers
Depends on:
Reported: 2010-07-21 17:46 UTC by Nikita Skovoroda
Modified: 2012-11-01 15:16 UTC
CC List: 3 users

See Also:
Latest Commit:
Version Fixed In:

file to reproduce the bug (65 bytes, text/plain)
2010-08-07 20:16 UTC, Nikita Skovoroda

Description Nikita Skovoroda 2010-07-21 17:46:50 UTC
Version:           SVN (using Devel) 
OS:                Linux

Sometimes I want to change the auto-detected encoding.
For example: I want to overwrite the file and save it in a different encoding.
Or there are two encodings in the file, and I want to read the second part of it.
The encoding menu does not do anything if autodetect is turned on in settings.

Reproducible: Always

Steps to Reproduce:
Set universal encoding autodetection in kate (default).
Open a non-empty file.
Select another encoding from menu.

Actual Results:  
Nothing happens.

Expected Results:  
The file is shown in the selected encoding.

If the file takes some time to load, you can see it in the desired encoding for a moment, but it then switches back immediately.
Comment 1 Nikita Skovoroda 2010-07-22 11:37:21 UTC
$ kate --version
Qt: 4.6.3
KDE: 4.4.92 (KDE 4.4.92 (KDE 4.5 RC2))
Kate: 3.4.92
Comment 2 Nikita Skovoroda 2010-07-26 11:49:32 UTC
Ok. On a mixed (cp1251 + utf8) file, kate forces cp1251.

Default is utf8.
Auto-detected is cp1251.

Setting anything but utf8 works, setting utf8 sets cp1251.
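
Editor's note: the asymmetry reported here (single-byte encodings apply, utf8 does not) is consistent with how strict decoding behaves on a mixed file. The following is a minimal Python sketch, not Kate's code; the sample bytes and the helper name decodes_strictly are made up for illustration.

```python
# Hypothetical sample: the same Cyrillic word stored once in cp1251
# and once in UTF-8, mimicking a file with mixed encodings.
cp1251_part = "привет".encode("cp1251")
utf8_part = "привет".encode("utf-8")
mixed = cp1251_part + b"\n" + utf8_part


def decodes_strictly(data: bytes, encoding: str) -> bool:
    """Return True if every byte sequence in data is valid for encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# UTF-8 has structural rules, so the cp1251 bytes make the strict decode fail.
utf8_ok = decodes_strictly(mixed, "utf-8")
# KOI8-R assigns a character to every byte value, so it never fails
# (the result is simply unreadable for the parts in another encoding).
koi8r_ok = decodes_strictly(mixed, "koi8-r")
# For this particular sample, cp1251 also decodes without error.
cp1251_ok = decodes_strictly(mixed, "cp1251")
```

An encoding that can reject its input (utf8, ucs-2) is therefore the only kind that can trigger a fallback to auto-detection on such a file.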
Comment 3 Nikita Skovoroda 2010-08-07 20:16:03 UTC
Created attachment 49904
file to reproduce the bug

Cannot open this file with utf-8 encoding when auto-detect is enabled.
Comment 4 Nikita Skovoroda 2010-08-07 20:51:43 UTC
$ kate --version
Qt: 4.6.3
KDE: 4.4.95 (KDE 4.4.95 (KDE 4.5 >= 20100723))
Kate: 3.4.95

Comment 5 Christoph Cullmann 2010-08-07 21:55:06 UTC
SVN commit 1160327 by cullmann:

    debug helper output, show me if a loading try with an encoding failed
    the bug 245334 is invalid
    if you try to open a file which is not utf-8 == contains invalid characters, the encoding detection will kick in
    this is intended behaviour, else you can lose file content
    while i can understand, that people might want to enforce an encoding even if it is wrong, if they have multiple encodings in one file, which is btw. no

CCBUG: 245334

 M  +3 -1      katetextbuffer.cpp  

WebSVN link: http://websvn.kde.org/?view=rev&revision=1160327
Comment 6 Nikita Skovoroda 2010-08-08 07:54:00 UTC
Ok. I can give you at least two use-cases.

1) It is a broken web page with mixed encodings. This happens sometimes, and when I need to copy (or find) something from the utf-8 part of the page, I can't due to this behaviour.

2) The file is in cp1251. I want to save it as utf-8. The file is remote, but Kate, thanks to KIO, opens such files perfectly, so there is no need to copy the file to the home folder.
What I can do: open the file, copy the text (ctrl-a, ctrl-c), select utf-8 encoding, select all text (ctrl-a), paste the correct text (ctrl-v), save the file.
With auto-detect not allowing the third step, I have to clear the file first, which requires more actions.

Anyway, this is confusing. If the user selects utf-8, they want to view this file in utf-8 (or save it in utf-8, overriding all text), but the auto-detection thinks it knows better than the user what the user wants.

If I click on, say, cp1252 (or koi8-r), it applies cp1252 (or koi8-r). If I click on utf-8 (or ucs-2), it does not apply utf-8 (or ucs-2).

I understand that one-byte and multi-byte encodings are different, but it works somehow without autodetection.
This is very confusing behaviour from a user's point of view.
Comment 7 Christoph Cullmann 2010-08-08 10:59:34 UTC
That it is confusing is perhaps correct, but it is the only way to protect people from killing their data. For Kate, setting the encoding from the menu is no different from trying the encoding configured as the standard one. If that doesn't work, it will always fall back to auto-detection, and if that doesn't work either, it will try the fallback encoding given in the config.

I could change the behaviour of the "Encoding" menu so that it does not use this heuristic but really enforces the usage of the given encoding.
But that is a feature and not a minimal change.
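
Editor's note: the fallback chain described in this comment can be sketched roughly as below. This is a Python illustration under stated assumptions, not the C++ code in katetextbuffer.cpp: the names try_decode, load_with_fallback and toy_detect are hypothetical, the detector is a stub that always answers cp1251, and decoding the last-resort fallback with replacement characters is an assumption of this sketch.

```python
from typing import Callable, Optional, Tuple


def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Strictly decode data; return None if it is invalid in encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


def load_with_fallback(
    data: bytes,
    requested: str,
    detect: Callable[[bytes], Optional[str]],
    fallback: str,
) -> Tuple[str, str]:
    """Return (encoding actually used, decoded text)."""
    # 1) The requested encoding (menu choice or configured standard).
    text = try_decode(data, requested)
    if text is not None:
        return requested, text
    # 2) Auto-detection, only reached if step 1 failed.
    detected = detect(data)
    if detected is not None:
        text = try_decode(data, detected)
        if text is not None:
            return detected, text
    # 3) Configured fallback encoding (assumption: never fails, bad bytes
    #    are replaced).
    return fallback, data.decode(fallback, errors="replace")


def toy_detect(data: bytes) -> Optional[str]:
    """Hypothetical stand-in for a real encoding detector."""
    return "cp1251"


data = "привет".encode("cp1251")
used_for_utf8, _ = load_with_fallback(data, "utf-8", toy_detect, "iso-8859-15")
used_for_koi8, _ = load_with_fallback(data, "koi8-r", toy_detect, "iso-8859-15")
```

With this sample, requesting utf-8 ends up using cp1251, while requesting koi8-r is honoured because that decode cannot fail. This matches the behaviour reported in comments 2 and 6.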
Comment 8 Nikita Skovoroda 2010-08-08 22:10:23 UTC
What if the user wants to kill that data because that part is not needed, and wants to view the other part? I posted two use-cases here.

And that was not the only way.

The old way is MUCH better: show the user a warning (with a "no more" checkbox) and, maybe, block file saving (but allow unblocking it). That was a little annoying, but at least reasonable.

Also, having no text editor at all would even more certainly prevent users from killing their data with it.

My opinion is that the list should look like this:
1) User-selected encoding - overrides everything. The user might receive a warning (like in 4.4) if there are problems with multi-byte encodings.
2) Auto-detected encoding - overrides default. Auto-detection is optional, but turned on by default.
3) Default encoding - the system one.

When a text editor (used partially by developers) thinks that it is smarter than the user, it's just not right. The most it should do is give a warning, like it did before, in KDE 4.4.
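
Editor's note: the precedence proposed in this comment amounts to a simple selection rule. A minimal Python sketch, with the hypothetical function name pick_encoding and None standing for "not set" or "detection off or inconclusive":

```python
from typing import Optional


def pick_encoding(
    user_selected: Optional[str],
    auto_detected: Optional[str],
    system_default: str,
) -> str:
    """Choose the encoding to load with, following the proposed priority."""
    # 1) An explicit user choice overrides everything.
    if user_selected is not None:
        return user_selected
    # 2) Otherwise use the auto-detected encoding, if there is one.
    if auto_detected is not None:
        return auto_detected
    # 3) Otherwise use the system default.
    return system_default
```

Under this rule, pick_encoding("utf-8", "cp1251", "utf-8") yields "utf-8" even when detection disagrees; any decoding problems would then be reported as a warning rather than changing the choice.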
Comment 9 Timur Sultanov 2010-08-11 14:44:26 UTC
This also affects KWrite.

> kwrite --version
Qt: 4.6.3
KDE: 4.5.00 (KDE 4.5.0)
KWrite: 4.5.00 (KDE 4.5.0)
Comment 10 Alexander Kandaurov 2011-01-14 12:24:34 UTC
Maybe this was done to prevent users from killing their data, but the way it is done is really awful. I guess any typical user who sees a program not applying the encoding they just selected would consider such behaviour buggy; the encoding selection menu will seem not to work. If such unexpected behaviour is intentional, then the program must at least show, for example, a message explaining why the user-selected encoding was not applied.
Comment 11 Alexander Kandaurov 2011-01-14 12:24:59 UTC
*** This bug has been confirmed by popular vote. ***
Comment 12 Christoph Cullmann 2012-11-01 14:56:59 UTC
Git commit f2bcfd4fae3120d55c45797eea005a5c147a152b by Christoph Cullmann.
Committed on 01/11/2012 at 15:55.
Pushed by cullmann into branch 'master'.

allow the user to hard set encoding via encoding menu
if the user chooses an encoding there, the triggered reload of the document will use exactly that one
if an error occurs (e.g. the file is not in that encoding), a warning will be shown in the editing area, but the file will be loaded as well as possible with this encoding

M  +3    -3    part/buffer/katetextbuffer.cpp
M  +2    -1    part/buffer/katetextbuffer.h
M  +2    -2    part/document/katebuffer.cpp
M  +2    -1    part/document/katebuffer.h
M  +8    -2    part/document/katedocument.cpp
M  +9    -0    part/document/katedocument.h
M  +1    -1    part/tests/encoding/kateencodingtest.cpp
M  +2    -2    part/view/kateviewhelpers.cpp
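
Editor's note: the behaviour described in this commit message (use exactly the chosen encoding, warn on errors, load as well as possible) can be illustrated with a minimal Python sketch. It is not the code from the commit; load_forced and the sample bytes are hypothetical, and using replacement characters for undecodable bytes is an assumption of this sketch.

```python
from typing import Tuple


def load_forced(data: bytes, encoding: str) -> Tuple[str, bool]:
    """Decode with exactly the given encoding.

    Returns (text, had_errors). When had_errors is True the caller would
    show a warning; undecodable bytes appear as U+FFFD in the text.
    """
    try:
        return data.decode(encoding), False
    except UnicodeDecodeError:
        return data.decode(encoding, errors="replace"), True


# Hypothetical mixed file: a cp1251 part followed by a UTF-8 part.
mixed = "привет".encode("cp1251") + b"\n" + "привет".encode("utf-8")

text, had_errors = load_forced(mixed, "utf-8")
```

For this sample the cp1251 part is shown as replacement characters, while the UTF-8 part stays readable, which is what use case 1 from comment 6 asks for.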

Comment 13 Nikita Skovoroda 2012-11-01 15:12:12 UTC
Thanks! =)
Comment 14 Christoph Cullmann 2012-11-01 15:16:35 UTC
Sorry that this took so long, I just had too much other stuff to do :) Thanks for your concrete use-case description.