Bug 89866

Summary: SAVE FILE: indicate which chars can't be encoded
Product: [Applications] kate
Reporter: Marcin Kasperski <Marcin.Kasperski>
Component: encoding
Assignee: Christoph Cullmann <christoph>
Status: RESOLVED INTENTIONAL    
Severity: wishlist
CC: jaba, marco
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Debian testing   
OS: Linux   
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description Marcin Kasperski 2004-09-20 12:17:53 UTC
Version:            (using KDE 3.2.3)
Installed from:    Debian testing/unstable Packages

Background: I use a Polish-language environment; in particular, my usual text file encoding is iso-8859-2.

Fairly frequent problem:

- I edit a text file in Kate (in iso-8859-2 encoding)

- I copy and paste some text from an OpenOffice document

- When I try to save in kate, I get the following error:
'Dokument nie mógł być zapisany ponieważ wybrane kodowanie nie może zachować każdego unikodowego znaku. Jeśli nie jesteś pewien jakiego kodowania użyć spróbuj UTF-8 lub UTF-16.'
(Translated into English, this means: The document could not be saved because the selected encoding cannot represent every Unicode character in it. If you are unsure which encoding to use, try UTF-8 or UTF-16.)

But what if I DO NOT want to convert the file to Unicode, and really want it saved as iso-8859-2?

There are two or even three separate things which would be really useful in this situation:

a) An option to force saving the file in my preferred encoding, even if this means leaving some characters out or replacing them with, say, a question mark.

b) Some way to find WHICH character (or characters) caused the problem. I can imagine highlighting them or a compiler-style listing (whatever presentation method is chosen, the error window mentioned above could have a button that activates it).

c) After reviewing my document I strongly suspect that the whole problem is caused by some Unicode-encoded space or punctuation mark (I really cannot find any character which is not ASCII or Polish). Such characters could be translated automatically in this context.
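
Points a) and b) can be illustrated with a minimal, self-contained C++ sketch. This is not Kate code: `Unencodable`, `findUnencodable`, `replaceUnencodable` and `fitsLatin1` are hypothetical names, and the encodability test is passed in as a predicate. The demo predicate accepts ISO-8859-1 code points only, as a simple stand-in; a real iso-8859-2 check would need that charset's own table or a codec query.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One character that the target encoding cannot represent (hypothetical type).
struct Unencodable {
    std::size_t line;    // 1-based line number
    std::size_t column;  // 1-based column, counted in code points
    char32_t codePoint;  // the offending character
};

// Stand-in predicate (assumption): accepts exactly the ISO-8859-1 range
// U+0000..U+00FF. An iso-8859-2 check would need that charset's table.
bool fitsLatin1(char32_t cp) { return cp <= 0xFF; }

// Point b): list every character rejected by canEncode, compiler-style.
std::vector<Unencodable> findUnencodable(
    const std::u32string& text,
    const std::function<bool(char32_t)>& canEncode)
{
    std::vector<Unencodable> hits;
    std::size_t line = 1, column = 1;
    for (char32_t cp : text) {
        if (cp == U'\n') {  // a newline starts the next line
            ++line;
            column = 1;
            continue;
        }
        if (!canEncode(cp))
            hits.push_back({line, column, cp});
        ++column;
    }
    return hits;
}

// Point a): force the save by substituting a placeholder character.
std::u32string replaceUnencodable(
    std::u32string text,
    const std::function<bool(char32_t)>& canEncode,
    char32_t placeholder = U'?')
{
    for (char32_t& cp : text)
        if (!canEncode(cp))
            cp = placeholder;
    return text;
}
```

Calling `findUnencodable(text, fitsLatin1)` yields the line and column of each problem character, which an editor could then highlight or show in a list; `replaceUnencodable(text, fitsLatin1)` produces the lossy text that a forced save would write.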
Comment 1 Marcin Kasperski 2004-09-20 12:26:36 UTC
Ad c) I finally found which character probably caused the problem in my last case: it was ,, (low double quotation mark). IIRC it should be present in iso-8859-2, but I am not sure. Either way, I would really prefer having it converted to " or even (with some warning) to ? rather than getting an error that forces me to save in UTF-8 and convert externally.
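
The automatic conversion asked for in c) could look roughly like the following C++ sketch. It is an illustration only: `asciiFallbacks` and `transliteratePunctuation` are hypothetical names, and the table is a small, deliberately incomplete selection of typographic characters, not a list taken from Kate or any library.

```cpp
#include <string>
#include <unordered_map>

// Illustrative, incomplete table (assumption): typographic characters
// mapped to plain ASCII look-alikes.
const std::unordered_map<char32_t, std::u32string>& asciiFallbacks()
{
    static const std::unordered_map<char32_t, std::u32string> table = {
        {U'\u201E', U"\""},   // low double quotation mark
        {U'\u201C', U"\""},   // left double quotation mark
        {U'\u201D', U"\""},   // right double quotation mark
        {U'\u2018', U"'"},    // left single quotation mark
        {U'\u2019', U"'"},    // right single quotation mark
        {U'\u2013', U"-"},    // en dash
        {U'\u2014', U"-"},    // em dash
        {U'\u2026', U"..."},  // horizontal ellipsis
        {U'\u2009', U" "},    // thin space
    };
    return table;
}

// Replace every character found in the table; leave all others untouched.
std::u32string transliteratePunctuation(const std::u32string& text)
{
    const auto& table = asciiFallbacks();
    std::u32string out;
    out.reserve(text.size());
    for (char32_t cp : text) {
        auto it = table.find(cp);
        if (it != table.end())
            out += it->second;  // substitute the ASCII look-alike
        else
            out += cp;          // keep the original character
    }
    return out;
}
```

Polish letters are not in the table, so they pass through unchanged; only characters with an obvious ASCII equivalent are rewritten, and anything still unencodable afterwards would need the warning or the ? fallback.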
Comment 2 Christoph Cullmann 2004-10-25 23:46:18 UTC
CVS commit by cullmann: 

change some save file checks
BUG: 89866


  M +17 -22    katedocument.cpp   1.766


--- kdelibs/kate/part/katedocument.cpp  #1.765:1.766
@@ -2690,6 +2690,7 @@ bool KateDocument::saveFile()
   // we really want to save this file ?
   //
-  bool reallySaveIt = !m_buffer->loadingBorked() || (KMessageBox::warningYesNo(widget(),
-      i18n("This file could not be loaded correctly due to lack of temporary disk space. Saving it could cause data loss.\n\nDo you really want to save it?")) == KMessageBox::Yes);
+  if (m_buffer->loadingBorked() && (KMessageBox::warningYesNo(widget(),
+      i18n("This file could not be loaded correctly due to lack of temporary disk space. Saving it could cause data loss.\n\nDo you really want to save it?")) != KMessageBox::Yes))
+    return false;
 
   if ( !url().isEmpty() )
@@ -2701,13 +2702,13 @@ bool KateDocument::saveFile()
       if (!isModified())
       {
-        if (!(KMessageBox::warningYesNo(0,
-               str + i18n("Do you really want to save this unmodified file? You could overwrite changed data in the file on disk.")) == KMessageBox::Yes))
-          reallySaveIt = false;
+        if (KMessageBox::warningYesNo(0,
+               str + i18n("Do you really want to save this unmodified file? You could overwrite changed data in the file on disk.")) != KMessageBox::Yes)
+          return false;
       }
       else
       {
-        if (!(KMessageBox::warningYesNo(0,
-               str + i18n("Do you really want to save this file? Both your open file and the file on disk were changed. There could be some data lost.")) == KMessageBox::Yes))
-          reallySaveIt = false;
+        if (KMessageBox::warningYesNo(0,
+               str + i18n("Do you really want to save this file? Both your open file and the file on disk were changed. There could be some data lost.")) != KMessageBox::Yes)
+          return false;
       }
     }
@@ -2717,13 +2718,10 @@ bool KateDocument::saveFile()
   // can we encode it if we want to save it ?
   //
-  bool canEncode = true;
-
-  if (reallySaveIt)
-    canEncode = m_buffer->canEncode ();
-
-  //
-  // start with worst case, we had no success
-  //
-  bool success = false;
+  if (!m_buffer->canEncode ()
+       && (KMessageBox::warningYesNo(0,
+           i18n("The selected encoding cannot encode every unicode character in this document. Do you really want to save it? There could be some data lost.")) != KMessageBox::Yes))
+  {
+    return false;
+  }
 
   // remove file from dirwatch
@@ -2733,6 +2731,5 @@ bool KateDocument::saveFile()
   // try to save
   //
-  if (reallySaveIt && canEncode)
-    success = m_buffer->saveFile (m_file);
+  bool success = m_buffer->saveFile (m_file);
 
   // update the md5 digest
@@ -2786,7 +2783,5 @@ bool KateDocument::saveFile()
   // display errors
   //
-  if (reallySaveIt && !canEncode)
-    KMessageBox::error (widget(), i18n ("The document could not be saved, as the selected encoding cannot encode every unicode character in it. If you are unsure of which encoding to use, try UTF-8 or UTF-16."));
-  else if (reallySaveIt && !success)
+  if (!success)
     KMessageBox::error (widget(), i18n ("The document could not be saved, as it was not possible to write to %1.\n\nCheck that you have write access to this file or that enough disk space is available.").arg(m_url.url()));
 


Comment 3 Marcin Kasperski 2004-10-28 15:53:14 UTC
I am not sure what exactly the patch above does (I suspect it allows me to force the save), but it certainly leaves out the one thing for which I'd like to keep this bug open: it does not show WHICH character or characters cause the save problem.

So please leave it open for this problem: implementing some way to show the user which characters in the file prevent the conversion (see also point b) in my initial report).
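
As far as the diff in Comment 2 shows, the encoding check now leads to a yes/no warning instead of a hard error, so answering Yes does force the save. A rough C++ sketch of that control flow follows. `SaveHooks` and `saveFileSketch` are hypothetical stand-ins for `m_buffer->canEncode()`, `KMessageBox::warningYesNo` and `m_buffer->saveFile(m_file)`; this is not the actual Kate code.

```cpp
#include <functional>

// Hypothetical stand-ins for the buffer and message box calls in the diff.
struct SaveHooks {
    std::function<bool()> canEncode;  // role of m_buffer->canEncode()
    std::function<bool()> askYesNo;   // true if the user answers Yes to the warning
    std::function<bool()> writeFile;  // role of m_buffer->saveFile(m_file)
};

// Mirrors the post-patch order of checks: warn, let the user abort,
// otherwise write the file even though some characters may be lost.
bool saveFileSketch(const SaveHooks& hooks)
{
    if (!hooks.canEncode() && !hooks.askYesNo())
        return false;  // user declined the lossy save
    return hooks.writeFile();
}
```

Nothing in this flow reports which characters failed to encode; the warning is all-or-nothing, which is the gap described above.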
Comment 4 Richard Neill 2005-03-06 02:28:01 UTC
I agree. This problem is also caused by Microsoft "smart quotes", usually when pasting text from the web. Personally, I'd like the option to force these into ASCII.

The package called "demoroniser" is good for this (it's a short Perl script).


Comment 5 Christoph Cullmann 2005-03-24 12:21:46 UTC
*** Bug 80305 has been marked as a duplicate of this bug. ***
Comment 6 Thomas Friedrichsmeier 2007-12-07 15:24:42 UTC
*** Bug 104534 has been marked as a duplicate of this bug. ***
Comment 7 Christoph Cullmann 2015-10-08 09:02:35 UTC
Dear user,

this wish list item is now closed, as it wasn't touched in the last two years and no contributor stepped up to implement it.

The Kate/KTextEditor team is very small and we can only just keep up with fixing bugs. Therefore wishes that show no activity for two years or more will be closed from now on, so that we keep at least some overview of the users' 'current' wishes.

If you want your feature to be implemented, please step up and provide a patch for it. If you think it is really needed, you can reopen your request, but keep in mind that if no good new arguments are made and nobody is attracted to help implement it, it will expire again in two years.

We have a nice website, kate-editor.org, that provides all the information needed to contribute; please make use of it. For highlighting improvements, our user manual shows how to write syntax definition files.

Greetings
Christoph