Bug 161588

Summary: Flash disk mounted with wrong codepage and/or charset
Product: [Unmaintained] kdelibs Reporter: Azamat S. Kalimoulline <turtle>
Component: general Assignee: Kevin Ottens <ervin>
Status: RESOLVED DUPLICATE    
Severity: normal CC: andrea.franceschini, chalkerx, dsent.zen, ervin, grundleborg, kevin.kofler, mikid, shafff, shrek2099, victorjss
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
Attachments: patch from comment #2
Patch which deals with every HAL mount option, not only 'iocharset='
Patch which deals with every HAL mount option, not only 'iocharset='

Description Azamat S. Kalimoulline 2008-05-04 10:32:13 UTC
Version:            (using KDE 4.0.3)
Installed from:    Ubuntu Packages
OS:                Linux

Instead of national characters, I see '?' signs (my system uses the ru_RU.UTF-8 locale, and the flash drive is formatted as vfat with codepage 866).
Comment 1 George Goldberg 2008-06-08 11:31:07 UTC
Where do you see the '?' characters? Which application shows them, and where in the application? Are there any places where the correct characters are shown?
Comment 2 Fredrik Johansson 2008-06-08 15:42:42 UTC
I am also seeing this behaviour. I live in Sweden, so I am using the LANG=sv_SE.UTF-8 locale.

Here is how it shows up for me.

I use a USB flash memory stick to move files between home and work. At work there is only Windows XP (also with a Swedish locale).

The memory stick is formatted as FAT32 (like most flash memory sticks). Running chcp in the Windows XP command prompt reveals that Windows mounts the drive as:
codepage 850
charset utf-8

When I get home again and plug my flash drive into my Kubuntu 8.04 KDE4 machine, it mounts, but without the much-needed iocharset=utf8 mount option.

So every file or folder with a localized name is displayed with ? in its name and can't be opened.

For example:
  Pupil_jobs/
     SörenAndersson.doc
     ÖstenÖrkenrud.doc
     ...

becomes:
  Pupil_jobs/
     S?renAndersson.doc
     ?sten?rkenrud.doc
     ...

You can't open those files in any tool or application.

However, if I unmount them and use pmount or mount them manually with the iocharset=utf8 option, I can browse my files without a problem.

Searching the web for a solution, I only found a deprecated way of using merge keys in HAL .fdi files (volume.policy.mount_options.iocharset), but these have never been supported by KDE (according to several mailing list archives).

The other solution recommended for KDE (by people on some forums) was to edit /etc/fstab, thus eliminating the use of HAL.

The HAL developers say that application writers know better which iocharset the user wants and have therefore deprecated volume.policy.*, while the KDE developers seem to trust HAL to know best.
At least that is what I can make of it by looking at the code.

This issue seems to have been debated for a long time
http://bugs.kde.org/show_bug.cgi?id=133456

So currently the automount feature is more or less crippled in KDE for everybody not using an English locale.

I really wish this bug could be solved...

In the meantime I wrote up a quick hack for solid/backends/hal/halstorageaccess.cpp that works for me. It just uses utf8 if it's the system default charset.

Index: solid/solid/backends/hal/halstorageaccess.cpp
===================================================================
--- solid/solid/backends/hal/halstorageaccess.cpp       (revision 818361)
+++ solid/solid/backends/hal/halstorageaccess.cpp       (working copy)
@@ -25,6 +25,7 @@
 #include <QtDBus/QDBusReply>
 #include <QtGui/QApplication>
 #include <QtGui/QWidget>
+#include <QtCore/QTextCodec>

 #include <unistd.h>

@@ -230,6 +231,14 @@
     QStringList options;
     QStringList halOptions = m_device->property("volume.mount.valid_options").toStringList();

+    if ( halOptions.contains("iocharset=") ) {
+       QTextCodec *codec = QTextCodec::codecForLocale();
+       QString charset = QString(codec->name()).toLower();
+       if(charset == "utf-8"){ // for some reason, mount didnt like UTF-8, convert to utf8
+           options << "iocharset=utf8";
+       }
+    }
+
     if (halOptions.contains("uid=")) {
         options << "uid="+QString::number(::getuid());
     }

Regards
Fredrik Johansson
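
The patch above only adds the option when the locale charset is UTF-8. As a minimal sketch of a more general mapping from a locale codeset name to an iocharset= option, written in plain standard C++ rather than Qt: the helper name iocharsetOptionFor is hypothetical and not part of Solid or HAL, and the naming rule it applies (lowercase, drop the hyphen after "utf" or "iso") is an assumption based on common kernel NLS table names such as utf8, iso8859-2 and koi8-r.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of Solid or HAL): turn a locale codeset name
// such as "UTF-8" or "ISO-8859-2" into a vfat "iocharset=" mount option.
// Assumption: the kernel NLS table name is the lowercased codeset with the
// hyphen after "utf" or "iso" removed; any other name is only lowercased.
std::string iocharsetOptionFor(const std::string &codeset)
{
    std::string name = codeset;
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (name.compare(0, 4, "utf-") == 0 || name.compare(0, 4, "iso-") == 0) {
        name.erase(3, 1); // "utf-8" -> "utf8", "iso-8859-2" -> "iso8859-2"
    }

    // An empty codeset yields no option at all, so the caller can skip it.
    return name.empty() ? std::string() : "iocharset=" + name;
}
```

In the patched function, the result could be appended to the options list only when HAL reports "iocharset=" as a valid option and the returned string is not empty.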

Comment 3 Azamat S. Kalimoulline 2008-06-09 06:36:32 UTC
I see '?' characters in Dolphin and in Konqueror when browsing files on the flash drive. I even see them with `ls /media/flash`. I ran `cat /etc/mtab` to see which options the device was mounted with, and the standard 'codepage=866,utf8' options are missing. When I mount it manually, everything works fine. KDE3 mounts devices correctly (also without 'codepage=866', but with the 'utf8' option, and it works fine too).
Comment 4 Christophe Marin 2008-06-13 09:04:03 UTC
*** Bug 163937 has been marked as a duplicate of this bug. ***
Comment 5 George Goldberg 2008-06-30 03:59:11 UTC
Reassigning to solid in the hope that the right person will see it there.
Comment 6 Danila Sentiabov 2008-07-30 20:11:46 UTC
I can confirm this in KDE 4.1 / openSUSE 10.3. All my external drives, flash and HDD, are mounted incorrectly by KDE 4, making them almost unusable.
It was bad enough in KDE 3.5, because some useful settings could not be set via the "mount properties" dialog, but at least some basic settings such as "UTF-8 charset", "mount directory" and the ability to disable that awful "lowercase names" thing were available. Now there are no mount settings at all, and all files with Russian names look like "????? ??? ?????".
This is the only major bug I have found in KDE 4 so far, yet on its own it's serious enough to fall back to 3.5. Manually mounting and unmounting each external drive that I need to use is not very pleasant. Are there no workarounds that don't involve manual mounting? Maybe there are at least some settings in some configuration file that could be set as defaults for all drives?
Sorry for my English, I tried my best )
Comment 7 bgn66922 2008-07-31 16:51:35 UTC
The bug still exists in KDE 4.1. KDE4 mounts vfat drives without the utf8 option, so any non-English characters in file names become "???", and there is no place to tweak this as there was in KDE 3.5.
Comment 8 bgn66922 2008-08-01 02:56:43 UTC
*** This bug has been confirmed by popular vote. ***
Comment 9 kujub 2008-08-02 11:48:23 UTC
Created attachment 26565 [details]
patch from comment #2

Tried the patch on ArchLinux with KDE 4.1.0.
It seems to work well.
Thank you, Fredrik!
Comment 10 Rubens de Souza Matos Júnior 2008-08-04 21:17:20 UTC
I have the same problem, using Debian Lenny with the pt_BR.UTF-8 locale.
Comment 11 Valentine Sinitsyn 2008-08-06 16:59:21 UTC
Here are my two cents.

The patch above doesn't account for locales other than UTF-8 (which are rare, to be honest). Anyway, the following is another small patch (at the works-for-me stage) that takes care of all the options specified in the HAL config files (it can be seen as a rewrite of the similar patch for KDE 3.x, bug 133456).
Comment 12 Valentine Sinitsyn 2008-08-06 17:03:30 UTC
Created attachment 26699 [details]
Patch which deals with every HAL mount option, not only 'iocharset='
Comment 13 Valentine Sinitsyn 2008-08-06 17:39:33 UTC
Created attachment 26701 [details]
Patch which deals with every HAL mount option, not only 'iocharset='
Comment 14 Valentine Sinitsyn 2008-08-06 17:40:40 UTC
Comment on attachment 26701 [details]
Patch which deals with every HAL mount option, not only 'iocharset='

Attachment 26699 [details] contained a small typo, which is now gone.
Comment 15 Shrek Big 2008-08-07 04:58:05 UTC
Honestly speaking, this "bug" is not a REAL bug at all in the eyes of developers or Linux experts, because they can always do everything manually. However, since KDE is a desktop environment targeted at regular desktop users, making average users do manual work each time they plug in an external storage device seems to contradict the original goal of KDE, doesn't it? So it IS a bug from a regular desktop user's perspective.

I was unaware of this "bug" because I use Gnome most of the time and wanted to give the new KDE a try. When I plugged in my external hard drive to make backups, as I usually did in Gnome, the unfortunate "lower case" default and the unchangeable options in KDE led to the deletion of most of my backed-up data on the external hard drive, hundreds of gigabytes! What a nightmare! Of course, experts can always say that you can do the mount job manually, but that is not the point of this post.
Comment 16 Nikita Skovoroda 2008-08-18 09:23:15 UTC
Any news on this bug?

I can confirm this on openSUSE with KDE 4.0, 4.1.0 and 4.1.61 (from the openSUSE KDE4:Unstable packages) on 4 different machines. It looks like this bug affects all non-English users.

Could that patch be included in 4.1.1 / 4.2 ?
Comment 17 Victor Suarez 2008-08-23 19:46:38 UTC
I'm Spanish, and this bug-problem-whatever forced me to install kde3 dolphin because, at least, I can change the mount options by right clicking on the device. A week ago I supposed that the default configuration would let me carry a Java project in my USB memory stick, but the project was copied with all filenames in lower case (WEB-INF folders, META-INF folders, class names, etc.). I lost a whole working day :(

Of course I can resolve this by using the command line, but when I'm working, I'm looking for productivity and this is the opposite to productivity.

This bug and bug #165044 (similar problems browsing Samba shares) make KDE 4.1 very difficult to adopt, but I'm trying.

Regards
Victor
Comment 18 kujub 2008-09-27 20:09:21 UTC
Installed KDE 4.1.2 today. (ArchLinux-Package without any patch related to this bug.) The bug seems to be fixed.

USB-Memory is mounted like this:
/dev/sdf1 on /media/SD KURT 1 type vfat (rw,nosuid,nodev,uid=1000,codepage=437,iocharset=utf8)
and at least German umlauts work out of the box now.
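
For anyone who wants to check a mount line like the one above programmatically, here is a minimal sketch in plain standard C++. The helper name mountOptionValue is hypothetical; it only assumes that the options are a comma-separated list of "key" or "key=value" items, as shown in the mount output above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: look up one key in a comma-separated mount option
// string as printed by mount or /etc/mtab. Returns the text after "key=",
// or an empty string when the key is absent or carries no value.
std::string mountOptionValue(const std::string &options, const std::string &key)
{
    const std::string prefix = key + "=";
    std::istringstream stream(options);
    std::string item;

    while (std::getline(stream, item, ',')) {
        if (item.compare(0, prefix.size(), prefix) == 0) {
            return item.substr(prefix.size());
        }
    }
    return std::string();
}
```

Calling it with the option string from the mount line above and the key "iocharset" returns "utf8"; for a drive mounted without that option it returns an empty string.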
Comment 19 Kevin Kofler 2008-09-28 12:16:32 UTC

*** This bug has been marked as a duplicate of bug 161673 ***
Comment 20 Nick Shaforostoff 2008-09-28 14:41:08 UTC
> UTF-8 (which are rare, to be honest).
a. how do you know?
b. even if this is so, I'm glad KDE will support transition to UTF-8.
Comment 21 Kevin Kofler 2008-09-28 14:43:54 UTC
Hardcoded iocharset=utf8 means this assumes a UTF-8 locale.
Comment 22 Danila Sentiabov 2008-09-28 17:41:11 UTC
> Hardcoded iocharset=utf8 means this assumes a UTF-8 locale. 
Most recent distributions (since about 2006, I think) use a UTF-8 locale by default.
Comment 23 Kevin Kofler 2008-09-28 18:34:05 UTC
Hey, I'm not arguing that point. Fedora has used UTF-8 by default for even longer than that. It was still called Red Hat Linux at that time! People really ought to use UTF-8 locales by now.
Comment 24 Szczepan Hołyszewski 2009-02-14 11:06:57 UTC
Devs, for chrissake, stop FORCING UTF8 on people!!!!!!!!!!!!!!!! UTF8 sucks, it is NOT viable and it will take decades before ALL programmers understand that There Is No Plain Text. Meanwhile everyone in the world (except those who only use the diacritics-challenged language for which the original ASCII code was designed) WILL see question marks and gibberish in 30% of applications. C'mon, EVEN THE VERY DOLPHIN CAN'T DISPLAY ITEMS ON THE PLACES PANE CORRECTLY IN LANGUAGES THAT USE NON-ASCII CHARACTERS! This is an utter embarrassment! Either do UTF8 right, or don't do it at all, and DON'T FORCE IT ON PEOPLE! ARRRGH!!!!!

Folks here report character encoding problems when mounting removable drives while using UTF8 locale. Well, I am using iso-8859-2 locale and I have a different character encoding problem: I don't see question marks but gibberish characters, two for each national character in a filename - a dead giveaway of a borked attempt at using UTF8 as internal representation somewhere along the way. It turns out that UTF8 support is not only broken but it also BREAKS THINGS FOR PEOPLE WHO DON'T USE IT. That's just totally, bitterly, caustically ridiculous.
Comment 25 Kevin Kofler 2009-02-14 18:10:59 UTC
The whole point of this sort of changes is to "do UTF-8 right". You should be using a UTF-8 locale these days.

That said, it could probably be fixed to support other locales as well.
Comment 26 Szczepan Hołyszewski 2009-02-15 19:10:51 UTC
> The whole point of this sort of changes is to "do UTF-8 right".

In my scenario described above, the only way to do UTF-8 right is NOT to do it at all. UTF-8 is NOT PRESENT: filenames are encoded in iso-8859-2, and the locale is iso-8859-2. UTF-8 is an intruder here.

> You should be using a UTF-8 locale these days.

Or else?

NO WAY until it solves significantly more problems than it brings, and bring problems it does big time, sometimes even for those who don't use it.
Comment 27 Andrea Franceschini 2009-04-30 02:12:45 UTC
Sorry guys but I'm still facing this bug under kde-4.2.2 and it's quite blocking me, since I'm using my primary machine and I use it for work.

I tried pretty much everything I found on Google with keywords "kde4 automount utf8" (like compiling codepages in the kernel, making fdi files for HAL etc...) but nothing solved the problem -- except manual mount, of course, but it's not feasible at all for me.

$locale says I'm using it_IT.utf8@euro, which is right, locales are generated for glibc and everything about localization seems to be perfectly set. Is there any news about this bug?
Comment 28 Nick Shaforostoff 2009-04-30 16:42:01 UTC
Can you try with just it_IT.utf8 (without @euro)?
Comment 29 Kevin Kofler 2009-04-30 16:44:42 UTC
The correct locale is actually it_IT.UTF-8 (not utf8). And @euro is redundant, the UTF-8 locales always use the Euro. @euro was just used to distinguish ISO-8859-15 (@euro) from ISO-8859-1 (the old locales) back in the day.
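
To make the parts of such a locale name visible, here is a minimal sketch in plain standard C++ that splits a name of the usual language[_territory][.codeset][@modifier] form. The struct and function names are hypothetical, and the sketch only splits the string; it does not validate the parts or decide which spelling of the codeset is correct.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the parts of a locale name.
struct LocaleName {
    std::string language;
    std::string territory;
    std::string codeset;
    std::string modifier;
};

// Split "language[_territory][.codeset][@modifier]" into its parts.
// Missing parts are left as empty strings.
LocaleName parseLocaleName(const std::string &name)
{
    LocaleName out;
    std::string rest = name;

    // Peel the parts off from the right: modifier, codeset, territory.
    const std::string::size_type at = rest.find('@');
    if (at != std::string::npos) {
        out.modifier = rest.substr(at + 1);
        rest.erase(at);
    }
    const std::string::size_type dot = rest.find('.');
    if (dot != std::string::npos) {
        out.codeset = rest.substr(dot + 1);
        rest.erase(dot);
    }
    const std::string::size_type underscore = rest.find('_');
    if (underscore != std::string::npos) {
        out.territory = rest.substr(underscore + 1);
        rest.erase(underscore);
    }
    out.language = rest;
    return out;
}
```

For the name from comment 27, it_IT.utf8@euro, this yields language "it", territory "IT", codeset "utf8" and modifier "euro", which makes it easy to see the two parts discussed above.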
Comment 30 Andrea Franceschini 2009-04-30 16:55:24 UTC
My mistake, sorry. It now works, thanks.