Bug 260855 - When creating a zip file accents are not encoded properly (i.e. "é" "è" etc.)
Summary: When creating a zip file accents are not encoded properly (i.e. "é" "è" etc.)
Status: RESOLVED DOWNSTREAM
Alias: None
Product: ark
Classification: Applications
Component: general
Version: 2.15
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Raphael Kubo da Costa
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-21 10:06 UTC by Mahendra Tallur
Modified: 2010-12-21 13:06 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:


Description Mahendra Tallur 2010-12-21 10:06:53 UTC
Version:           2.15 (using KDE 4.5.4) 
OS:                Linux

When creating a zip file, accented characters in filenames (e.g. "é", "è") are not encoded properly: each one is replaced by two question marks. This happens whether the archive is created from the Dolphin service menu or from Ark itself.

Reproducible: Always

Steps to Reproduce:
1) Create a file or a folder whose name contains accented characters such as "é", "è" or "ê" (French accents in this case).
2) In Dolphin, right-click on it and create a zip file (or do it through Ark).
3) Open the zip file with Ark or from the command line.

Actual Results:  
Each special character (or accent) is replaced by two question marks "??"


Expected Results:  
The filenames in the archive should be the same as those of the source files and folders.

Please note:
1) This occurs when creating zip files, but not when creating rar or tar.gz files.
2) The fact that a single character is replaced by two question marks seems to indicate that the filename was encoded in UTF-8 (where these accented letters take two bytes each) but that the zip file was not marked as using UTF-8.
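
The reasoning in note 2 can be illustrated with a minimal Python sketch. It assumes a reader that treats filename bytes as plain ASCII and prints "?" for every byte it cannot display; that substitution rule is an assumption made for illustration, not something confirmed about unzip or Ark. The helper name `shown_without_utf8` is hypothetical.

```python
def shown_without_utf8(name: str) -> str:
    """Mimic a tool that reads UTF-8 filename bytes as ASCII (assumed behaviour)."""
    raw = name.encode("utf-8")
    # Every byte outside ASCII becomes U+FFFD, which is then printed as "?".
    return raw.decode("ascii", errors="replace").replace("\ufffd", "?")


# "é" and "è" each occupy two bytes in UTF-8, hence two question marks each.
for ch in ("é", "è", "ê"):
    print(ch, len(ch.encode("utf-8")))

print(shown_without_utf8("Noël"))
```

Under that assumption, one accented letter always turns into exactly two placeholder characters, which matches the symptom described above.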
Comment 1 Mahendra Tallur 2010-12-21 10:47:22 UTC
An Archlinux user just told me he couldn't reproduce this issue (KDE 4.5.x as well). Now I wonder if it's a Kubuntu bug or if the bug is partly due to Kubuntu and partly to KDE...
Comment 2 Mahendra Tallur 2010-12-21 10:53:03 UTC
One more piece of information:

When listing the archive with unzip, I get question marks. When listing it within Ark, I get question marks. BUT when *extracting* the archive from the command line, I get, for instance:

"extracting : No??l" 

BUT the extracted file has the correct filename!
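
One way to check whether an archive marks its filenames as UTF-8 is to look at bit 11 (0x800) of each entry's general purpose flags, which the ZIP format uses for that purpose. The sketch below uses Python's standard `zipfile` module on an in-memory archive. It shows how Python's own writer behaves, not what the zip tool shipped with Ubuntu does, and the helper name `utf8_flag_set` is hypothetical.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: filename is stored as UTF-8


def utf8_flag_set(info: zipfile.ZipInfo) -> bool:
    """Return True if the entry declares its filename as UTF-8."""
    return bool(info.flag_bits & UTF8_FLAG)


# Build a small archive in memory with an accented entry name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Noël.txt", b"hello")

# Read it back and report the flag for every entry.
with zipfile.ZipFile(buf) as zf:
    infos = zf.infolist()

for info in infos:
    print(info.filename, utf8_flag_set(info))
```

To inspect an archive produced by another tool, pass its path to `zipfile.ZipFile` instead of `buf`.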
Comment 3 Mahendra Tallur 2010-12-21 11:16:10 UTC
It turns out this bug report should be closed. I'm really confused.

Actually, the very same behaviour occurs when using "zip" from the command line! As there's no such issue in Ubuntu, I assumed there wouldn't be one in Kubuntu either, so I didn't even check this at first, sigh...

KDE is not at fault here. Really sorry for the spam.
Comment 4 Raphael Kubo da Costa 2010-12-21 12:28:22 UTC
Just one thing: are you not using UTF-8 when creating those files with accents and the zip file afterwards?
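
A quick way to answer this kind of question, assuming Python 3 is installed, is to print the encodings the current session reports. This is only a sketch: it shows what Python sees for the locale and for filenames, which may differ from what zip itself uses. The helper name `is_utf8` is hypothetical.

```python
import codecs
import locale
import sys


def is_utf8(encoding_name: str) -> bool:
    """Return True if the given encoding name is an alias of UTF-8."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        # Unknown encoding names are treated as "not UTF-8".
        return False


# Encoding derived from the locale settings (LANG, LC_ALL, ...).
locale_encoding = locale.getpreferredencoding(False)
# Encoding Python uses to convert filenames to and from bytes.
filesystem_encoding = sys.getfilesystemencoding()

print("locale:", locale_encoding, is_utf8(locale_encoding))
print("filesystem:", filesystem_encoding, is_utf8(filesystem_encoding))
```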
Comment 5 Mahendra Tallur 2010-12-21 12:46:25 UTC
Well, I assume I should be using UTF-8 all the time. The fact that a single accent is replaced by two "??" suggests the filenames were encoded as UTF-8 in the archive.

Anyway, the problem is not in KDE. However, I found the following, which is really strange (I will open a new bug report for it in Launchpad).

By default, in Ubuntu 10.10 or 11.04, when creating a zip archive containing files whose names contain special characters, with either file-roller or zip from the command line, I get "??" instead of accents.

HOWEVER, after installing p7zip-full, filenames are encoded properly, even when just invoking zip from the command line. This seems to indicate that zip calls p7zip-full when the latter is present.

But it's not that simple, because this behaviour doesn't apply under Kubuntu! I installed p7zip-full and I still get "??" when calling zip.

(One more remark: people experience this issue under Debian and Ubuntu, but not under OpenSUSE and ArchLinux.)
Comment 6 Raphael Kubo da Costa 2010-12-21 13:06:58 UTC
ArchLinux seems to build its unzip package much like Debian (and, consequently, Ubuntu). However, if I create a file with accents in its name, it is shown correctly here with 'unzip -l'.

Bug 240727 is related, but it mentions zip files created in non-UTF-8 encodings.

7zip is known to handle these situations better.

I'll close this as a downstream bug (in this case, Ubuntu/Debian). If there is something wrong on our side, please don't hesitate to reopen. Thanks.