I have a zip file that was packed on Windows and is password protected. Unfortunately, the password contains some local characters from the cp1250 charset, like ž, ř, ů. When I try to unpack the zip, Ark correctly asks me for the password, but there is no way I can enter the non-UTF-8 characters into the field. Can you add an encoding selector for the password?
(In reply to Jakub Holý from comment #0)
> I have a zip file, that has been packed on Windows and is password
> protected. Unfortunately, the password has some local characters from cp1250
> charset - like ž, ř, ů.
>
> When I try to unpack the zip, Ark asks me for password correctly, but there
> is no way I can enter the non-UTF8 characters into the field.

How so? Are you saying that you cannot even _type_ those characters into the password dialog? I just tried and I was able to paste the ů character just fine.
> How so? Are you saying that you cannot even _type_ those characters into the
> password dialog? I just tried and I was able to paste the ů character just
> fine.

I cannot type the localized ů: the dialog enters the Unicode ů (U+016F, which is 0xC5 0xAF in UTF-8), not the single cp1250 byte 0xF9.
Are you able to extract the file using 7z or zip from the command line?
Yes, but it's a little complicated. I have to `cat` the password into a file, run `recode utf8..cp1250 pass.txt`, and then use `unzip -P "$(cat pass.txt)" zipfile.zip`. It's quite difficult, and impossible for my mum :-)
I was unable to reproduce this using the latest version of Ark. Can you retest this using the latest versions?
Created attachment 118030 [details]
test files to reproduce

I have created an example :-) The general zip (so that I would not have to upload 4 files) contains a few files:

hello.txt - the file I have zipped/packed
pass_utf8.txt - contains the password in UTF-8, which I can enter quite easily on any Czech keyboard or just by copy/paste; the following file was encrypted with it
hello_utf8.zip - zipped file with the password "žena" ("woman" in Czech). You can try this one and you should be able to extract the zip just fine.
pass_cp1250.txt - also the "žena" password, but converted to cp1250
hello_cp1250.zip - encrypted with the password in cp1250 encoding. If you try to open this one, there is no way you can type the cp1250 password into the password prompt.

(Examine both passwords in a hex editor; they really are different: the UTF-8 password is 1 byte longer.)

So what I would like to have:
where: password prompt dialog window
what: something like the encoding select box here https://i.imgur.com/1cwfiCr.png
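
For anyone without a hex editor at hand, the byte difference between the two password files can be illustrated with a minimal Python sketch. It only encodes the string "žena" in both encodings; it does not read the attached files, and the variable names are made up for illustration.

```python
# Encode the same visible password in the two encodings discussed here.
password = "žena"

utf8_bytes = password.encode("utf-8")      # ž becomes two bytes: c5 be
cp1250_bytes = password.encode("cp1250")   # ž becomes one byte: 9e

print(utf8_bytes.hex(" "))    # c5 be 65 6e 61
print(cp1250_bytes.hex(" "))  # 9e 65 6e 61

# The zip encryption only sees these raw bytes, so the two passwords differ.
print(utf8_bytes == cp1250_bytes)  # False
```

The UTF-8 form is one byte longer because ž (U+017E) needs two bytes in UTF-8 but only one in cp1250.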
I finally understood the scenario. Thanks for the provided zip-file. This indeed looks like an issue. It's kinda weird that not even Kate realizes the encoding (cp1250) when opening the file initially.
There is no way for Kate to know, as there is no BOM. It might be cp1250 as well as cp1252 or cp1251 or any other single byte encoding.
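
To illustrate the ambiguity, here is a small Python sketch (not how Kate detects encodings, just a demonstration) that decodes the single byte 0xF9 under the three code pages mentioned above. Every result is a valid letter, so the byte alone gives no hint which code page was intended.

```python
# One byte, three equally valid single-byte interpretations.
raw = bytes([0xF9])

decoded = {name: raw.decode(name) for name in ("cp1250", "cp1252", "cp1251")}

for name, char in decoded.items():
    print(f"{name}: {char!r} (U+{ord(char):04X})")
```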
Bug reporter, would you please try The Unarchiver (unar)? I can unzip the file by `unar -E windows-1250 hello_cp1250.zip`. Just type the password in UTF-8 and I believe that `unar` will convert it to the specified encoding.
Yes, `unar` works, as well as previously mentioned `recode`+`unzip`. But this is still not GUI :-( Thanks anyway
libzip developer says that libzip just takes the bytes as given and does not check any encoding, and makes no assumptions about the encoding of the password. [1] Hope this helps for Ark developers. [1] https://github.com/nih-at/libzip/issues/207#issuecomment-683630284
I played a bit with the test file (thanks for that!). If we convert the UTF-16 password provided by the ark password dialog to the windows-1250 encoding using QTextCodec, the archive is extracted just fine (since libzip just uses the raw bytes and doesn't care about the encoding). The problem is: how do we ask the user which encoding they want to use for the password? We can't really do it in the password dialog, because ark uses the general-purpose KPasswordDialog provided by kwidgetsaddons. The easiest thing could be to add a dropdown menu in the ark settings to configure which encoding to use for passwords. There is the additional problem that QTextCodec is gone in Qt6 and the replacement doesn't yet have feature parity: https://phabricator.kde.org/T14154 But since we need QTextCodec for Kate, I don't think it would be too bad if we keep using it in Ark too.
Instead of asking the user to provide the encoding, would it be an option to simply attempt decrypting with several common encodings? Yes, there are probably a few zip files out there with very exotic encodings, but simply trying several would cover the vast majority. Unlike when displaying text, the program can know that the right encoding was found when decryption succeeds. Let's be real: most people have no idea that text encodings even exist or what they are, let alone which one to choose. Not asking is the better UX. It might slow down showing the failure message if a wrong password is entered, but hopefully not to a relevant degree.
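
A rough sketch of that idea, in Python rather than Ark's C++ and using Python's `zipfile` module instead of libzip. Everything named here (`COMMON_ENCODINGS`, `candidate_passwords`, `find_working_password`, `zip_attempt`) is hypothetical, and the list of "common" encodings is only a guess. Note also that Python's `zipfile` can only decrypt traditional zip encryption, so this shows the approach and is not a drop-in solution.

```python
import zipfile
import zlib
from typing import Callable, Iterable, List, Optional

# Assumption: which encodings count as "common" would need to be decided.
COMMON_ENCODINGS = ("utf-8", "cp1250", "cp1252", "cp1251", "cp437")


def candidate_passwords(password: str,
                        encodings: Iterable[str] = COMMON_ENCODINGS) -> List[bytes]:
    """Encode the typed password in each encoding, dropping failures and duplicates."""
    candidates: List[bytes] = []
    for name in encodings:
        try:
            raw = password.encode(name)
        except UnicodeEncodeError:
            continue  # this encoding cannot represent the password at all
        if raw not in candidates:
            candidates.append(raw)  # identical byte strings are tried only once
    return candidates


def find_working_password(password: str,
                          attempt: Callable[[bytes], bool]) -> Optional[bytes]:
    """Return the first byte form of the password that `attempt` accepts, else None."""
    for raw in candidate_passwords(password):
        if attempt(raw):
            return raw
    return None


def zip_attempt(path: str) -> Callable[[bytes], bool]:
    """Build a callback that tries to decrypt the first member of the zip at `path`."""
    def attempt(raw: bytes) -> bool:
        try:
            with zipfile.ZipFile(path) as archive:
                archive.read(archive.namelist()[0], pwd=raw)
            return True
        except (RuntimeError, zipfile.BadZipFile, zlib.error):
            # Wrong password, or a false positive caught by the CRC check.
            return False
    return attempt
```

With the test archive from the attachment, the call would look like `find_working_password("žena", zip_attempt("hello_cp1250.zip"))`. For ASCII-only passwords most encodings produce the same bytes, so only one attempt is made and the wrong-password case is not slowed down; for "žena" the list above collapses to just two distinct candidates.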