Summary: | Invalid Unicode chars in file/foldernames appear to make file copies abort | ||
---|---|---|---|
Product: | [Frameworks and Libraries] frameworks-kio | Reporter: | bluescreenavenger |
Component: | general | Assignee: | David Faure <faure> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | a.samirh78, bugseforuns, filipfila.kde, idarktemplar, kde, kdelibs-bugs |
Priority: | NOR | ||
Version: | git master | ||
Target Milestone: | --- | ||
Platform: | Other | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Attachments: |
Archive containing the logs, and the tree, and the Seed.txt file to pass to the kiocopy script
Bash script to recreate the file tree that failed to copy. Pass the path to Seed.txt to it. |
Description
bluescreenavenger
2018-12-18 13:13:51 UTC
Created attachment 116994 [details]
Bash script to recreate the file tree that failed to copy. Pass the path to Seed.txt to it.
LANG definitely can make a difference to the way things run. Filesystems don't have a concept of UTF-8 or anything else, just a concept of a raw byte stream. We then put that into a QUrl, which obviously follows the locale it runs in. Though the worst this should do is (correctly) fail on a file conflict where one didn't previously exist.

I think your script is creating a byte array that isn't valid text in any format. I can understand KIO failing to copy that, because we don't treat things as a byte stream all the way through. However, I /think/ KIO is generating an error correctly. The reason you're not getting the dialog in kioclient is a bug in kioclient, fixed with D17652 and D17653. With those changes, using your test with the seed here, I do get an error.

(In reply to Christoph Feck from comment https://bugs.kde.org/show_bug.cgi?id=162211#c128)
> The filename limit is 255 bytes, not 255 characters. In UTF-8, any non-ASCII
> character needs more than 1 byte. Additionally, the last character looks
> cropped, causing an illegal UTF-8 name, which Qt does not handle.

Not sure if the answer should go here or in the old bug, so I'll post here. While 255 bytes is the POSIX-compliant limit, the NTFS filesystem has a different limit on maximum filename length: 255 characters, which may be more than 255 bytes if UTF-8 is used. And ntfs-3g may return such long filenames. I've already hit an issue with such limit inconsistencies once: https://phabricator.kde.org/D8413

Seeing that the issue is in kioclient5, I locally modified the script here and made it bring up Dolphin, so I can copy the files. I DID indeed get a dialog this time saying it can't enter that folder... ...however it completely stopped the copy I started. On the first attempt, only one item went through. Copying the other items without the corrupt name separately worked fine. It wasn't COMPLETELY silent, because I DID get an error dialog, but I would assume that it should have tried to just skip that one...
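The 255-byte versus 255-character point quoted above can be illustrated with a small Python sketch. This is not KIO or Qt code; the helper names (`crop_to_bytes`, `is_valid_utf8`) and the sample name are made up for illustration, and the 255-byte limit is simply the figure quoted in this thread.

```python
# Illustration only: what a 255-byte limit does to a UTF-8 name.
# NAME_MAX is the byte limit quoted in this thread; the helpers are hypothetical.
NAME_MAX = 255


def crop_to_bytes(name: str, limit: int = NAME_MAX) -> bytes:
    """Naively crop the UTF-8 encoding of `name` to `limit` bytes."""
    return name.encode("utf-8")[:limit]


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


name = "é" * 128              # 128 characters, 2 bytes each in UTF-8
raw = name.encode("utf-8")    # 256 bytes: one byte over the limit
cropped = crop_to_bytes(name)

print(len(name), len(raw))    # 128 256
print(is_valid_utf8(raw))     # True
print(is_valid_utf8(cropped)) # False: the last character was cut in half
```

A name that fits a 255-character limit can therefore exceed a 255-byte limit, and cropping at a byte boundary can leave a trailing partial sequence that is no longer valid UTF-8.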
I guess the only case where such a corrupt file name might be created for MOST users is a corrupt file system... ...maybe. Guess I didn't replicate the alleged silent failure after all?

Invalid UTF-8 filenames mostly appear when non-aware tools (e.g. old archivers) create filenames with a different encoding, despite UTF-8 being set in the locale. I am right now investigating whether the encoding hack done by Róbert (see bug 165044 comment #142) can be ported to Qt 5.

I assume https://phabricator.kde.org/D18161 is your attempt. Just to let you know, it works for me and allows the test to pass. The paths with the corrupted names copy perfectly.

@Christoph: is this bug fixed? (Looking at the patch on Phabricator, I'd say yes, but I am not sure.)

According to comment #6, the legacy encoding hack fixed this issue.

*** Bug 402697 has been marked as a duplicate of this bug. ***
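To illustrate the remark about non-aware tools writing names in a different encoding, here is a minimal Python sketch. It is not the encoding hack discussed in this report and says nothing about how KIO or Qt handle such names; it only shows that Latin-1 bytes are not valid UTF-8, and that Python's own `surrogateescape` error handler can carry such bytes through a text string without loss. The file name is a made-up example.

```python
# Illustration only: a name written in Latin-1 while the locale expects UTF-8.
# The file name is a hypothetical example, not one from the attached test tree.
legacy = "café.txt".encode("latin-1")   # b'caf\xe9.txt'


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if `raw` is valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(legacy))          # False: 0xE9 starts an incomplete sequence

# Python's surrogateescape handler maps each undecodable byte to a lone
# surrogate, so the original bytes can be recovered exactly afterwards.
as_text = legacy.decode("utf-8", errors="surrogateescape")
round_trip = as_text.encode("utf-8", errors="surrogateescape")

print(round_trip == legacy)             # True: the raw bytes survive the round trip
```

The design point is the one made earlier in the thread: as long as the name is treated as raw bytes, or mapped reversibly as above, nothing is lost; problems start when an invalid sequence is decoded lossily.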